Move to gawk-3.0.0.

author: Arnold D. Robbins <arnold@skeeve.com> 2010-07-16 12:41:09 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2010-07-16 12:41:09 +0300
commit: 8c042f99cc7465c86351d21331a129111b75345d
tree: 9656e653be0e42e5469cec77635c20356de152c2 /doc/gawk.texi
parent: 8ceb5f934787eb7be5fb452fb39179df66119954
1 files changed, 20460 insertions, 0 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
new file mode 100644
index 00000000..6227ac32
--- /dev/null
+++ b/doc/gawk.texi
@@ -0,0 +1,20460 @@
+\input texinfo   @c -*-texinfo-*-
+@c %**start of header (This is for running Texinfo on a region.)
+@setfilename gawk.info
+@settitle AWK Language Programming
+@c %**end of header (This is for running Texinfo on a region.)
+
+@ignore
+@ifinfo
+@format
+START-INFO-DIR-ENTRY
+* Gawk: (gawk.info).           A Text Scanning and Processing Language.
+END-INFO-DIR-ENTRY
+@end format
+@end ifinfo
+@end ignore
+
+@c @set xref-automatic-section-title
+@c @set DRAFT
+
+@c The following information should be updated here only!
+@c This sets the edition of the document, the version of gawk it
+@c applies to, and when the document was updated.
+@set TITLE AWK Language Programming
+@set EDITION 1.0
+@set VERSION 3.0
+@set UPDATE-MONTH January 1996
+@iftex
+@set DOCUMENT book
+@end iftex
+@ifinfo
+@set DOCUMENT Info file
+@end ifinfo
+
+@ignore
+Some comments on the layout for TeX.
+1. Use the texinfo.tex from the gawk distribution. It contains fixes that
+   are needed to get the footings for draft mode to not appear.
+2. I have done A LOT of work to make this look good. There are `@page' commands
+   and use of `@group ... @end group' in a number of places. If you muck
+   with anything, it's your responsibility not to break the layout.
+@end ignore
+
+@c merge the function and variable indexes into the concept index
+@ifinfo
+@synindex fn cp
+@synindex vr cp
+@end ifinfo
+@iftex
+@syncodeindex fn cp
+@syncodeindex vr cp
+@end iftex
+
+@c If "finalout" is commented out, the printed output will show
+@c black boxes that mark lines that are too long.  Thus, it is
+@c unwise to comment it out when running a master in case there are
+@c overfulls which are deemed okay.
+
+@ifclear DRAFT
+@iftex
+@finalout
+@end iftex
+@end ifclear
+
+@smallbook
+@iftex
+@cropmarks
+@end iftex
+
+@ifinfo
+This file documents @code{awk}, a program that you can use to select
+particular records in a file and perform operations upon them.
+
+This is Edition @value{EDITION} of @cite{@value{TITLE}},
+for the @value{VERSION} version of the GNU implementation of AWK.
+
+Copyright (C) 1989, 1991 - 1996 Free Software Foundation, Inc.
+
+Permission is granted to make and distribute verbatim copies of
+this manual provided the copyright notice and this permission notice
+are preserved on all copies.
+
+@ignore
+Permission is granted to process this file through TeX and print the
+results, provided the printed document carries copying permission
+notice identical to this one except for the removal of this paragraph
+(this paragraph not being relevant to the printed manual).
+
+@end ignore
+Permission is granted to copy and distribute modified versions of this
+manual under the conditions for verbatim copying, provided that the entire
+resulting derived work is distributed under the terms of a permission
+notice identical to this one.
+
+Permission is granted to copy and distribute translations of this manual
+into another language, under the above conditions for modified versions,
+except that this permission notice may be stated in a translation approved
+by the Foundation.
+@end ifinfo
+
+@setchapternewpage odd
+
+@titlepage
+@title @value{TITLE}
+@subtitle A User's Guide for GNU AWK
+@subtitle Edition @value{EDITION}
+@subtitle @value{UPDATE-MONTH}
+@author Arnold D. Robbins
+@sp
+@author Based on @cite{The GAWK Manual},
+@author by Robbins, Close, Rubin, and Stallman
+
+@c Include the Distribution inside the titlepage environment so
+@c that headings are turned off.  Headings on and off do not work.
+
+@page
+@vskip 0pt plus 1filll
+@ifset LEGALJUNK
+The programs and applications presented in this book have been
+included for their instructional value.  They have been tested with care,
+but are not guaranteed for any particular purpose.  The publisher does not
+offer any warranties or representations, nor does it accept any
+liabilities with respect to the programs or applications.
+So there.
+@sp 2
+UNIX is a registered trademark of X/Open, Ltd. @*
+Microsoft, MS, and MS-DOS are registered trademarks, and Windows is a
+trademark of Microsoft Corporation in the United States and other
+countries. @*
+Atari, 520ST, 1040ST, TT, STE, Mega, and Falcon are registered trademarks
+or trademarks of Atari Corporation. @*
+DEC, Digital, OpenVMS, ULTRIX, and VMS, are trademarks of Digital Equipment
+Corporation. @*
+@end ifset
+``To boldly go where no man has gone before'' is a
+Registered Trademark of Paramount Pictures Corporation. @*
+@c sorry, i couldn't resist
+@sp 3
+Copyright @copyright{} 1989, 1991 - 1996 Free Software Foundation, Inc.
+@sp 2
+        
+This is Edition @value{EDITION} of @cite{@value{TITLE}}, @*
+for the @value{VERSION} (or later) version of the GNU implementation of AWK.
+
+@sp 2
+Published by the Free Software Foundation @*
+59 Temple Place --- Suite 330 @*
+Boston, MA  02111-1307 USA @*
+Phone: +1-617-542-5942 @*
+Fax (including Japan): +1-617-542-2652 @*
+Printed copies are available for $25 each. @*
+@c this ISBN can change! Check with the FSF office...
+@c This one is correct for gawk 3.0 and edition 1.0
+ISBN 1-882114-26-4 @*
+
+Permission is granted to make and distribute verbatim copies of
+this manual provided the copyright notice and this permission notice
+are preserved on all copies.
+
+Permission is granted to copy and distribute modified versions of this
+manual under the conditions for verbatim copying, provided that the entire
+resulting derived work is distributed under the terms of a permission
+notice identical to this one.
+
+Permission is granted to copy and distribute translations of this manual
+into another language, under the above conditions for modified versions,
+except that this permission notice may be stated in a translation approved
+by the Foundation.
+@sp 2
+Cover art by Etienne Suvasa.
+@end titlepage
+
+@c Thanks to Bob Chassell for directions on doing dedications.
+@iftex
+@headings off
+@page
+@w{ }
+@sp 9
+@center @i{To Miriam, for making me complete.}
+@sp
+@center @i{To Chana, for the joy you bring us.}
+@sp
+@center @i{To Rivka, for the exponential increase.}
+@page
+@w{ }
+@page
+@headings on
+@end iftex
+
+@iftex
+@headings off
+@evenheading @thispage@ @ @ @b{@thistitle} @| @|
+@oddheading  @| @| @b{@thischapter}@ @ @ @thispage
+@ifset DRAFT
+@evenfooting @today{} @| @emph{DRAFT!} @| Please Do Not Redistribute
+@oddfooting Please Do Not Redistribute @| @emph{DRAFT!} @| @today{}
+@end ifset
+@end iftex
+
+@ifinfo
+@node Top, Preface, (dir), (dir)
+@top General Introduction
+@c Preface or Licensing nodes should come right after the Top
+@c node, in `unnumbered' sections, then the chapter, `What is gawk'.
+
+This file documents @code{awk}, a program that you can use to select
+particular records in a file and perform operations upon them.
+
+This is Edition @value{EDITION} of @cite{@value{TITLE}}, @*
+for the @value{VERSION} version of the GNU implementation @*
+of AWK.
+
+@end ifinfo
+
+@menu
+* Preface::                     What this @value{DOCUMENT} is about; brief
+                                history and acknowledgements.
+* What Is Awk::                 What is the @code{awk} language; using this
+                                @value{DOCUMENT}.
+* Getting Started::             A basic introduction to using @code{awk}. How
+                                to run an @code{awk} program. Command line
+                                syntax.
+* One-liners::                  Short, sample @code{awk} programs.
+* Regexp::                      All about matching things using regular
+                                expressions.
+* Reading Files::               How to read files and manipulate fields.
+* Printing::                    How to print using @code{awk}.  Describes the
+                                @code{print} and @code{printf} statements.  
+                                Also describes redirection of output.
+* Expressions::                 Expressions are the basic building blocks of
+                                statements.
+* Patterns and Actions::        Overviews of patterns and actions.
+* Statements::                  The various control statements are described
+                                in detail.
+* Built-in Variables::          Built-in Variables
+* Arrays::                      The description and use of arrays. Also
+                                includes array-oriented control statements.
+* Built-in::                    The built-in functions are summarized here.
+* User-defined::                User-defined functions are described in
+                                detail.
+* Invoking Gawk::               How to run @code{gawk}.
+* Library Functions::           A Library of @code{awk} Functions.
+* Sample Programs::             Many @code{awk} programs with complete
+                                explanations.
+* Language History::            The evolution of the @code{awk} language.
+* Gawk Summary::                @code{gawk} Options and Language Summary.
+* Installation::                Installing @code{gawk} under various operating
+                                systems.
+* Notes::                       Something about the implementation of
+                                @code{gawk}.
+* Glossary::                    An explanation of some unfamiliar terms.
+* Copying::                     Your right to copy and distribute @code{gawk}.
+* Index::                       Concept and Variable Index.
+
+* History::                     The history of @code{gawk} and @code{awk}.
+* Manual History::              Brief history of the GNU project and this
+                                @value{DOCUMENT}.
+* Acknowledgements::            Acknowledgements.
+* This Manual::                 Using this @value{DOCUMENT}. Includes sample
+                                input files that you can use.
+* Conventions::                 Typographical Conventions.
+* Sample Data Files::           Sample data files for use in the @code{awk}
+                                programs illustrated in this @value{DOCUMENT}.
+* Names::                       What name to use to find @code{awk}.
+* Running gawk::                How to run @code{gawk} programs; includes
+                                command line syntax.
+* One-shot::                    Running a short throw-away @code{awk} program.
+* Read Terminal::               Using no input files (input from terminal
+                                instead).
+* Long::                        Putting permanent @code{awk} programs in
+                                files.
+* Executable Scripts::          Making self-contained @code{awk} programs.
+* Comments::                    Adding documentation to @code{gawk} programs.
+* Very Simple::                 A very simple example.
+* Two Rules::                   A less simple one-line example with two rules.
+* More Complex::                A more complex example.
+* Statements/Lines::            Subdividing or combining statements into
+                                lines.
+* Other Features::              Other Features of @code{awk}.
+* When::                        When to use @code{gawk} and when to use other
+                                things.
+* Regexp Usage::                How to Use Regular Expressions.
+* Escape Sequences::            How to write non-printing characters.
+* Regexp Operators::            Regular Expression Operators.
+* GNU Regexp Operators::        Operators specific to GNU software.
+* Case-sensitivity::            How to do case-insensitive matching.
+* Leftmost Longest::            How much text matches.
+* Computed Regexps::            Using Dynamic Regexps.
+* Records::                     Controlling how data is split into records.
+* Fields::                      An introduction to fields.
+* Non-Constant Fields::         Non-constant Field Numbers.
+* Changing Fields::             Changing the Contents of a Field.
+* Field Separators::            The field separator and how to change it.
+* Basic Field Splitting::       How fields are split with single characters or
+                                simple strings.
+* Regexp Field Splitting::      Using regexps as the field separator.
+* Single Character Fields::     Making each character a separate field.
+* Command Line Field Separator:: Setting @code{FS} from the command line.
+* Field Splitting Summary::     Some final points and a summary table.
+* Constant Size::               Reading constant width data.
+* Multiple Line::               Reading multi-line records.
+* Getline::                     Reading files under explicit program control
+                                using the @code{getline} function.
+* Getline Intro::               Introduction to the @code{getline} function.
+* Plain Getline::               Using @code{getline} with no arguments.
+* Getline/Variable::            Using @code{getline} into a variable.
+* Getline/File::                Using @code{getline} from a file.
+* Getline/Variable/File::       Using @code{getline} into a variable from a
+                                file.
+* Getline/Pipe::                Using @code{getline} from a pipe.
+* Getline/Variable/Pipe::       Using @code{getline} into a variable from a
+                                pipe.
+* Getline Summary::             Summary Of @code{getline} Variants.
+* Print::                       The @code{print} statement.
+* Print Examples::              Simple examples of @code{print} statements.
+* Output Separators::           The output separators and how to change them.
+* OFMT::                        Controlling Numeric Output With @code{print}.
+* Printf::                      The @code{printf} statement.
+* Basic Printf::                Syntax of the @code{printf} statement.
+* Control Letters::             Format-control letters.
+* Format Modifiers::            Format-specification modifiers.
+* Printf Examples::             Several examples.
+* Redirection::                 How to redirect output to multiple files and
+                                pipes.
+* Special Files::               File name interpretation in @code{gawk}.
+                                @code{gawk} allows access to inherited file
+                                descriptors.
+* Close Files And Pipes::       Closing Input and Output Files and Pipes.
+* Constants::                   String, numeric, and regexp constants.
+* Scalar Constants::            Numeric and string constants.
+* Regexp Constants::            Regular Expression constants.
+* Using Constant Regexps::      When and how to use a regexp constant.
+* Variables::                   Variables give names to values for later use.
+* Using Variables::             Using variables in your programs.
+* Assignment Options::          Setting variables on the command line and a
+                                summary of command line syntax. This is an
+                                advanced method of input.
+* Conversion::                  The conversion of strings to numbers and vice
+                                versa.
+* Arithmetic Ops::              Arithmetic operations (@samp{+}, @samp{-},
+                                etc.)
+* Concatenation::               Concatenating strings.
+* Assignment Ops::              Changing the value of a variable or a field.
+* Increment Ops::               Incrementing the numeric value of a variable.
+* Truth Values::                What is ``true'' and what is ``false''.
+* Typing and Comparison::       How variables acquire types, and how this
+                                affects comparison of numbers and strings with
+                                @samp{<}, etc.
+* Boolean Ops::                 Combining comparison expressions using boolean
+                                operators @samp{||} (``or''), @samp{&&}
+                                (``and'') and @samp{!} (``not'').
+* Conditional Exp::             Conditional expressions select between two
+                                subexpressions under control of a third
+                                subexpression.
+* Function Calls::              A function call is an expression.
+* Precedence::                  How various operators nest.
+* Pattern Overview::            What goes into a pattern.
+* Kinds of Patterns::           A list of all kinds of patterns.
+* Regexp Patterns::             Using regexps as patterns.
+* Expression Patterns::         Any expression can be used as a pattern.
+* Ranges::                      Pairs of patterns specify record ranges.
+* BEGIN/END::                   Specifying initialization and cleanup rules.
+* Using BEGIN/END::             How and why to use BEGIN/END rules.
+* I/O And BEGIN/END::           I/O issues in BEGIN/END rules.
+* Empty::                       The empty pattern, which matches every record.
+* Action Overview::             What goes into an action.
+* If Statement::                Conditionally execute some @code{awk}
+                                statements.
+* While Statement::             Loop until some condition is satisfied.
+* Do Statement::                Do specified action while looping until some
+                                condition is satisfied.
+* For Statement::               Another looping statement, that provides
+                                initialization and increment clauses.
+* Break Statement::             Immediately exit the innermost enclosing loop.
+* Continue Statement::          Skip to the end of the innermost enclosing
+                                loop.
+* Next Statement::              Stop processing the current input record.
+* Nextfile Statement::          Stop processing the current file.
+* Exit Statement::              Stop execution of @code{awk}.
+* User-modified::               Built-in variables that you change to control
+                                @code{awk}.
+* Auto-set::                    Built-in variables where @code{awk} gives you
+                                information.
+* ARGC and ARGV::               Ways to use @code{ARGC} and @code{ARGV}.
+* Array Intro::                 Introduction to Arrays
+* Reference to Elements::       How to examine one element of an array.
+* Assigning Elements::          How to change an element of an array.
+* Array Example::               Basic Example of an Array
+* Scanning an Array::           A variation of the @code{for} statement. It
+                                loops through the indices of an array's
+                                existing elements.
+* Delete::                      The @code{delete} statement removes an element
+                                from an array.
+* Numeric Array Subscripts::    How to use numbers as subscripts in
+                                @code{awk}.
+* Uninitialized Subscripts::    Using uninitialized variables as subscripts.
+* Multi-dimensional::           Emulating multi-dimensional arrays in
+                                @code{awk}.
+* Multi-scanning::              Scanning multi-dimensional arrays.
+* Calling Built-in::            How to call built-in functions.
+* Numeric Functions::           Functions that work with numbers, including
+                                @code{int}, @code{sin} and @code{rand}.
+* String Functions::            Functions for string manipulation, such as
+                                @code{split}, @code{match}, and
+                                @code{sprintf}.
+* I/O Functions::               Functions for files and shell commands.
+* Time Functions::              Functions for dealing with time stamps.
+* Definition Syntax::           How to write definitions and what they mean.
+* Function Example::            An example function definition and what it
+                                does.
+* Function Caveats::            Things to watch out for.
+* Return Statement::            Specifying the value a function returns.
+* Options::                     Command line options and their meanings.
+* Other Arguments::             Input file names and variable assignments.
+* AWKPATH Variable::            Searching directories for @code{awk} programs.
+* Obsolete::                    Obsolete Options and/or features.
+* Undocumented::                Undocumented Options and Features.
+* Known Bugs::                  Known Bugs in @code{gawk}.
+* Portability Notes::           What to do if you don't have @code{gawk}.
+* Nextfile Function::           Two implementations of a @code{nextfile}
+                                function.
+* Assert Function::             A function for assertions in @code{awk}
+                                programs.
+* Ordinal Functions::           Functions for using characters as numbers and
+                                vice versa.
+* Join Function::               A function to join an array into a string.
+* Mktime Function::             A function to turn a date into a timestamp.
+* Gettimeofday Function::       A function to get formatted times.
+* Filetrans Function::          A function for handling data file transitions.
+* Getopt Function::             A function for processing command line
+                                arguments.
+* Passwd Functions::            Functions for getting user information.
+* Group Functions::             Functions for getting group information.
+* Library Names::               How to best name private global variables in
+                                library functions.
+* Clones::                      Clones of common utilities.
+* Cut Program::                 The @code{cut} utility.
+* Egrep Program::               The @code{egrep} utility.
+* Id Program::                  The @code{id} utility.
+* Split Program::               The @code{split} utility.
+* Tee Program::                 The @code{tee} utility.
+* Uniq Program::                The @code{uniq} utility.
+* Wc Program::                  The @code{wc} utility.
+* Miscellaneous Programs::      Some interesting @code{awk} programs.
+* Dupword Program::             Finding duplicated words in a document.
+* Alarm Program::               An alarm clock.
+* Translate Program::           A program similar to the @code{tr} utility.
+* Labels Program::              Printing mailing labels.
+* Word Sorting::                A program to produce a word usage count.
+* History Sorting::             Eliminating duplicate entries from a history
+                                file.
+* Extract Program::             Pulling out programs from Texinfo source
+                                files.
+* Simple Sed::                  A Simple Stream Editor.
+* Igawk Program::               A wrapper for @code{awk} that includes files.
+* V7/SVR3.1::                   The major changes between V7 and System V
+                                Release 3.1.
+* SVR4::                        Minor changes between System V Releases 3.1
+                                and 4.
+* POSIX::                       New features from the POSIX standard.
+* BTL::                         New features from the AT&T Bell Laboratories
+                                version of @code{awk}.
+* POSIX/GNU::                   The extensions in @code{gawk} not in POSIX
+                                @code{awk}.
+* Command Line Summary::        Recapitulation of the command line.
+* Language Summary::            A terse review of the language.
+* Variables/Fields::            Variables, fields, and arrays.
+* Fields Summary::              Input field splitting.
+* Built-in Summary::            @code{awk}'s built-in variables.
+* Arrays Summary::              Using arrays.
+* Data Type Summary::           Values in @code{awk} are numbers or strings.
+* Rules Summary::               Patterns and Actions, and their component
+                                parts.
+* Pattern Summary::             Quick overview of patterns.
+* Regexp Summary::              Quick overview of regular expressions.
+* Actions Summary::             Quick overview of actions.
+* Operator Summary::            @code{awk} operators.
+* Control Flow Summary::        The control statements.
+* I/O Summary::                 The I/O statements.
+* Printf Summary::              A summary of @code{printf}.
+* Special File Summary::        Special file names interpreted internally.
+* Built-in Functions Summary::  Built-in numeric and string functions.
+* Time Functions Summary::      Built-in time functions.
+* String Constants Summary::    Escape sequences in strings.
+* Functions Summary::           Defining and calling functions.
+* Historical Features::         Some undocumented but supported ``features''.
+* Gawk Distribution::           What is in the @code{gawk} distribution.
+* Getting::                     How to get the distribution.
+* Extracting::                  How to extract the distribution.
+* Distribution contents::       What is in the distribution.
+* Unix Installation::           Installing @code{gawk} under various versions
+                                of Unix.
+* Quick Installation::          Compiling @code{gawk} under Unix.
+* Configuration Philosophy::    How it's all supposed to work.
+* VMS Installation::            Installing @code{gawk} on VMS.
+* VMS Compilation::             How to compile @code{gawk} under VMS.
+* VMS Installation Details::    How to install @code{gawk} under VMS.
+* VMS Running::                 How to run @code{gawk} under VMS.
+* VMS POSIX::                   Alternate instructions for VMS POSIX.
+* PC Installation::             Installing and Compiling @code{gawk} on MS-DOS
+                                and OS/2
+* Atari Installation::          Installing @code{gawk} on the Atari ST.
+* Atari Compiling::             Compiling @code{gawk} on Atari
+* Atari Using::                 Running @code{gawk} on Atari
+* Amiga Installation::          Installing @code{gawk} on an Amiga.
+* Bugs::                        Reporting Problems and Bugs.
+* Other Versions::              Other freely available @code{awk}
+                                implementations.
+* Compatibility Mode::          How to disable certain @code{gawk} extensions.
+* Additions::                   Making Additions To @code{gawk}.
+* Adding Code::                 Adding code to the main body of @code{gawk}.
+* New Ports::                   Porting @code{gawk} to a new operating system.
+* Future Extensions::           New features that may be implemented one day.
+* Improvements::                Suggestions for improvements by volunteers.
+
+@end menu
+
+@c dedication for Info file
+@ifinfo
+@center To Miriam, for making me complete.
+@sp 1
+@center To Chana, for the joy you bring us.
+@sp 1
+@center To Rivka, for the exponential increase.
+@end ifinfo
+
+@node Preface, What Is Awk, Top, Top
+@unnumbered Preface
+
+@c I saw a comment somewhere that the preface should describe the book itself,
+@c and the introduction should describe what the book covers.
+
+This @value{DOCUMENT} teaches you about the @code{awk} language and
+how you can use it effectively.  You should already be familiar with basic
+system commands, such as @code{cat} and @code{ls},@footnote{These commands
+are available on POSIX-compliant systems, as well as on traditional
+Unix-based systems. If you are using some other operating system, you still need to
+be familiar with the ideas of I/O redirection and pipes.} and basic shell
+facilities, such as Input/Output (I/O) redirection and pipes.
+
+Implementations of the @code{awk} language are available for many different
+computing environments.  This @value{DOCUMENT}, while describing the @code{awk} language
+in general, also describes a particular implementation of @code{awk} called
+@code{gawk} (which stands for ``GNU Awk'').  @code{gawk} runs on a broad range
+of Unix systems, from 80386 PC-based computers up through large-scale
+systems, such as Crays. @code{gawk} has also been ported to MS-DOS and
+OS/2 PCs, Atari and Amiga micro-computers, and VMS.
+
+@menu
+* History::                     The history of @code{gawk} and @code{awk}.
+* Manual History::              Brief history of the GNU project and this
+                                @value{DOCUMENT}.
+* Acknowledgements::            Acknowledgements.
+@end menu
+
+@node History, Manual History, Preface, Preface
+@unnumberedsec History of @code{awk} and @code{gawk}
+
+@cindex acronym
+@cindex history of @code{awk}
+@cindex Aho, Alfred
+@cindex Weinberger, Peter
+@cindex Kernighan, Brian
+@cindex old @code{awk}
+@cindex new @code{awk}
+The name @code{awk} comes from the initials of its designers: Alfred V.@:
+Aho, Peter J.@: Weinberger, and Brian W.@: Kernighan.  The original version of
+@code{awk} was written in 1977 at AT&T Bell Laboratories.
+In 1985 a new version made the programming
+language more powerful, introducing user-defined functions, multiple input
+streams, and computed regular expressions.
+This new version became generally available with Unix System V Release 3.1.
+The version in System V Release 4 added some new features and also cleaned
+up the behavior in some of the ``dark corners'' of the language.
+The specification for @code{awk} in the POSIX Command Language
+and Utilities standard further clarified the language based on feedback
+from both the @code{gawk} designers and the original Bell Labs @code{awk}
+designers.
+
+The GNU implementation, @code{gawk}, was written in 1986 by Paul Rubin
+and Jay Fenlason, with advice from Richard Stallman.  John Woods
+contributed parts of the code as well.  In 1988 and 1989, David Trueman, with
+help from Arnold Robbins, thoroughly reworked @code{gawk} for compatibility
+with the newer @code{awk}.  Current development focuses on bug fixes,
+performance improvements, standards compliance, and, occasionally, new features.
+
+@node Manual History, Acknowledgements, History, Preface
+@unnumberedsec The GNU Project and This Book
+
+@cindex Free Software Foundation
+The Free Software Foundation (FSF) is a non-profit organization dedicated
+to the production and distribution of freely distributable software.
+It was founded by Richard M.@: Stallman, the author of the original
+Emacs editor.  GNU Emacs is the most widely used version of Emacs today.
+
+@cindex GNU Project
+The GNU project is an on-going effort on the part of the Free Software
+Foundation to create a complete, freely distributable, POSIX compliant
+computing environment.  (GNU stands for ``GNU's not Unix''.)
+The FSF uses the ``GNU General Public License'' (or GPL) to ensure that
+source code for their software is always available to the end user. A
+copy of the GPL is included for your reference
+(@pxref{Copying, ,GNU GENERAL PUBLIC LICENSE}).
+The GPL applies to the C language source code for @code{gawk}.
+
+As of this writing (1995), the only major component of the
+GNU environment still uncompleted is the operating system kernel, and
+work proceeds apace on that.  A shell, an editor (Emacs), highly portable
+optimizing C, C++, and Objective-C compilers, a symbolic debugger, and dozens
+of large and small utilities (such as @code{gawk}),
+have all been completed and are freely available.
+
+@cindex Linux
+@cindex NetBSD
+@cindex FreeBSD
+Until the GNU operating system is released, the FSF recommends the use
+of Linux, a freely distributable, Unix-like operating system for 80386
+and other systems.  There are many books on Linux. One that is freely
+available is @cite{Linux Installation and Getting Started}, by Matt Welsh.
+Many Linux distributions are available, often in computer stores or
+bundled on CD-ROM with books about Linux. Also, the FSF provides a Linux
+distribution (``Debian''); contact them for more information.
+@xref{Getting, ,Getting the @code{gawk} Distribution}, for the FSF's contact
+information.
+(There are two other freely available, Unix-like operating systems for
+80386 and other systems, NetBSD and FreeBSD. Both are based on the
+4.4-Lite Berkeley Software Distribution, and both use recent versions
+of @code{gawk} for their versions of @code{awk}.)
+
+@iftex
+This @value{DOCUMENT} you are reading now is actually free.  The
+information in it is freely available to anyone, the machine readable
+source code for the @value{DOCUMENT} comes with @code{gawk}, and anyone
+may take this @value{DOCUMENT} to a copying machine and make as many
+copies of it as they like.  (Take a moment to check the copying
+permissions on the Copyright page.)
+
+If you paid money for this @value{DOCUMENT}, what you actually paid for
+was the @value{DOCUMENT}'s nice printing and binding, and the
+publisher's associated costs to produce it.  We have made an effort to
+keep these costs reasonable; most people would prefer a bound book to
+over 300 pages of photo-copied text that would then have to be held in
+a loose-leaf binder (not to mention the time and labor involved in
+doing the copying).  The same is true of producing this
+@value{DOCUMENT} from the machine readable source; the retail price is
+only slightly more than the cost per page of printing it
+on a laser printer.
+@end iftex
+
+This @value{DOCUMENT} itself has gone through several previous,
+preliminary editions.  I started working on a preliminary draft of
+@cite{The GAWK Manual}, by Diane Close, Paul Rubin, and Richard
+Stallman in the fall of 1988.
+It was around 90 pages long, and barely described the original, ``old''
+version of @code{awk}. After substantial revision, the first version of
+@cite{The GAWK Manual} to be released was Edition 0.11 Beta in
+October of 1989.  The manual then underwent more substantial revision
+for Edition 0.13 of December 1991.
+David Trueman, Pat Rankin, and Michal Jaegermann contributed sections
+of the manual for Edition 0.13.
+That edition was published by the
+FSF as a bound book early in 1992.  Since then there have been several
+minor revisions, notably Edition 0.14 of November 1992 that was published
+by the FSF in January of 1993, and Edition 0.16 of August 1993.
+
+Edition 1.0 of @cite{@value{TITLE}} represents a significant re-working
+of @cite{The GAWK Manual}, with much additional material.
+The FSF and I agree that I am now the primary author.
+I also felt that it needed a more descriptive title.
+
+@cite{@value{TITLE}} will undoubtedly continue to evolve.
+An electronic version
+comes with the @code{gawk} distribution from the FSF.
+If you find an error in this @value{DOCUMENT}, please report it!
+@xref{Bugs, ,Reporting Problems and Bugs}, for information on submitting
+problem reports electronically, or write to me in care of the FSF.
+
+@node Acknowledgements, , Manual History, Preface
+@unnumberedsec Acknowledgements
+
+I would like to acknowledge Richard M.@: Stallman, for his vision of a
+better world, and for his courage in founding the FSF and starting the
+GNU project.
+
+The initial draft of @cite{The GAWK Manual} had the following acknowledgements:
+
+@quotation
+Many people need to be thanked for their assistance in producing this
+manual.  Jay Fenlason contributed many ideas and sample programs.  Richard
+Mlynarik and Robert Chassell gave helpful comments on drafts of this
+manual.  The paper @cite{A Supplemental Document for @code{awk}} by John W.@:
+Pierce of the Chemistry Department at UC San Diego, pinpointed several
+issues relevant both to @code{awk} implementation and to this manual, that
+would otherwise have escaped us.
+@end quotation
+
+The following people provided many helpful comments on Edition 0.13 of
+@cite{The GAWK Manual}: Rick Adams, Michael Brennan, Rich Burridge, Diane Close,
+Christopher (``Topher'') Eliot, Michael Lijewski, Pat Rankin, Miriam Robbins,
+and Michal Jaegermann.
+
+The following people provided many helpful comments for Edition 1.0 of
+@cite{@value{TITLE}}: Karl Berry, Michael Brennan, Darrel
+Hankerson, Michal Jaegermann, Michael Lijewski, and Miriam Robbins.
+Pat Rankin, Michal Jaegermann, Darrel Hankerson and Scott Deifik
+updated their respective sections for Edition 1.0.
+
+Robert J.@: Chassell provided much valuable advice on
+the use of Texinfo.  He also deserves special thanks for
+convincing me @emph{not} to title this @value{DOCUMENT}
+@cite{How To Gawk Politely}.
+Karl Berry helped significantly with the @TeX{} part of Texinfo.
+
+@cindex Trueman, David
+David Trueman deserves special credit; he has done a yeoman job
+of evolving @code{gawk} so that it performs well, and without bugs.
+Although he is no longer involved with @code{gawk},
+working with him on this project was a significant pleasure.
+
+@cindex Deifik, Scott
+@cindex Hankerson, Darrel
+@cindex Rommel, Kai Uwe
+@cindex Rankin, Pat
+@cindex Jaegermann, Michal
+Scott Deifik, Darrel Hankerson, Kai Uwe Rommel, Pat Rankin, and Michal
+Jaegermann (in no particular order) are long time members of the
+@code{gawk} ``crack portability team.''  Without their hard work and
+help, @code{gawk} would not be nearly the fine program it is today.  It
+has been and continues to be a pleasure working with this team of fine
+people.
+
+@cindex Friedl, Jeffrey
+Jeffrey Friedl provided invaluable help in tracking down a number
+of last minute problems with regular expressions in @code{gawk} 3.0.
+
+@cindex Kernighan, Brian
+David and I would like to thank Brian Kernighan of Bell Labs for
+invaluable assistance during the testing and debugging of @code{gawk}, and for
+help in clarifying numerous points about the language.  We could not have
+done nearly as good a job on either @code{gawk} or its documentation without
+his help.
+
+@cindex Hughes, Phil
+I would like to thank Marshall and Elaine Hartholz of Seattle, and Dr.@:
+Bert and Rita Schreiber of Detroit for large amounts of quiet vacation
+time in their homes, which allowed me to make significant progress on
+this @value{DOCUMENT} and on @code{gawk} itself.  Phil Hughes of SSC
+contributed in a very important way by loaning me his laptop Linux
+system, not once, but twice, allowing me to do a lot of work while
+away from home.
+
+@cindex Robbins, Miriam
+Finally, I must thank my wonderful wife, Miriam, for her patience through
+the many versions of this project, for her proof-reading,
+and for sharing me with the computer.
+I would like to thank my parents for their love, and for the grace with
+which they raised and educated me.
+I also must acknowledge my gratitude to G-d, for the many opportunities
+He has sent my way, as well as for the gifts He has given me with which to
+take advantage of those opportunities.
+@sp 2
+@noindent
+Arnold Robbins @*
+Atlanta, Georgia @*
+January, 1996
+
+@ignore
+Stuff still not covered anywhere:
+BASICS:
+   Integer vs. floating point
+   Hex vs. octal vs. decimal
+   Interpreter vs compiler
+   input/output
+@end ignore
+
+@node What Is Awk, Getting Started, Preface, Top
+@chapter Introduction
+
+If you are like many computer users, you would frequently like to make
+changes in various text files wherever certain patterns appear, or
+extract data from parts of certain lines while discarding the rest.  To
+write a program to do this in a language such as C or Pascal is a
+time-consuming inconvenience that may take many lines of code.  The job
+may be easier with @code{awk}.
+
+The @code{awk} utility interprets a special-purpose programming language
+that makes it possible to handle simple data-reformatting jobs
+with just a few lines of code.
+
+The GNU implementation of @code{awk} is called @code{gawk}; it is fully
+upward compatible with the System V Release 4 version of
+@code{awk}.  @code{gawk} is also upward compatible with the POSIX
+specification of the @code{awk} language.  This means that all
+properly written @code{awk} programs should work with @code{gawk}.
+Thus, we usually don't distinguish between @code{gawk} and other @code{awk}
+implementations.
+
+@cindex uses of @code{awk}
+Using @code{awk} you can:
+
+@itemize @bullet
+@item
+manage small, personal databases
+
+@item
+generate reports
+
+@item
+validate data
+
+@item
+produce indexes, and perform other document preparation tasks
+
+@item
+even experiment with algorithms that can be adapted later to other computer
+languages
+@end itemize
+
+@menu
+* This Manual::                 Using this @value{DOCUMENT}. Includes sample
+                                input files that you can use.
+* Conventions::                 Typographical Conventions.
+* Sample Data Files::           Sample data files for use in the @code{awk} 
+                                programs illustrated in this @value{DOCUMENT}.
+@end menu
+
+@node This Manual, Conventions, What Is Awk, What Is Awk
+@section Using This Book
+@cindex book, using this
+@cindex using this book
+@cindex language, @code{awk}
+@cindex program, @code{awk}
+@ignore
+@cindex @code{awk} language
+@cindex @code{awk} program
+@end ignore
+
+The term @code{awk} refers to a particular program, and to the language you
+use to tell this program what to do.  When we need to be careful, we call
+the program ``the @code{awk} utility'' and the language ``the @code{awk}
+language.''  The term @code{gawk} refers to a version of @code{awk} developed
+as part of the GNU project.  The purpose of this @value{DOCUMENT} is to explain
+both the @code{awk} language and how to run the @code{awk} utility.
+
+The main purpose of the @value{DOCUMENT} is to explain the features
+of @code{awk}, as defined in the POSIX standard.  It does so in the context
+of one particular implementation, @code{gawk}. While doing so, it will also
+attempt to describe important differences between @code{gawk} and other
+@code{awk} implementations.  Finally, any @code{gawk} features that
+are not in the POSIX standard for @code{awk} will be noted.
+
+@iftex
+This @value{DOCUMENT} has the difficult task of being both tutorial and reference.
+If you are a novice, feel free to skip over details that seem too complex.
+You should also ignore the many cross references; they are for the
+expert user, and for the on-line Info version of the document.
+@end iftex
+
+The term @dfn{@code{awk} program} refers to a program written by you in
+the @code{awk} programming language.
+
+@xref{Getting Started, ,Getting Started with @code{awk}}, for the bare
+essentials you need to know to start using @code{awk}.  
+
+Some useful ``one-liners'' are included to give you a feel for the
+@code{awk} language (@pxref{One-liners, ,Useful One Line Programs}).
+
+Many sample @code{awk} programs have been provided for you
+(@pxref{Library Functions, ,A Library of @code{awk} Functions}; also
+@pxref{Sample Programs, ,Practical @code{awk} Programs}).
+
+The entire @code{awk} language is summarized for quick reference in
+@ref{Gawk Summary, ,@code{gawk} Summary}.  Look there if you just need
+to refresh your memory about a particular feature.
+
+If you find terms that you aren't familiar with, try looking them
+up in the glossary (@pxref{Glossary}).
+
+Most of the time complete @code{awk} programs are used as examples, but in
+some of the more advanced sections, only the part of the @code{awk} program
+that illustrates the concept being described is shown.
+
+While this @value{DOCUMENT} is aimed principally at people who have not been
+exposed
+to @code{awk}, there is a lot of information here that even the @code{awk}
+expert should find useful.  In particular, the description of POSIX
+@code{awk}, and the example programs in
+@ref{Library Functions, ,A Library of @code{awk} Functions}, and
+@ref{Sample Programs, ,Practical @code{awk} Programs},
+should be of interest.
+
+@c fakenode --- for prepinfo
+@unnumberedsubsec Dark Corners
+
+@cindex d.c., see ``dark corner''
+@cindex dark corner
+Until the POSIX standard (and @cite{The GAWK Manual}),
+many features of @code{awk} were either poorly documented, or not
+documented at all.  Descriptions of such features
+(often called ``dark corners'') are noted in this @value{DOCUMENT} with
+``(d.c.)''.
+They also appear in the index under the heading ``dark corner.''
+
+@node Conventions, Sample Data Files, This Manual, What Is Awk
+@section Typographical Conventions
+
+This @value{DOCUMENT} is written using Texinfo, the GNU documentation formatting language.
+A single Texinfo source file is used to produce both the printed and on-line
+versions of the documentation.
+@iftex
+Because of this, the typographical conventions
+differ slightly from those in other books you may have read.
+@end iftex
+@ifinfo
+This section briefly documents the typographical conventions used in Texinfo.
+@end ifinfo
+
+Examples you would type at the command line are preceded by the common
+shell primary and secondary prompts, @samp{$} and @samp{>}.
+Output from the command is preceded by the glyph ``@print{}''.
+This typically represents the command's standard output.
+Error messages, and other output on the command's standard error, are preceded
+by the glyph ``@error{}''.  For example:
+
+@example
+$ echo hi on stdout
+@print{} hi on stdout
+$ echo hello on stderr 1>&2
+@error{} hello on stderr
+@end example
+
+@iftex
+In the text, command names appear in @code{this font}, while code segments
+appear in the same font and quoted, @samp{like this}.  Some things will
+be emphasized @emph{like this}, and if a point needs to be made
+strongly, it will be done @strong{like this}.  The first occurrence of
+a new term is usually its @dfn{definition}, and appears in the same
+font as the previous occurrence of ``definition'' in this sentence.
+File names are indicated like this: @file{/path/to/ourfile}.
+@end iftex
+
+Characters that you type at the keyboard look @kbd{like this}.  In particular,
+there are special characters called ``control characters.''  These are
+characters that you type by holding down both the @kbd{CONTROL} key and
+another key, at the same time.  For example, a @kbd{Control-d} is typed
+by first pressing and holding the @kbd{CONTROL} key, next
+pressing the @kbd{d} key, and finally releasing both keys.
+
+@node Sample Data Files,  , Conventions, What Is Awk
+@section Data Files for the Examples
+
+@cindex input file, sample
+@cindex sample input file
+@cindex @file{BBS-list} file
+Many of the examples in this @value{DOCUMENT} take their input from two sample
+data files.  The first, called @file{BBS-list}, represents a list of
+computer bulletin board systems together with information about those systems.
+The second data file, called @file{inventory-shipped}, contains
+information about shipments on a monthly basis.  In both files,
+each line is considered to be one @dfn{record}.
+
+In the file @file{BBS-list}, each record contains the name of a computer
+bulletin board, its phone number, the board's baud rate(s), and a code for
+the number of hours it is operational.  An @samp{A} in the last column
+means the board operates 24 hours a day.  A @samp{B} in the last
+column means the board operates only during evening and weekend hours.  A
+@samp{C} means the board operates only on weekends.
+
+@c 2e: Update the baud rates to reflect today's faster modems
+@example
+@c system mkdir eg
+@c system mkdir eg/lib
+@c system mkdir eg/data
+@c system mkdir eg/prog
+@c system mkdir eg/misc
+@c file eg/data/BBS-list
+aardvark     555-5553     1200/300          B
+alpo-net     555-3412     2400/1200/300     A
+barfly       555-7685     1200/300          A
+bites        555-1675     2400/1200/300     A
+camelot      555-0542     300               C
+core         555-2912     1200/300          C
+fooey        555-1234     2400/1200/300     B
+foot         555-6699     1200/300          B
+macfoo       555-6480     1200/300          A
+sdace        555-3430     2400/1200/300     A
+sabafoo      555-2127     1200/300          C
+@c endfile
+@end example
+
+@cindex @file{inventory-shipped} file
+The second data file, called @file{inventory-shipped}, represents
+information about shipments during the year.  
+Each record contains the month of the year, the number
+of green crates shipped, the number of red boxes shipped, the number of
+orange bags shipped, and the number of blue packages shipped,
+respectively.  There are 16 entries, covering the 12 months of one year
+and four months of the next year.
+
+@example
+@c file eg/data/inventory-shipped
+Jan  13  25  15 115
+Feb  15  32  24 226
+Mar  15  24  34 228
+Apr  31  52  63 420
+May  16  34  29 208
+Jun  31  42  75 492
+Jul  24  34  67 436
+Aug  15  34  47 316
+Sep  13  55  37 277
+Oct  29  54  68 525
+Nov  20  87  82 577
+Dec  17  35  61 401
+
+Jan  21  36  64 620
+Feb  26  58  80 652
+Mar  24  75  70 495
+Apr  21  70  74 514
+@c endfile
+@end example
+
+@ifinfo
+If you are reading this in GNU Emacs using Info, you can copy the regions
+of text showing these sample files into your own test files.  This way you
+can try out the examples shown in the remainder of this document.  You do
+this by using the command @kbd{M-x write-region} to copy text from the Info
+file into a file for use with @code{awk}
+(@pxref{Misc File Ops, , Miscellaneous File Operations, emacs, GNU Emacs Manual},
+for more information).  Using this information, create your own
+@file{BBS-list} and @file{inventory-shipped} files, and practice what you
+learn in this @value{DOCUMENT}.
+
+If you are using the stand-alone version of Info,
+see @ref{Extract Program, ,Extracting Programs from Texinfo Source Files},
+for an @code{awk} program that will extract these data files from
+@file{gawk.texi}, the Texinfo source file for this Info file.
+@end ifinfo
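+
+For readers not using Emacs, one way to create the two sample files is
+with shell here-documents.  The following is only a minimal sketch and
+is not part of the @code{gawk} distribution; the scratch directory and
+the final record counts are additions made for this illustration.
+

```shell
# Work in a scratch directory so that no existing files are overwritten.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# The quoted delimiter ('EOF') keeps the shell from expanding anything
# inside the here-document.
cat > BBS-list <<'EOF'
aardvark     555-5553     1200/300          B
alpo-net     555-3412     2400/1200/300     A
barfly       555-7685     1200/300          A
bites        555-1675     2400/1200/300     A
camelot      555-0542     300               C
core         555-2912     1200/300          C
fooey        555-1234     2400/1200/300     B
foot         555-6699     1200/300          B
macfoo       555-6480     1200/300          A
sdace        555-3430     2400/1200/300     A
sabafoo      555-2127     1200/300          C
EOF

cat > inventory-shipped <<'EOF'
Jan  13  25  15 115
Feb  15  32  24 226
Mar  15  24  34 228
Apr  31  52  63 420
May  16  34  29 208
Jun  31  42  75 492
Jul  24  34  67 436
Aug  15  34  47 316
Sep  13  55  37 277
Oct  29  54  68 525
Nov  20  87  82 577
Dec  17  35  61 401

Jan  21  36  64 620
Feb  26  58  80 652
Mar  24  75  70 495
Apr  21  70  74 514
EOF

# Sanity check: count the non-empty lines in each file.
bbs_records=$(grep -c . BBS-list)
inv_records=$(grep -c . inventory-shipped)
echo "BBS-list: $bbs_records records"
echo "inventory-shipped: $inv_records records"
```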
+
+@node Getting Started, One-liners, What Is Awk, Top
+@chapter Getting Started with @code{awk}
+@cindex script, definition of
+@cindex rule, definition of
+@cindex program, definition of
+@cindex basic function of @code{awk}
+
+The basic function of @code{awk} is to search files for lines (or other
+units of text) that contain certain patterns.  When a line matches one
+of the patterns, @code{awk} performs specified actions on that line.
+@code{awk} keeps processing input lines in this way until the end of the
+input files is reached.
+
+@cindex data-driven languages
+@cindex procedural languages
+@cindex language, data-driven
+@cindex language, procedural
+Programs in @code{awk} are different from programs in most other languages,
+because @code{awk} programs are @dfn{data-driven}; that is, you describe
+the data you wish to work with, and then what to do when you find it.
+Most other languages are @dfn{procedural}; you have to describe, in great
+detail, every step the program is to take.  When working with procedural
+languages, it is usually much
+harder to clearly describe the data your program will process.
+For this reason, @code{awk} programs are often refreshingly easy to both
+write and read.
+
+@cindex program, definition of
+@cindex rule, definition of
+When you run @code{awk}, you specify an @code{awk} @dfn{program} that
+tells @code{awk} what to do.  The program consists of a series of
+@dfn{rules}.  (It may also contain @dfn{function definitions},
+an advanced feature which we will ignore for now.
+@xref{User-defined, ,User-defined Functions}.)  Each rule specifies one
+pattern to search for, and one action to perform when that pattern is found.
+
+Syntactically, a rule consists of a pattern followed by an action.  The
+action is enclosed in curly braces to separate it from the pattern.
+Rules are usually separated by newlines.  Therefore, an @code{awk}
+program looks like this:
+
+@example
+@var{pattern} @{ @var{action} @}
+@var{pattern} @{ @var{action} @}
+@dots{}
+@end example
+
+@menu
+* Names::                       What name to use to find @code{awk}.
+* Running gawk::                How to run @code{gawk} programs; includes
+                                command line syntax.
+* Very Simple::                 A very simple example.
+* Two Rules::                   A less simple one-line example with two rules.
+* More Complex::                A more complex example.
+* Statements/Lines::            Subdividing or combining statements into
+                                lines.
+* Other Features::              Other Features of @code{awk}.
+* When::                        When to use @code{gawk} and when to use other
+                                things.
+@end menu
+
+@node Names, Running gawk , Getting Started, Getting Started
+@section A Rose By Any Other Name
+
+@cindex old @code{awk} vs. new @code{awk}
+@cindex new @code{awk} vs. old @code{awk}
+The @code{awk} language has evolved over the years. Full details are
+provided in @ref{Language History, ,The Evolution of the @code{awk} Language}.
+The language described in this @value{DOCUMENT}
+is often referred to as ``new @code{awk}.''
+
+Because the language has changed over time, many systems have multiple
+versions of @code{awk}.
+Some systems have an @code{awk} utility that implements the
+original version of the @code{awk} language, and a @code{nawk} utility
+for the new version.  Others have an @code{oawk} for the ``old @code{awk}''
+language, and plain @code{awk} for the new one.  Still others only
+have one version, usually the new one.@footnote{Often, these systems
+use @code{gawk} for their @code{awk} implementation!}
+
+All in all, this makes it difficult for you to know which version of
+@code{awk} you should run when writing your programs.  The best advice
+we can give here is to check your local documentation. Look for @code{awk},
+@code{oawk}, and @code{nawk}, as well as for @code{gawk}. Chances are, you
+will have some version of new @code{awk} on your system, and that is what
+you should use when running your programs.  (Of course, if you're reading
+this @value{DOCUMENT}, chances are good that you have @code{gawk}!)
+
+Throughout this @value{DOCUMENT}, whenever we refer to a language feature
+that should be available in any complete implementation of POSIX @code{awk},
+we simply use the term @code{awk}.  When referring to a feature that is
+specific to the GNU implementation, we use the term @code{gawk}.
+
+@node Running gawk, Very Simple, Names, Getting Started
+@section How to Run @code{awk} Programs
+
+@cindex command line formats
+@cindex running @code{awk} programs
+There are several ways to run an @code{awk} program.  If the program is
+short, it is easiest to include it in the command that runs @code{awk},
+like this:
+
+@example
+awk '@var{program}' @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+@noindent
+where @var{program} consists of a series of patterns and actions, as
+described earlier.
+(The reason for the single quotes is described below, in
+@ref{One-shot, ,One-shot Throw-away @code{awk} Programs}.)
+
+When the program is long, it is usually more convenient to put it in a file
+and run it with a command like this:
+
+@example
+awk -f @var{program-file} @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+@menu
+* One-shot::                    Running a short throw-away @code{awk} program.
+* Read Terminal::               Using no input files (input from terminal
+                                instead).
+* Long::                        Putting permanent @code{awk} programs in
+                                files.
+* Executable Scripts::          Making self-contained @code{awk} programs.
+* Comments::                    Adding documentation to @code{gawk} programs.
+@end menu
+
+@node One-shot, Read Terminal, Running gawk, Running gawk
+@subsection One-shot Throw-away @code{awk} Programs
+
+Once you are familiar with @code{awk}, you will often type in simple
+programs the moment you want to use them.  Then you can write the
+program as the first argument of the @code{awk} command, like this:
+
+@example
+awk '@var{program}' @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+@noindent
+where @var{program} consists of a series of @var{patterns} and
+@var{actions}, as described earlier.
+
+@cindex single quotes, why needed
+This command format instructs the @dfn{shell}, or command interpreter,
+to start @code{awk} and use the @var{program} to process records in the
+input file(s).  There are single quotes around @var{program} so that
+the shell doesn't interpret any @code{awk} characters as special shell
+characters.  They also cause the shell to treat all of @var{program} as
+a single argument for @code{awk} and allow @var{program} to be more
+than one line long.
+
+This format is also useful for running short or medium-sized @code{awk}
+programs from shell scripts, because it avoids the need for a separate
+file for the @code{awk} program.  A self-contained shell script is more
+reliable since there are no other files to misplace.
+
+@ref{One-liners, , Useful One Line Programs}, presents several short,
+self-contained programs.
+
+@iftex
+@page
+@end iftex
+As an interesting side point, the command
+
+@example
+awk '/foo/' @var{files} @dots{}
+@end example
+
+@noindent
+is essentially the same as
+
+@cindex @code{egrep}
+@example
+egrep foo @var{files} @dots{}
+@end example
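+
+The equivalence can be checked on a small file.  This sketch uses
+@samp{grep -E}, the POSIX spelling of @code{egrep}; substitute
+@code{egrep} where that is what your system provides.  The three-line
+file @file{boards} is a hypothetical stand-in for real data.
+

```shell
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf '%s\n' 'fooey 555-1234' 'core 555-2912' 'macfoo 555-6480' > boards

# A pattern with no action prints every line that matches it.
awk_out=$(awk '/foo/' boards)
# grep -E prints the lines that match the same regular expression.
grep_out=$(grep -E foo boards)

printf '%s\n' "$awk_out"
[ "$awk_out" = "$grep_out" ] && echo "outputs are identical"
```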
+
+@node Read Terminal, Long, One-shot, Running gawk
+@subsection Running @code{awk} without Input Files
+
+@cindex standard input
+@cindex input, standard
+You can also run @code{awk} without any input files.  If you type the
+command line:
+
+@example
+awk '@var{program}'
+@end example
+
+@noindent
+then @code{awk} applies the @var{program} to the @dfn{standard input},
+which usually means whatever you type on the terminal.  This continues
+until you indicate end-of-file by typing @kbd{Control-d}.
+(On other operating systems, the end-of-file character may be different.
+For example, on OS/2 and MS-DOS, it is @kbd{Control-z}.)
+
+For example, the following program prints a friendly piece of advice
+(from Douglas Adams' @cite{The Hitchhiker's Guide to the Galaxy}),
+to keep you from worrying about the complexities of computer programming
+(@samp{BEGIN} is a feature we haven't discussed yet).
+
+@example
+$ awk "BEGIN @{ print \"Don't Panic!\" @}"
+@print{} Don't Panic!
+@end example
+
+@cindex quoting, shell
+@cindex shell quoting
+This program does not read any input.  The @samp{\} before each of the
+inner double quotes is necessary because of the shell's quoting rules:
+the message contains a single quote (the apostrophe in ``Don't''), so the
+program is enclosed in double quotes, and the double quotes inside it
+must then be escaped.
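+
+Other quoting styles give the same result.  As a sketch, here is the
+form used above next to two common alternatives that are not part of
+the original example: ending the single-quoted string, adding an
+escaped single quote, and starting a new one; and writing the single
+quote as the octal escape @samp{\047} inside the @code{awk} string.
+

```shell
# 1. Double quotes outside, backslash-escaped double quotes inside.
out1=$(awk "BEGIN { print \"Don't Panic!\" }")

# 2. Single quotes outside; the sequence '\'' ends the quoted string,
#    supplies a literal single quote, and starts a new quoted string.
out2=$(awk 'BEGIN { print "Don'\''t Panic!" }')

# 3. Single quotes outside; \047 is the octal escape for a single
#    quote inside an awk string constant.
out3=$(awk 'BEGIN { print "Don\047t Panic!" }')

printf '%s\n' "$out1" "$out2" "$out3"
```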
+
+This next simple @code{awk} program
+emulates the @code{cat} utility; it copies whatever you type at the
+keyboard to its standard output. (Why this works is explained shortly.)
+
+@example
+$ awk '@{ print @}'
+Now is the time for all good men
+@print{} Now is the time for all good men
+to come to the aid of their country.
+@print{} to come to the aid of their country.
+Four score and seven years ago, ...
+@print{} Four score and seven years ago, ...
+What, me worry?
+@print{} What, me worry?
+@kbd{Control-d}
+@end example
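+
+Standard input need not be the keyboard.  As a small sketch, the same
+program can read from a pipe or from a redirected file; in that case
+no @kbd{Control-d} is needed, because end-of-file arrives when the
+input is exhausted.  The scratch file is created only for this
+illustration.
+

```shell
# Input from a pipe: awk reads what printf writes.
piped=$(printf '%s\n' 'What, me worry?' | awk '{ print }')
echo "$piped"

# Input redirected from a file (a scratch file made just for this demo).
tmp=$(mktemp) || exit 1
printf '%s\n' 'Now is the time' 'for all good men' > "$tmp"
redirected=$(awk '{ print }' < "$tmp")
echo "$redirected"
rm -f "$tmp"
```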
+
+@node Long, Executable Scripts, Read Terminal, Running gawk
+@subsection Running Long Programs
+
+@cindex running long programs
+@cindex @code{-f} option
+@cindex program file
+@cindex file, @code{awk} program
+Sometimes your @code{awk} programs can be very long.  In this case it is
+more convenient to put the program into a separate file.  To tell
+@code{awk} to use that file for its program, you type:
+
+@example
+awk -f @var{source-file} @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+The @samp{-f} instructs the @code{awk} utility to get the @code{awk} program
+from the file @var{source-file}.  Any file name can be used for
+@var{source-file}.  For example, you could put the program:
+
+@example
+BEGIN @{ print "Don't Panic!" @}
+@end example
+
+@noindent
+into the file @file{advice}.  Then this command:
+
+@example
+awk -f advice
+@end example
+
+@noindent
+does the same thing as this one:
+
+@example
+awk "BEGIN @{ print \"Don't Panic!\" @}"
+@end example
+
+@cindex quoting, shell
+@cindex shell quoting
+@noindent
+which was explained earlier (@pxref{Read Terminal, ,Running @code{awk} without Input Files}).
+Note that you don't usually need single quotes around the file name that you
+specify with @samp{-f}, because most file names don't contain any of the shell's
+special characters.  Notice that in @file{advice}, the @code{awk}
+program did not have single quotes around it.  The quotes are only needed
+for programs that are provided on the @code{awk} command line.
+
+If you want to identify your @code{awk} program files clearly as such,
+you can add the extension @file{.awk} to the file name.  This doesn't
+affect the execution of the @code{awk} program, but it does make
+``housekeeping'' easier.
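+
+Putting these pieces together, the following sketch creates a program
+file from the shell and then runs it.  The name @file{advice.awk}
+(using the optional @file{.awk} extension) and the scratch directory
+are choices made for this illustration.
+

```shell
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Inside a program file the awk program needs no shell quoting.
cat > advice.awk <<'EOF'
BEGIN { print "Don't Panic!" }
EOF

# -f names the file that holds the program.
from_file=$(awk -f advice.awk)
echo "$from_file"
```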
+
+@node Executable Scripts, Comments, Long, Running gawk
+@subsection Executable @code{awk} Programs
+@cindex executable scripts
+@cindex scripts, executable
+@cindex self contained programs
+@cindex program, self contained
+@cindex @code{#!} (executable scripts)
+
+Once you have learned @code{awk}, you may want to write self-contained
+@code{awk} scripts, using the @samp{#!} script mechanism.  You can do
+this on many Unix systems@footnote{The @samp{#!} mechanism works on
+Linux systems,
+Unix systems derived from Berkeley Unix, System V Release 4, and some System
+V Release 3 systems.} (and someday on the GNU system).
+
+For example, you could update the file @file{advice} to look like this:
+
+@example
+#! /bin/awk -f
+
+BEGIN    @{ print "Don't Panic!" @}
+@end example
+
+@noindent
+After making this file executable (with the @code{chmod} utility), you
+can simply type @samp{advice}
+at the shell, and the system will arrange to run @code{awk}@footnote{The
+line beginning with @samp{#!} lists the full file name of an interpreter
+to be run, and an optional initial command line argument to pass to that
+interpreter.  The operating system then runs the interpreter with the given
+argument and the full argument list of the executed program.  The first argument
+in the list is the full file name of the @code{awk} program.  The rest of the
+argument list will either be options to @code{awk}, or data files,
+or both.} as if you had typed @samp{awk -f advice}.
+
+@example
+$ advice
+@print{} Don't Panic!
+@end example
+
+@noindent
+Self-contained @code{awk} scripts are useful when you want to write a
+program which users can invoke without their having to know that the program is
+written in @code{awk}.
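+
+The whole procedure can be sketched as follows.  Because @code{awk} is
+not installed as @file{/bin/awk} on every system, this sketch looks up
+its location with @samp{command -v} instead of assuming a path.  That
+lookup, the scratch directory, and the @file{./} prefix (needed when
+the current directory is not in the search path) are additions for
+illustration only.
+

```shell
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Find the full file name of awk; the '#!' line needs an absolute path.
awk_path=$(command -v awk) || exit 1

# Write the script: the '#!' line, a blank line, then the awk program.
printf '#! %s -f\n\n' "$awk_path" > advice
printf '%s\n' 'BEGIN    { print "Don'\''t Panic!" }' >> advice

# Make the file executable, then run it like any other command.
chmod +x advice
script_out=$(./advice)
echo "$script_out"
```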
+
+@cindex shell scripts
+@cindex scripts, shell
+Some older systems do not support the @samp{#!} mechanism. You can get a
+similar effect using a regular shell script.  It would look something
+like this:
+
+@example
+: The colon ensures execution by the standard shell.
+awk '@var{program}' "$@@"
+@end example
+
+Using this technique, it is @emph{vital} to enclose the @var{program} in
+single quotes to protect it from interpretation by the shell.  If you
+omit the quotes, only a shell wizard can predict the results.
+
+The @code{"$@@"} causes the shell to forward all the command line
+arguments to the @code{awk} program, without interpretation.  The first
+line, which starts with a colon, is used so that this shell script will
+work even if invoked by a user who uses the C shell.  (Not all older systems
+obey this convention, but many do.)
+@c 2e:
+@c Someday: (See @cite{The Bourne Again Shell}, by ??.)
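+
+As a concrete sketch of such a wrapper, the script below prefixes each
+input line with the name of the file it came from, using the standard
+@code{awk} variable @code{FILENAME}.  The script name @file{show} and
+the data file @file{my data} are hypothetical; the script is run with
+@samp{sh show} so that it works without being made executable.
+

```shell
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# The wrapper: a shell script whose only job is to run awk.
# "$@" passes along every argument exactly as the user typed it.
cat > show <<'EOF'
: The colon ensures execution by the standard shell.
awk '{ print FILENAME ": " $0 }' "$@"
EOF

# A data file whose name contains a space, to show that "$@"
# keeps each argument intact.
echo 'hello' > 'my data'

wrapped=$(sh show 'my data')
echo "$wrapped"
```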
+
+@node Comments,  , Executable Scripts, Running gawk
+@subsection Comments in @code{awk} Programs
+@cindex @code{#} (comment)
+@cindex comments
+@cindex use of comments
+@cindex documenting @code{awk} programs
+@cindex programs, documenting
+
+A @dfn{comment} is some text that is included in a program for the sake
+of human readers; it is not really part of the program.  Comments
+can explain what the program does, and how it works.  Nearly all
+programming languages have provisions for comments, because programs are
+typically hard to understand without them.
+
+In the @code{awk} language, a comment starts with the sharp sign
+character, @samp{#}, and continues to the end of the line.
+The @samp{#} does not have to be the first character on the line. The
+@code{awk} language ignores the rest of a line following a sharp sign.
+For example, we could have put the following into @file{advice}:
+
+@example
+# This program prints a nice friendly message.  It helps
+# keep novice users from being afraid of the computer.
+BEGIN    @{ print "Don't Panic!" @}
+@end example
+
+You can put comment lines into keyboard-composed throw-away @code{awk}
+programs also, but this usually isn't very useful; the purpose of a
+comment is to help you or another person understand the program at
+a later time.
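+
+A comment can also follow code on the same line.  As a small sketch
+(the wording of the comment is our own, not part of the original
+@file{advice} file):
+
+@example
+BEGIN    @{ print "Don't Panic!" @}    # print a reassuring message
+@end example
+
+@noindent
+Everything from the @samp{#} to the end of the line is ignored, so this
+program behaves exactly like the version without the comment.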
+
+@node Very Simple, Two Rules, Running gawk, Getting Started
+@section A Very Simple Example
+
+The following command runs a simple @code{awk} program that searches the
+input file @file{BBS-list} for the string of characters: @samp{foo}.  (A
+string of characters is usually called a @dfn{string}.
+The term @dfn{string} is perhaps based on similar usage in English, such
+as ``a string of pearls,'' or, ``a string of cars in a train.'')
+
+@example
+awk '/foo/ @{ print $0 @}' BBS-list
+@end example
+
+@noindent
+When lines containing @samp{foo} are found, they are printed, because
+@w{@samp{print $0}} means print the current line.  (Just @samp{print} by
+itself means the same thing, so we could have written that
+instead.)
+
+You will notice that slashes, @samp{/}, surround the string @samp{foo}
+in the @code{awk} program.  The slashes indicate that @samp{foo}
+is a pattern to search for.  This type of pattern is called a
+@dfn{regular expression}, and is covered in more detail later
+(@pxref{Regexp, ,Regular Expressions}).
+The pattern is allowed to match parts of words.
+Single quotes surround the @code{awk} program so that the shell won't
+interpret any of it as special shell characters.
+
+Here is what this program prints:
+
+@example
+@group
+$ awk '/foo/ @{ print $0 @}' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sabafoo      555-2127     1200/300          C
+@end group
+@end example
+
+@cindex action, default
+@cindex pattern, default
+@cindex default action
+@cindex default pattern
+In an @code{awk} rule, either the pattern or the action can be omitted,
+but not both.  If the pattern is omitted, then the action is performed
+for @emph{every} input line.  If the action is omitted, the default
+action is to print all lines that match the pattern.
+
+@cindex empty action
+@cindex action, empty
+Thus, we could leave out the action (the @code{print} statement and the curly
+braces) in the above example, and the result would be the same: all
+lines matching the pattern @samp{foo} would be printed.  By comparison,
+omitting the @code{print} statement but retaining the curly braces makes an
+empty action that does nothing; then no lines would be printed.
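+
+The difference can be seen directly.  In this sketch, which uses the
+same @file{BBS-list} sample file, the first command omits the action and
+so prints the matching lines, while the second has an empty action and
+prints nothing:
+
+@example
+$ awk '/foo/' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sabafoo      555-2127     1200/300          C
+$ awk '/foo/ @{ @}' BBS-list
+@end example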
+
+@node Two Rules, More Complex, Very Simple, Getting Started
+@section An Example with Two Rules
+@cindex how @code{awk} works
+
+The @code{awk} utility reads the input files one line at a
+time.  For each line, @code{awk} tries the patterns of each of the rules.
+If several patterns match then several actions are run, in the order in
+which they appear in the @code{awk} program.  If no patterns match, then
+no actions are run.
+
+After processing all the rules (perhaps none) that match the line,
+@code{awk} reads the next line (however,
+@pxref{Next Statement, ,The @code{next} Statement},
+and also @pxref{Nextfile Statement, ,The @code{nextfile} Statement}).
+This continues until the end of the file is reached.
+
+For example, the @code{awk} program:
+
+@example
+/12/  @{ print $0 @}
+/21/  @{ print $0 @}
+@end example
+
+@noindent
+contains two rules.  The first rule has the string @samp{12} as the
+pattern and @samp{print $0} as the action.  The second rule has the
+string @samp{21} as the pattern and also has @samp{print $0} as the
+action.  Each rule's action is enclosed in its own pair of braces.
+
+This @code{awk} program prints every line that contains the string
+@samp{12} @emph{or} the string @samp{21}.  If a line contains both
+strings, it is printed twice, once by each rule.
+
+This is what happens if we run this program on our two sample data files,
+@file{BBS-list} and @file{inventory-shipped}, as shown here:
+
+@example
+$ awk '/12/ @{ print $0 @}
+>      /21/ @{ print $0 @}' BBS-list inventory-shipped
+@print{} aardvark     555-5553     1200/300          B
+@print{} alpo-net     555-3412     2400/1200/300     A
+@print{} barfly       555-7685     1200/300          A
+@print{} bites        555-1675     2400/1200/300     A
+@print{} core         555-2912     1200/300          C
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sdace        555-3430     2400/1200/300     A
+@print{} sabafoo      555-2127     1200/300          C
+@print{} sabafoo      555-2127     1200/300          C
+@print{} Jan  21  36  64 620
+@print{} Apr  21  70  74 514
+@end example
+
+@noindent
+Note how the line in @file{BBS-list} beginning with @samp{sabafoo}
+was printed twice, once for each rule.
+
+@node More Complex, Statements/Lines, Two Rules, Getting Started
+@section A More Complex Example
+
+@ignore
+We have to use ls -lg here to get portable output across Unix systems.
+The POSIX ls matches this behavior too. Sigh.
+@end ignore
+Here is an example to give you an idea of what typical @code{awk}
+programs do.  This example shows how @code{awk} can be used to
+summarize, select, and rearrange the output of another utility.  It uses
+features that haven't been covered yet, so don't worry if you don't
+understand all the details.
+
+@example
+ls -lg | awk '$6 == "Nov" @{ sum += $5 @}
+             END @{ print sum @}'
+@end example
+
+@cindex @code{csh}, backslash continuation
+@cindex backslash continuation in @code{csh}
+This command prints the total number of bytes in all the files in the
+current directory that were last modified in November (of any year).
+(In the C shell you would need to type a semicolon and then a backslash
+at the end of the first line; in a POSIX-compliant shell, such as the
+Bourne shell or Bash, the GNU Bourne-Again shell, you can type the example
+as shown.)
+@ignore
+FIXME:  how can users tell what shell they are running?  Need a footnote
+or something, but getting into this is a distraction.
+@end ignore
+
+The @w{@samp{ls -lg}} part of this example is a system command that gives
+you a listing of the files in a directory, including file size and the date
+the file was last modified. Its output looks like this:
+
+@example
+-rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile
+-rw-r--r--  1 arnold   user  10809 Nov  7 13:03 gawk.h
+-rw-r--r--  1 arnold   user    983 Apr 13 12:14 gawk.tab.h
+-rw-r--r--  1 arnold   user  31869 Jun 15 12:20 gawk.y
+-rw-r--r--  1 arnold   user  22414 Nov  7 13:03 gawk1.c
+-rw-r--r--  1 arnold   user  37455 Nov  7 13:03 gawk2.c
+-rw-r--r--  1 arnold   user  27511 Dec  9 13:07 gawk3.c
+-rw-r--r--  1 arnold   user   7989 Nov  7 13:03 gawk4.c
+@end example
+
+@noindent
+The first field contains read-write permissions, the second field contains
+the number of links to the file, and the third field identifies the owner of
+the file. The fourth field identifies the group of the file.
+The fifth field contains the size of the file in bytes.  The
+sixth, seventh and eighth fields contain the month, day, and time,
+respectively, that the file was last modified.  Finally, the ninth field
+contains the name of the file.
+
+@cindex automatic initialization
+@cindex initialization, automatic
+The @samp{$6 == "Nov"} in our @code{awk} program is an expression that
+tests whether the sixth field of the output from @w{@samp{ls -lg}}
+matches the string @samp{Nov}.  Each time a line has the string
+@samp{Nov} for its sixth field, the action @samp{sum += $5} is
+performed.  This adds the fifth field (the file size) to the variable
+@code{sum}.  As a result, when @code{awk} has finished reading all the
+input lines, @code{sum} is the sum of the sizes of files whose
+lines matched the pattern.  (This works because @code{awk} variables
+are automatically initialized to zero.)
+
+After the last line of output from @code{ls} has been processed, the
+@code{END} rule is executed, and the value of @code{sum} is
+printed.  In this example, the value of @code{sum} would be 80600.
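+
+As a minimal illustration of automatic initialization (the variable
+name @code{total} is arbitrary and chosen only for this sketch), a
+variable that has never been assigned acts as the empty string in a
+string context and as zero in a numeric context:
+
+@example
+$ awk 'BEGIN @{ print "[" total "]", total + 0 @}'
+@print{} [] 0
+@end example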
+
+These more advanced @code{awk} techniques are covered in later sections
+(@pxref{Action Overview, ,Overview of Actions}).  Before you can move on to more
+advanced @code{awk} programming, you have to know how @code{awk} interprets
+your input and displays your output.  By manipulating fields and using
+@code{print} statements, you can produce some very useful and
+impressive-looking reports.
+
+@node Statements/Lines, Other Features, More Complex, Getting Started
+@section @code{awk} Statements Versus Lines
+@cindex line break
+@cindex newline
+
+Most often, each line in an @code{awk} program is a separate statement or
+separate rule, like this:
+
+@example
+awk '/12/  @{ print $0 @}
+     /21/  @{ print $0 @}' BBS-list inventory-shipped
+@end example
+
+However, @code{gawk} will ignore newlines after any of the following:
+
+@example
+,    @{    ?    :    ||    &&    do    else
+@end example
+
+@noindent
+A newline at any other point is considered the end of the statement.
+(Splitting lines after @samp{?} and @samp{:} is a minor @code{gawk}
+extension.  The @samp{?} and @samp{:} referred to here are the operators
+of the three-operand conditional expression described in
+@ref{Conditional Exp, ,Conditional Expressions}.)
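+
+For instance, the following sketch (our own illustration, not one of
+the sample programs) breaks a condition after @samp{&&} and a
+@code{print} list after a comma, with no backslashes needed:
+
+@example
+$ awk 'BEGIN @{
+>     if (1 < 2 &&
+>         2 < 3)
+>         print "both tests",
+>               "are true"
+> @}'
+@print{} both tests are true
+@end example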
+
+@cindex backslash continuation
+@cindex continuation of lines
+@cindex line continuation
+If you would like to split a single statement into two lines at a point
+where a newline would terminate it, you can @dfn{continue} it by ending the
+first line with a backslash character, @samp{\}.  The backslash must be
+the final character on the line to be recognized as a continuation
+character.  This is allowed absolutely anywhere in the statement, even
+in the middle of a string or regular expression.  For example:
+
+@example
+awk '/This regular expression is too long, so continue it\
+ on the next line/ @{ print $1 @}'
+@end example
+
+@noindent
+@cindex portability issues
+We have generally not used backslash continuation in the sample programs
+in this @value{DOCUMENT}.  Since in @code{gawk} there is no limit on the
+length of a line, it is never strictly necessary; it just makes programs
+more readable.  For this same reason, as well as for clarity, we have
+kept most statements short in the sample programs presented throughout
+the @value{DOCUMENT}.  Backslash continuation is most useful when your
+@code{awk} program is in a separate source file, instead of typed in on
+the command line.  You should also note that many @code{awk}
+implementations are more particular about where you may use backslash
+continuation. For example, they may not allow you to split a string
+constant using backslash continuation.  Thus, for maximal portability of
+your @code{awk} programs, it is best not to split your lines in the
+middle of a regular expression or a string.
+
+@cindex @code{csh}, backslash continuation
+@cindex backslash continuation in @code{csh}
+@strong{Caution: backslash continuation does not work as described above
+with the C shell.}  Continuation with backslash works for @code{awk}
+programs in files, and also for one-shot programs @emph{provided} you
+are using a POSIX-compliant shell, such as the Bourne shell or Bash, the
+GNU Bourne-Again shell.  But the C shell (@code{csh}) behaves
+differently!  There, you must use two backslashes in a row, followed by
+a newline.  Note also that when using the C shell, @emph{every} newline
+in your @code{awk} program must be escaped with a backslash. To illustrate:
+
+@example
+% awk 'BEGIN @{ \
+?   print \\
+?       "hello, world" \
+? @}'
+@print{} hello, world
+@end example
+
+@noindent
+Here, the @samp{%} and @samp{?} are the C shell's primary and secondary
+prompts, analogous to the standard shell's @samp{$} and @samp{>}.
+
+@code{awk} is a line-oriented language.  Each rule's action has to
+begin on the same line as the pattern.  To have the pattern and action
+on separate lines, you @emph{must} use backslash continuation---there
+is no other way.
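+
+A short sketch of such a rule, as it might appear in a program file
+(the @samp{/12/} pattern is borrowed from the earlier two-rule example):
+
+@example
+/12/ \
+@{ print $0 @}
+@end example
+
+@noindent
+Without the backslash, @code{awk} would see two separate rules: the
+pattern @samp{/12/} with the default action, and an action with no
+pattern, which prints every line.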
+
+@cindex multiple statements on one line
+When @code{awk} statements within one rule are short, you might want to put
+more than one of them on a line.  You do this by separating the statements
+with a semicolon, @samp{;}.
+
+This also applies to the rules themselves.
+Thus, the previous program could have been written:
+
+@example
+/12/ @{ print $0 @} ; /21/ @{ print $0 @}
+@end example
+
+@noindent
+@strong{Note:} the requirement that rules on the same line must be
+separated with a semicolon was not in the original @code{awk}
+language; it was added for consistency with the treatment of statements
+within an action.
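+
+Within a single action, the semicolon works the same way.  As a small
+sketch (the variable name @code{n} is arbitrary), these two statements
+share one line:
+
+@example
+$ echo "a b c" | awk '@{ n = NF ; print n, $n @}'
+@print{} 3 c
+@end example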
+
+@node Other Features, When, Statements/Lines, Getting Started
+@section Other Features of @code{awk}
+
+The @code{awk} language provides a number of predefined, or built-in, variables, which
+your programs can use to get information from @code{awk}.  There are other
+variables your program can set to control how @code{awk} processes your
+data.
+
+In addition, @code{awk} provides a number of built-in functions for doing
+common computational and string related operations.
+
+As we develop our presentation of the @code{awk} language, we introduce
+most of the variables and many of the functions. They are defined
+systematically in @ref{Built-in Variables}, and
+@ref{Built-in, ,Built-in Functions}.
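+
+As a brief preview (a sketch only; each name is explained in the
+sections just cited), this program uses the built-in variables
+@code{NR} and @code{NF} and the built-in function @code{length} to
+print the record number, the number of fields, and the length of the
+line:
+
+@example
+$ echo "hello, world" | awk '@{ print NR, NF, length($0) @}'
+@print{} 1 2 12
+@end example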
+
+@node When,  , Other Features, Getting Started
+@section When to Use @code{awk}
+
+@cindex when to use @code{awk}
+@cindex applications of @code{awk}
+You might wonder how @code{awk} might be useful for you.  Using
+utility programs, advanced patterns, field separators, arithmetic
+statements, and other selection criteria, you can produce much more
+complex output.  The @code{awk} language is very useful for producing
+reports from large amounts of raw data, such as summarizing information
+from the output of other utility programs like @code{ls}.  
+(@xref{More Complex, ,A More Complex Example}.)
+
+Programs written with @code{awk} are usually much smaller than they would
+be in other languages.  This makes @code{awk} programs easy to compose and
+use.  Often, @code{awk} programs can be quickly composed at your terminal,
+used once, and thrown away.  Since @code{awk} programs are interpreted, you
+can avoid the (usually lengthy) compilation part of the typical
+edit-compile-test-debug cycle of software development.
+
+Complex programs have been written in @code{awk}, including a complete
+retargetable assembler for eight-bit microprocessors (@pxref{Glossary}, for
+more information) and a microcode assembler for a special purpose Prolog
+computer.  However, @code{awk}'s capabilities are strained by tasks of
+such complexity.
+
+If you find yourself writing @code{awk} scripts of more than, say, a few
+hundred lines, you might consider using a different programming
+language.  Emacs Lisp is a good choice if you need sophisticated string
+or pattern matching capabilities.  The shell is also good at string and
+pattern matching; in addition, it allows powerful use of the system
+utilities.  More conventional languages, such as C, C++, and Lisp, offer
+better facilities for system programming and for managing the complexity
+of large programs.  Programs in these languages may require more lines
+of source code than the equivalent @code{awk} programs, but they are
+easier to maintain and usually run more efficiently.
+
+@node One-liners, Regexp, Getting Started, Top
+@chapter Useful One Line Programs
+
+@cindex one-liners
+Many useful @code{awk} programs are short, just a line or two.  Here is a
+collection of useful, short programs to get you started.  Some of these
+programs contain constructs that haven't been covered yet.  The description
+of the program will give you a good idea of what is going on, but please
+read the rest of the @value{DOCUMENT} to become an @code{awk} expert!
+
+Most of the examples use a data file named @file{data}.  This is just a
+placeholder; if you were to use these programs yourself, you would substitute
+your own file names for @file{data}.
+
+@ifinfo
+Since you are reading this in Info, each line of the example code is
+enclosed in quotes, to represent text that you would type literally.
+The examples themselves represent shell commands that use single quotes
+to keep the shell from interpreting the contents of the program.
+When reading the examples, focus on the text between the open and close
+quotes.
+@end ifinfo
+
+@table @code
+@item awk '@{ if (length($0) > max) max = length($0) @}
+@itemx @ @ @ @ @ END @{ print max @}' data
+This program prints the length of the longest input line.
+
+@item awk 'length($0) > 80' data
+This program prints every line that is longer than 80 characters.  The sole
+rule has a relational expression as its pattern, and has no action (so the
+default action, printing the record, is used).
+
+@item expand@ data@ |@ awk@ '@{ if (x < length()) x = length() @}
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ END @{ print "maximum line length is " x @}'
+This program prints the length of the longest line in @file{data}.  The input
+is processed by the @code{expand} program to change tabs into spaces,
+so the widths compared are actually the right-margin columns.
+
+@item awk 'NF > 0' data
+This program prints every line that has at least one field.  This is an
+easy way to delete blank lines from a file (or rather, to create a new
+file similar to the old file but from which the blank lines have been
+deleted).
+
+@c Karl Berry points out that new users probably don't want to see
+@c multiple ways to do things, just the `best' way.  He's probably
+@c right.  At some point it might be worth adding something about there
+@c often being multiple ways to do things in awk, but for now we'll
+@c just take this one out.
+@ignore
+@item awk '@{ if (NF > 0) print @}' data
+This program also prints every line that has at least one field.  Here we
+allow the rule to match every line, and then decide in the action whether
+to print.
+@end ignore
+
+@item awk@ 'BEGIN@ @{@ for (i = 1; i <= 7; i++)
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ print int(101 * rand()) @}'
+This program prints seven random numbers from zero to 100, inclusive.
+
+@item ls -lg @var{files} | awk '@{ x += $5 @} ; END @{ print "total bytes: " x @}'
+This program prints the total number of bytes used by @var{files}.
+
+@item ls -lg @var{files} | awk '@{ x += $5 @}
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ END @{ print "total K-bytes: " (x + 1023)/1024 @}'
+This program prints the total number of kilobytes used by @var{files}.
+
+@item awk -F: '@{ print $1 @}' /etc/passwd | sort
+This program prints a sorted list of the login names of all users.
+
+@item awk 'END @{ print NR @}' data
+This program counts lines in a file.
+
+@item awk 'NR % 2 == 0' data
+This program prints the even-numbered lines in the data file.
+If you were to use the expression @samp{NR % 2 == 1} instead,
+it would print the odd-numbered lines.
+@end table
+
+@node Regexp, Reading Files, One-liners, Top
+@chapter Regular Expressions
+@cindex pattern, regular expressions
+@cindex regexp
+@cindex regular expression
+@cindex regular expressions as patterns
+
+A @dfn{regular expression}, or @dfn{regexp}, is a way of describing a
+set of strings.
+Because regular expressions are such a fundamental part of @code{awk}
+programming, their format and use deserve a separate chapter.
+
+A regular expression enclosed in slashes (@samp{/})
+is an @code{awk} pattern that matches every input record whose text
+belongs to that set.
+
+The simplest regular expression is a sequence of letters, numbers, or
+both.  Such a regexp matches any string that contains that sequence.
+Thus, the regexp @samp{foo} matches any string containing @samp{foo}.
+Therefore, the pattern @code{/foo/} matches any input record containing
+the three characters @samp{foo}, @emph{anywhere} in the record.  Other
+kinds of regexps let you specify more complicated classes of strings.
+
+@iftex
+Initially, the examples will be simple. As we explain more about how
+regular expressions work, we will present more complicated examples.
+@end iftex
+
+@menu
+* Regexp Usage::                How to Use Regular Expressions.
+* Escape Sequences::            How to write non-printing characters.
+* Regexp Operators::            Regular Expression Operators.
+* GNU Regexp Operators::        Operators specific to GNU software.
+* Case-sensitivity::            How to do case-insensitive matching.
+* Leftmost Longest::            How much text matches.
+* Computed Regexps::            Using Dynamic Regexps.
+@end menu
+
+@node Regexp Usage, Escape Sequences, Regexp, Regexp
+@section How to Use Regular Expressions
+
+A regular expression can be used as a pattern by enclosing it in
+slashes.  Then the regular expression is tested against the
+entire text of each record.  (Normally, it only needs
+to match some part of the text in order to succeed.)  For example, this
+prints the second field of each record that contains the three
+characters @samp{foo} anywhere in it:
+
+@example
+@group
+$ awk '/foo/ @{ print $2 @}' BBS-list
+@print{} 555-1234
+@print{} 555-6699
+@print{} 555-6480
+@print{} 555-2127
+@end group
+@end example
+
+@cindex regexp matching operators
+@cindex string-matching operators
+@cindex operators, string-matching
+@cindex operators, regexp matching
+@cindex regexp match/non-match operators
+@cindex @code{~} operator
+@cindex @code{!~} operator
+Regular expressions can also be used in matching expressions.  These
+expressions allow you to specify the string to match against; it need
+not be the entire current input record.  The two operators, @samp{~}
+and @samp{!~}, perform regular expression comparisons.  Expressions
+using these operators can be used as patterns or in @code{if},
+@code{while}, @code{for}, and @code{do} statements.
+@ifinfo
+@c adding this xref in TeX screws up the formatting too much
+(@xref{Statements, ,Control Statements in Actions}.)
+@end ifinfo
+
+@table @code
+@item @var{exp} ~ /@var{regexp}/
+This is true if the expression @var{exp} (taken as a string)
+is matched by @var{regexp}.  The following example matches, or selects,
+all input records with the upper-case letter @samp{J} somewhere in the
+first field:
+
+@example
+@group
+$ awk '$1 ~ /J/' inventory-shipped
+@print{} Jan  13  25  15 115
+@print{} Jun  31  42  75 492
+@print{} Jul  24  34  67 436
+@print{} Jan  21  36  64 620
+@end group
+@end example
+
+So does this:
+
+@example
+awk '@{ if ($1 ~ /J/) print @}' inventory-shipped
+@end example
+
+@item @var{exp} !~ /@var{regexp}/
+This is true if the expression @var{exp} (taken as a string)
+is @emph{not} matched by @var{regexp}.  The following example matches,
+or selects, all input records whose first field @emph{does not} contain
+the upper-case letter @samp{J}:
+
+@example
+@group
+$ awk '$1 !~ /J/' inventory-shipped
+@print{} Feb  15  32  24 226
+@print{} Mar  15  24  34 228
+@print{} Apr  31  52  63 420
+@print{} May  16  34  29 208
+@dots{}
+@end group
+@end example
+@end table
+
+@cindex regexp constant
+When a regexp is written enclosed in slashes, like @code{/foo/}, we call it
+a @dfn{regexp constant}, much like @code{5.27} is a numeric constant, and
+@code{"foo"} is a string constant.
+
+@node Escape Sequences, Regexp Operators, Regexp Usage, Regexp
+@section Escape Sequences
+
+@cindex escape sequence notation
+Some characters cannot be included literally in string constants
+(@code{"foo"}) or regexp constants (@code{/foo/}).  You represent them
+instead with @dfn{escape sequences}, which are character sequences
+beginning with a backslash (@samp{\}).
+
+One use of an escape sequence is to include a double-quote character in
+a string constant.  Since a plain double-quote would end the string, you
+must use @samp{\"} to represent an actual double-quote character as a
+part of the string.  For example:
+
+@example
+$ awk 'BEGIN @{ print "He said \"hi!\" to her." @}'
+@print{} He said "hi!" to her.
+@end example
+
+The backslash character itself is another character that cannot be
+included normally; you write @samp{\\} to put one backslash in the
+string or regexp.  Thus, the string whose contents are the two characters
+@samp{"} and @samp{\} must be written @code{"\"\\"}.
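+
+A quick check of that last string (a sketch, run from a POSIX-style
+shell, where the single quotes keep the shell from touching the
+backslashes):
+
+@example
+$ awk 'BEGIN @{ print "\"\\" @}'
+@print{} "\
+@end example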
+
+Another use of backslash is to represent unprintable characters
+such as tab or newline.  While there is nothing to stop you from entering most
+unprintable characters directly in a string constant or regexp constant,
+they may look ugly.
+
+Here is a table of all the escape sequences used in @code{awk}, and
+what they represent. Unless noted otherwise, all of these escape
+sequences apply to both string constants and regexp constants.
+
+@iftex
+@page
+@end iftex
+@c @cartouche
+@table @code
+@item \\
+A literal backslash, @samp{\}.
+
+@cindex @code{awk} language, V.4 version
+@item \a
+The ``alert'' character, @kbd{Control-g}, ASCII code 7 (BEL).
+
+@item \b
+Backspace, @kbd{Control-h}, ASCII code 8 (BS).
+
+@item \f
+Formfeed, @kbd{Control-l}, ASCII code 12 (FF).
+
+@item \n
+Newline, @kbd{Control-j}, ASCII code 10 (LF).
+
+@item \r
+Carriage return, @kbd{Control-m}, ASCII code 13 (CR).
+
+@item \t
+Horizontal tab, @kbd{Control-i}, ASCII code 9 (HT).
+
+@cindex @code{awk} language, V.4 version
+@item \v
+Vertical tab, @kbd{Control-k}, ASCII code 11 (VT).
+
+@item \@var{nnn}
+The octal value @var{nnn}, where @var{nnn} are one to three digits
+between @samp{0} and @samp{7}.  For example, the code for the ASCII ESC
+(escape) character is @samp{\033}.
+
+@cindex @code{awk} language, V.4 version
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@item \x@var{hh}@dots{}
+The hexadecimal value @var{hh}, where @var{hh} are hexadecimal
+digits (@samp{0} through @samp{9} and either @samp{A} through @samp{F} or
+@samp{a} through @samp{f}).  Like the same construct in ANSI C, the escape
+sequence continues until the first non-hexadecimal digit is seen.  However,
+using more than two hexadecimal digits produces undefined results. (The
+@samp{\x} escape sequence is not allowed in POSIX @code{awk}.)
+
+@item \/
+A literal slash (necessary for regexp constants only).
+You use this when you wish to write a regexp
+constant that contains a slash. Since the regexp is delimited by
+slashes, you need to escape the slash that is part of the pattern,
+in order to tell @code{awk} to keep processing the rest of the regexp.
+
+@item \"
+A literal double-quote (necessary for string constants only).
+You use this when you wish to write a string
+constant that contains a double-quote. Since the string is delimited by
+double-quotes, you need to escape the quote that is part of the string,
+in order to tell @code{awk} to keep processing the rest of the string.
+@end table
+@c @end cartouche
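+
+Here are two of these sequences in action.  This is only a sketch with
+arbitrary strings; it assumes an ASCII system, where the octal codes
+@samp{\101}, @samp{\102}, and @samp{\103} stand for @samp{A}, @samp{B},
+and @samp{C}:
+
+@example
+$ awk 'BEGIN @{ print "one\ntwo" @}'
+@print{} one
+@print{} two
+$ awk 'BEGIN @{ print "\101\102\103" @}'
+@print{} ABC
+@end example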
+
+In @code{gawk}, there are additional two-character sequences that begin
+with a backslash and have special meaning in regexps.
+@xref{GNU Regexp Operators, ,Additional Regexp Operators Only in @code{gawk}}.
+
+In a string constant,
+what happens if you place a backslash before something that is not one of
+the characters listed above?  POSIX @code{awk} purposely leaves this case
+undefined.  There are two choices.
+
+@itemize @bullet
+@item
+Strip the backslash out.  This is what Unix @code{awk} and @code{gawk} both do.
+For example, @code{"a\qc"} is the same as @code{"aqc"}.
+
+@item
+Leave the backslash alone.  Some other @code{awk} implementations do this.
+In such implementations, @code{"a\qc"} is the same as if you had typed
+@code{"a\\qc"}.
+@end itemize
+
+In a regexp, a backslash before any character that is not in the above table,
+and not listed in
+@ref{GNU Regexp Operators, ,Additional Regexp Operators Only in @code{gawk}},
+means that the next character should be taken literally, even if it would
+normally be a regexp operator. E.g., @code{/a\+b/} matches the three
+characters @samp{a+b}.
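+
+For example, in this sketch (the input lines are made up for the
+illustration) only the first line contains a literal plus sign between
+@samp{a} and @samp{b}, so only it is printed:
+
+@example
+$ printf 'a+b\naab\n' | awk '/a\+b/'
+@print{} a+b
+@end example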
+
+@cindex portability issues
+For complete portability, do not use a backslash before any character not
+listed in the table above.
+
+Another interesting question arises. Suppose you use an octal or hexadecimal
+escape to represent a regexp metacharacter
+(@pxref{Regexp Operators, ,Regular Expression Operators}).
+Does @code{awk} treat the character as a literal character, or as a regexp
+operator?
+
+@cindex dark corner
+It turns out that historically, such characters were taken literally (d.c.).
+The POSIX standard, however, indicates that they should be treated
+as real metacharacters, and this is what @code{gawk} does.
+In compatibility mode (@pxref{Options, ,Command Line Options}), though,
+@code{gawk} treats the characters represented by octal and hexadecimal
+escape sequences literally when used in regexp constants. Thus, in
+compatibility mode, @code{/a\52b/} is equivalent to @code{/a\*b/}.
+
+To summarize:
+
+@enumerate 1
+@item
+The escape sequences in the table above are always processed first,
+for both string constants and regexp constants. This happens very early,
+as soon as @code{awk} reads your program.
+
+@item
+@code{gawk} processes both regexp constants and dynamic regexps
+(@pxref{Computed Regexps, ,Using Dynamic Regexps}),
+for the special operators listed in
+@ref{GNU Regexp Operators, ,Additional Regexp Operators Only in @code{gawk}}.
+
+@item
+A backslash before any other character means to treat that character
+literally.
+@end enumerate
+
+@node Regexp Operators, GNU Regexp Operators, Escape Sequences, Regexp
+@section Regular Expression Operators
+@cindex metacharacters
+@cindex regular expression metacharacters
+@cindex regexp operators
+
+You can combine regular expressions with the following characters,
+called @dfn{regular expression operators}, or @dfn{metacharacters}, to
+increase the power and versatility of regular expressions.
+
+The escape sequences described
+@iftex
+above
+@end iftex
+in @ref{Escape Sequences},
+are valid inside a regexp.  They are introduced by a @samp{\}.  They
+are recognized and converted into the corresponding real characters as
+the very first step in processing regexps.
+
+Here is a table of metacharacters.  All characters that are not escape
+sequences and that are not listed in the table stand for themselves.
+
+@iftex
+@page
+@end iftex
+@table @code
+@item \
+This is used to suppress the special meaning of a character when
+matching.  For example:
+
+@example
+\$
+@end example
+
+@noindent
+matches the character @samp{$}.
+
+@cindex anchors in regexps
+@cindex regexp, anchors
+@item ^
+This matches the beginning of a string.  For example:
+
+@example
+^@@chapter
+@end example
+
+@noindent
+matches the @samp{@@chapter} at the beginning of a string, and can be used
+to identify chapter beginnings in Texinfo source files.
+The @samp{^} is known as an @dfn{anchor}, since it anchors the pattern to
+matching only at the beginning of the string.
+
+It is important to realize that @samp{^} does not match the beginning of
+a line embedded in a string.  In this example the condition is not true:
+
+@example
+if ("line1\nLINE 2" ~ /^L/) @dots{}
+@end example
+
+@item $
+This is similar to @samp{^}, but it matches only at the end of a string.
+For example:
+
+@example
+p$
+@end example
+
+@noindent
+matches a record that ends with a @samp{p}.  The @samp{$} is also an anchor,
+and also does not match the end of a line embedded in a string.  In this
+example the condition is not true:
+
+@example
+if ("line1\nLINE 2" ~ /1$/) @dots{}
+@end example
+
+@item .
+The period, or dot, matches any single character,
+@emph{including} the newline character.  For example:
+
+@example
+.P
+@end example
+
+@noindent
+matches any single character followed by a @samp{P} in a string.  Using
+concatenation we can make a regular expression like @samp{U.A}, which
+matches any three-character sequence that begins with @samp{U} and ends
+with @samp{A}.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+In strict POSIX mode (@pxref{Options, ,Command Line Options}),
+@samp{.} does not match the @sc{nul}
+character, which is a character with all bits equal to zero.
+Otherwise, @sc{nul} is just another character. Other versions of @code{awk}
+may not be able to match the @sc{nul} character.
+
+@ignore
+2e: Add stuff that character list is the POSIX terminology. In other
+    literature known as character set or character class.
+@end ignore
+
+@cindex character list
+@item [@dots{}]
+This is called a @dfn{character list}.  It matches any @emph{one} of the
+characters that are enclosed in the square brackets.  For example:
+
+@example
+[MVX]
+@end example
+
+@noindent
+matches any one of the characters @samp{M}, @samp{V}, or @samp{X} in a
+string.
+
+Ranges of characters are indicated by using a hyphen between the beginning
+and ending characters, and enclosing the whole thing in brackets.  For
+example:
+
+@example
+[0-9]
+@end example
+
+@noindent
+matches any digit.
+Multiple ranges are allowed. E.g., the list @code{@w{[A-Za-z0-9]}} is a
+common way to express the idea of ``all alphanumeric characters.''
+
+To include one of the characters @samp{\}, @samp{]}, @samp{-} or @samp{^} in a
+character list, put a @samp{\} in front of it.  For example:
+
+@example
+[d\]]
+@end example
+
+@noindent
+matches either @samp{d}, or @samp{]}.
+
+@cindex @code{egrep}
+This treatment of @samp{\} in character lists
+is compatible with other @code{awk}
+implementations, and is also mandated by POSIX.
+The regular expressions in @code{awk} are a superset
+of the POSIX specification for Extended Regular Expressions (EREs).
+POSIX EREs are based on the regular expressions accepted by the
+traditional @code{egrep} utility.
+
+@cindex character classes
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@dfn{Character classes} are a new feature introduced in the POSIX standard.
+A character class is a special notation for describing
+lists of characters that have a specific attribute, but where the 
+actual characters themselves can vary from country to country and/or
+from character set to character set.  For example, the notion of what
+is an alphabetic character differs in the USA and in France.
+
+A character class is only valid in a regexp @emph{inside} the
+brackets of a character list.  Character classes consist of @samp{[:},
+a keyword denoting the class, and @samp{:]}.  Here are the character
+classes defined by the POSIX standard.
+
+@table @code
+@item [:alnum:]
+Alphanumeric characters.
+
+@item [:alpha:]
+Alphabetic characters.
+
+@item [:blank:]
+Space and tab characters.
+
+@item [:cntrl:]
+Control characters.
+
+@item [:digit:]
+Numeric characters.
+
+@item [:graph:]
+Characters that are printable and are also visible.
+(A space is printable, but not visible, while an @samp{a} is both.)
+
+@item [:lower:]
+Lower-case alphabetic characters.
+
+@item [:print:]
+Printable characters (characters that are not control characters).
+
+@item [:punct:]
+Punctuation characters (characters that are not letters, digits,
+control characters, or space characters).
+
+@item [:space:]
+Space characters (such as space, tab, and formfeed, to name a few).
+
+@item [:upper:]
+Upper-case alphabetic characters.
+
+@item [:xdigit:]
+Characters that are hexadecimal digits.
+@end table
+
+For example, before the POSIX standard, to match alphanumeric
+characters, you had to write @code{/[A-Za-z0-9]/}.  If your
+character set had other alphabetic characters in it, this would not
+match them.  With the POSIX character classes, you can write
+@code{/[[:alnum:]]/}, and this will match @emph{all} the alphabetic
+and numeric characters in your character set.
+
+@cindex collating elements
+Two additional special sequences can appear in character lists.
+These apply to non-ASCII character sets, which can have single symbols
+(called @dfn{collating elements}) that are represented with more than one
+character, as well as several characters that are equivalent for
+@dfn{collating}, or sorting, purposes.  (E.g., in French, a plain ``e''
+and a grave-accented
+@iftex
+``@`e''
+@end iftex
+@ifinfo
+``e''
+@end ifinfo
+are equivalent.)
+
+@table @asis
+@cindex collating symbols
+@item Collating Symbols
+A @dfn{collating symbol} is a multi-character collating element enclosed in
+@samp{[.} and @samp{.]}.  For example, if @samp{ch} is a collating element,
+then @code{[[.ch.]]} is a regexp that matches this collating element, while
+@code{[ch]} is a regexp that matches either @samp{c} or @samp{h}.
+
+@cindex equivalence classes
+@item Equivalence Classes
+An @dfn{equivalence class} is a list of equivalent characters enclosed in
+@samp{[=} and @samp{=]}.
+@iftex
+Thus, @code{[[=e@`e=]]} is a regexp that matches either @samp{e} or @samp{@`e}.
+@end iftex
+@ifinfo
+Because Info files use plain ASCII characters, it is not possible to present
+a realistic equivalence class example here.
+@end ifinfo
+@end table
+
+These features are very valuable in non-English speaking locales.
+
+@strong{Caution:} The library functions that @code{gawk} uses for regular
+expression matching currently only recognize POSIX character classes;
+they do not recognize collating symbols or equivalence classes.
+@c maybe one day ...
+
+@cindex complemented character list
+@cindex character list, complemented
+@item [^ @dots{}]
+This is a @dfn{complemented character list}.  The first character after
+the @samp{[} @emph{must} be a @samp{^}.  It matches any characters
+@emph{except} those in the square brackets, or newline.  For example:
+
+@example
+[^0-9]
+@end example
+
+@noindent
+matches any character that is not a digit.
+
+@item |
+This is the @dfn{alternation operator}, and it is used to specify
+alternatives.  For example:
+
+@example
+^P|[0-9]
+@end example
+
+@noindent
+matches any string that matches either @samp{^P} or @samp{[0-9]}.  This
+means it matches any string that starts with @samp{P} or contains a digit.
+
+The alternation applies to the largest possible regexps on either side.
+In other words, @samp{|} has the lowest precedence of all the regular
+expression operators.
+
+@item (@dots{})
+Parentheses are used for grouping in regular expressions as in
+arithmetic.  They can be used to concatenate regular expressions
+containing the alternation operator, @samp{|}.  For example,
+@samp{@@(samp|code)\@{[^@}]+\@}} matches both @samp{@@code@{foo@}} and
+@samp{@@samp@{bar@}}. (These are Texinfo formatting control sequences.)
+
+@item *
+This symbol means that the preceding regular expression is to be
+repeated as many times as necessary to find a match.  For example:
+
+@example
+ph*
+@end example
+
+@noindent
+applies the @samp{*} symbol to the preceding @samp{h} and looks for matches
+of one @samp{p} followed by any number of @samp{h}s.  This will also match
+just @samp{p} if no @samp{h}s are present.
+
+The @samp{*} repeats the @emph{smallest} possible preceding expression.
+(Use parentheses if you wish to repeat a larger expression.)  It finds
+as many repetitions as possible.  For example:
+
+@example
+awk '/\(c[ad][ad]*r x\)/ @{ print @}' sample
+@end example
+
+@noindent
+prints every record in @file{sample} containing a string of the form
+@samp{(car x)}, @samp{(cdr x)}, @samp{(cadr x)}, and so on.
+Notice the escaping of the parentheses by preceding them
+with backslashes.
+
+@item +
+This symbol is similar to @samp{*}, but the preceding expression must be
+matched at least once.  This means that:
+
+@example
+wh+y
+@end example
+
+@noindent
+would match @samp{why} and @samp{whhy} but not @samp{wy}, whereas
+@samp{wh*y} would match all three of these strings.  This is a simpler
+way of writing the last @samp{*} example:
+
+@example
+awk '/\(c[ad]+r x\)/ @{ print @}' sample
+@end example
+
+@item ?
+This symbol is similar to @samp{*}, but the preceding expression can be
+matched either once or not at all.  For example:
+
+@example
+fe?d
+@end example
+
+@noindent
+will match @samp{fed} and @samp{fd}, but nothing else.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@cindex interval expressions
+@item @{@var{n}@}
+@itemx @{@var{n},@}
+@itemx @{@var{n},@var{m}@}
+One or two numbers inside braces denote an @dfn{interval expression}.
+If there is one number in the braces, the preceding regexp is repeated
+@var{n} times.
+If there are two numbers separated by a comma, the preceding regexp is
+repeated @var{n} to @var{m} times.
+If there is one number followed by a comma, then the preceding regexp
+is repeated at least @var{n} times.
+
+@table @code
+@item wh@{3@}y
+matches @samp{whhhy} but not @samp{why} or @samp{whhhhy}.
+
+@item wh@{3,5@}y
+matches @samp{whhhy} or @samp{whhhhy} or @samp{whhhhhy}, only.
+
+@item wh@{2,@}y
+matches @samp{whhy} or @samp{whhhy}, and so on.
+@end table
+
+Interval expressions were not traditionally available in @code{awk}.
+They were added as part of the POSIX standard, to make @code{awk}
+and @code{egrep} consistent with each other.
+
+However, since old programs may use @samp{@{} and @samp{@}} in regexp
+constants, by default @code{gawk} does @emph{not} match interval expressions
+in regexps.  If either @samp{--posix} or @samp{--re-interval} is specified
+(@pxref{Options, , Command Line Options}), then interval expressions
+are allowed in regexps.
+@end table
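+
+As a quick check of the interval expressions described above, the
+following sketch (the input text is made up for illustration) enables
+them with @samp{--re-interval}:
+
+@example
+$ echo whhhy | gawk --re-interval '/wh@{3@}y/ @{ print "matched:", $0 @}'
+@print{} matched: whhhy
+@end example
+
+@noindent
+Without @samp{--re-interval} or @samp{--posix}, the braces are treated as
+ordinary characters, as explained above, so the same regexp would instead
+look for the literal text @samp{wh@{3@}y}.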
+
+@cindex precedence, regexp operators
+@cindex regexp operators, precedence of
+In regular expressions, the @samp{*}, @samp{+}, and @samp{?} operators,
+as well as the braces @samp{@{} and @samp{@}},
+have
+the highest precedence, followed by concatenation, and finally by @samp{|}.
+As in arithmetic, parentheses can change how operators are grouped.
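+
+For instance, because repetition binds more tightly than concatenation,
+@samp{ab+} means @samp{a} followed by one or more @samp{b}s, while
+@samp{(ab)+} repeats the whole group.  Here is a small sketch using the
+@code{sub} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation});
+the input string is made up for illustration:
+
+@example
+$ echo abab | awk '@{ sub(/ab+/, "X"); print @}'
+@print{} Xab
+$ echo abab | awk '@{ sub(/(ab)+/, "X"); print @}'
+@print{} X
+@end example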
+
+If @code{gawk} is in compatibility mode
+(@pxref{Options, ,Command Line Options}),
+character classes and interval expressions are not available in
+regular expressions.
+
+The next
+@ifinfo
+node
+@end ifinfo
+@iftex
+section
+@end iftex
+discusses the GNU-specific regexp operators, and provides
+more detail concerning how command line options affect the way @code{gawk}
+interprets the characters in regular expressions.
+
+@node GNU Regexp Operators, Case-sensitivity, Regexp Operators, Regexp
+@section Additional Regexp Operators Only in @code{gawk}
+
+@c This section adapted from the regex-0.12 manual
+
+@cindex regexp operators, GNU specific
+GNU software that deals with regular expressions provides a number of
+additional regexp operators.  These operators are described in this
+section, and are specific to @code{gawk}; they are not available in other
+@code{awk} implementations.
+
+@cindex word, regexp definition of
+Most of the additional operators are for dealing with word matching.
+For our purposes, a @dfn{word} is a sequence of one or more letters, digits,
+or underscores (@samp{_}).
+
+@table @code
+@cindex @code{\w} regexp operator
+@item \w
+This operator matches any word-constituent character, i.e.@: any
+letter, digit, or underscore. Think of it as a short-hand for
+@c @w{@code{[A-Za-z0-9_]}} or
+@w{@code{[[:alnum:]_]}}.
+
+@cindex @code{\W} regexp operator
+@item \W
+This operator matches any character that is not word-constituent.
+Think of it as a short-hand for
+@c @w{@code{[^A-Za-z0-9_]}} or
+@w{@code{[^[:alnum:]_]}}.
+
+@cindex @code{\<} regexp operator
+@item \<
+This operator matches the empty string at the beginning of a word.
+For example, @code{/\<away/} matches @samp{away}, but not
+@samp{stowaway}.
+
+@cindex @code{\>} regexp operator
+@item \>
+This operator matches the empty string at the end of a word.
+For example, @code{/stow\>/} matches @samp{stow}, but not @samp{stowaway}.
+
+@cindex @code{\y} regexp operator
+@cindex word boundaries, matching
+@item \y
+This operator matches the empty string at either the beginning or the
+end of a word (the word boundar@strong{y}).  For example, @samp{\yballs?\y}
+matches either @samp{ball} or @samp{balls} as a separate word.
+
+@cindex @code{\B} regexp operator
+@item \B
+This operator matches the empty string within a word. In other words,
+@samp{\B} matches the empty string that occurs between two
+word-constituent characters. For example,
+@code{/\Brat\B/} matches @samp{crate}, but it does not match @samp{dirty rat}.
+@samp{\B} is essentially the opposite of @samp{\y}.
+@end table
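+
+As a brief illustration of the word-matching operators (the input text is
+made up for this sketch), the following replaces @samp{away} only where it
+begins a word:
+
+@example
+$ echo 'stowaway away' | gawk '@{ sub(/\<away/, "X"); print @}'
+@print{} stowaway X
+@end example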
+
+There are two other operators that work on buffers.  In Emacs, a
+@dfn{buffer} is, naturally, an Emacs buffer.  For other programs, the
+regexp library routines that @code{gawk} uses consider the entire
+string to be matched as the buffer.
+
+For @code{awk}, since @samp{^} and @samp{$} always work in terms
+of the beginning and end of strings, these operators don't add any
+new capabilities.  They are provided for compatibility with other GNU
+software.
+
+@cindex buffer matching operators
+@table @code
+@cindex @code{\`} regexp operator
+@item \`
+This operator matches the empty string at the
+beginning of the buffer.
+
+@cindex @code{\'} regexp operator
+@item \'
+This operator matches the empty string at the
+end of the buffer.
+@end table
+
+In other GNU software, the word boundary operator is @samp{\b}. However,
+that conflicts with the @code{awk} language's definition of @samp{\b}
+as backspace, so @code{gawk} uses a different letter.
+
+An alternative method would have been to require two backslashes in the
+GNU operators, but this was deemed to be too confusing, and the current
+method of using @samp{\y} for the GNU @samp{\b} appears to be the
+lesser of two evils.
+
+@c NOTE!!! Keep this in sync with the same table in the summary appendix!
+@cindex regexp, effect of command line options
+The various command line options
+(@pxref{Options, ,Command Line Options})
+control how @code{gawk} interprets characters in regexps.
+
+@table @asis
+@item No options
+In the default case, @code{gawk} provides all the facilities of
+POSIX regexps and the GNU regexp operators described
+@iftex
+above.
+@end iftex
+@ifinfo
+in @ref{Regexp Operators, ,Regular Expression Operators}.
+@end ifinfo
+However, interval expressions are not supported.
+
+@item @code{--posix}
+Only POSIX regexps are supported; the GNU operators are not special
+(e.g., @samp{\w} matches a literal @samp{w}).  Interval expressions
+are allowed.
+
+@item @code{--traditional}
+Traditional Unix @code{awk} regexps are matched. The GNU operators
+are not special, interval expressions are not available, and neither
+are the POSIX character classes (@code{[[:alnum:]]} and so on).
+Characters described by octal and hexadecimal escape sequences are
+treated literally, even if they represent regexp metacharacters.
+
+@item @code{--re-interval}
+Allow interval expressions in regexps, even if @samp{--traditional}
+has been provided.
+@end table
+
+@node Case-sensitivity, Leftmost Longest, GNU Regexp Operators, Regexp
+@section Case-sensitivity in Matching
+
+@cindex case sensitivity
+@cindex ignoring case
+Case is normally significant in regular expressions, both when matching
+ordinary characters (i.e.@: not metacharacters), and inside character
+lists.  Thus a @samp{w} in a regular expression matches only a lower-case
+@samp{w} and not an upper-case @samp{W}.
+
+The simplest way to do a case-independent match is to use a character
+list: @samp{[Ww]}.  However, this can be cumbersome if you need to use it
+often; and it can make the regular expressions harder to
+read.  There are two alternatives that you might prefer.
+
+One way to do a case-insensitive match at a particular point in the
+program is to convert the data to a single case, using the
+@code{tolower} or @code{toupper} built-in string functions (which we
+haven't discussed yet;
+@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+For example:
+
+@example
+tolower($1) ~ /foo/  @{ @dots{} @}
+@end example
+
+@noindent
+converts the first field to lower-case before matching against it.
+This will work in any POSIX-compliant implementation of @code{awk}.
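+
+A complete command along these lines, with made-up input, might look
+like this:
+
+@example
+$ echo 'FOO bar' | awk 'tolower($1) ~ /foo/ @{ print "matched:", $0 @}'
+@print{} matched: FOO bar
+@end example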
+
+@cindex differences between @code{gawk} and @code{awk}
+@cindex @code{~} operator
+@cindex @code{!~} operator
+@vindex IGNORECASE
+Another method, specific to @code{gawk}, is to set the variable
+@code{IGNORECASE} to a non-zero value (@pxref{Built-in Variables}).
+When @code{IGNORECASE} is not zero, @emph{all} regexp and string
+operations ignore case.  Changing the value of
+@code{IGNORECASE} dynamically controls the case sensitivity of your
+program as it runs.  Case is significant by default because
+@code{IGNORECASE} (like most variables) is initialized to zero.
+
+@example
+x = "aB"
+if (x ~ /ab/) @dots{}   # this test will fail
+
+IGNORECASE = 1
+if (x ~ /ab/) @dots{}   # now it will succeed
+@end example
+
+In general, you cannot use @code{IGNORECASE} to make certain rules
+case-insensitive and other rules case-sensitive, because there is no way
+to set @code{IGNORECASE} just for the pattern of a particular rule.
+@ignore
+This isn't quite true. Consider:
+
+	IGNORECASE=1 && /foObAr/ { .... }
+	IGNORECASE=0 || /foobar/ { .... }
+
+But that's pretty bad style and I don't want to get into it at this
+late date.
+@end ignore
+To do this, you must use character lists or @code{tolower}.  However, one
+thing you can do only with @code{IGNORECASE} is turn case-sensitivity on
+or off dynamically for all the rules at once.
+
+@code{IGNORECASE} can be set on the command line, or in a @code{BEGIN} rule
+(@pxref{Other Arguments, ,Other Command Line Arguments}; also
+@pxref{Using BEGIN/END, ,Startup and Cleanup Actions}).
+Setting @code{IGNORECASE} from the command line is a way to make
+a program case-insensitive without having to edit it.
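+
+One way to sketch this is with the @samp{-v} option
+(@pxref{Options, ,Command Line Options}); the input here is made up
+for illustration:
+
+@example
+$ echo aB | gawk -v IGNORECASE=1 '/ab/ @{ print "matched:", $0 @}'
+@print{} matched: aB
+@end example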
+
+Prior to version 3.0 of @code{gawk}, the value of @code{IGNORECASE}
+only affected regexp operations. It did not affect string comparison
+with @samp{==}, @samp{!=}, and so on.
+Beginning with version 3.0, both regexp and string comparison
+operations are affected by @code{IGNORECASE}.
+
+@cindex ISO 8859-1
+@cindex ISO Latin-1
+Beginning with version 3.0 of @code{gawk}, the equivalences between upper-case
+and lower-case characters are based on the ISO-8859-1 (ISO Latin-1)
+character set. This character set is a superset of the traditional 128
+ASCII characters, and it also provides a number of characters suitable
+for use with European languages.
+@ignore
+A pure ASCII character set can be used instead if @code{gawk} is compiled
+with @samp{-DUSE_PURE_ASCII}.
+@end ignore
+
+The value of @code{IGNORECASE} has no effect if @code{gawk} is in
+compatibility mode (@pxref{Options, ,Command Line Options}).
+Case is always significant in compatibility mode.
+
+@node Leftmost Longest, Computed Regexps, Case-sensitivity, Regexp
+@section How Much Text Matches?
+
+@cindex leftmost longest match
+@cindex matching, leftmost longest
+Consider the following example:
+
+@example
+echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
+@end example
+
+This example uses the @code{sub} function (which we haven't discussed yet,
+@pxref{String Functions, ,Built-in Functions for String Manipulation})
+to make a change to the input record. Here, the regexp @code{/a+/}
+indicates ``one or more @samp{a} characters,'' and the replacement
+text is @samp{<A>}.
+
+The input contains four @samp{a} characters.  What will the output be?
+In other words, how many is ``one or more''---will @code{awk} match two,
+three, or all four @samp{a} characters?
+
+The answer is, @code{awk} (and POSIX) regular expressions always match
+the leftmost, @emph{longest} sequence of input characters that can
+match.  Thus, in this example, all four @samp{a} characters are
+replaced with @samp{<A>}.
+
+@example
+$ echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
+@print{} <A>bcd
+@end example
+
+For simple match/no-match tests, this is not so important. But it is very
+important when doing text matching and substitutions with the @code{match},
+@code{sub}, @code{gsub}, and @code{gensub} functions.
+@ifinfo
+@xref{String Functions, ,Built-in Functions for String Manipulation},
+for more information on these functions.
+@end ifinfo
+Understanding this principle is also important for regexp-based record
+and field splitting (@pxref{Records, ,How Input is Split into Records},
+and also @pxref{Field Separators, ,Specifying How Fields are Separated}).
+
+@node Computed Regexps, , Leftmost Longest, Regexp
+@section Using Dynamic Regexps
+
+@cindex computed regular expressions
+@cindex regular expressions, computed
+@cindex dynamic regular expressions
+@cindex regexp, dynamic
+@cindex @code{~} operator
+@cindex @code{!~} operator
+The right hand side of a @samp{~} or @samp{!~} operator need not be a
+regexp constant (i.e.@: a string of characters between slashes).  It may
+be any expression.  The expression is evaluated, and converted if
+necessary to a string; the contents of the string are used as the
+regexp.  A regexp that is computed in this way is called a @dfn{dynamic
+regexp}.  For example:
+
+@example
+BEGIN @{ identifier_regexp = "[A-Za-z_][A-Za-z_0-9]*" @}
+$0 ~ identifier_regexp    @{ print @}
+@end example
+
+@noindent
+sets @code{identifier_regexp} to a regexp that describes @code{awk}
+variable names, and tests if the input record matches this regexp.
+
+@strong{Caution:} When using the @samp{~} and @samp{!~}
+operators, there is a difference between a regexp constant
+enclosed in slashes, and a string constant enclosed in double quotes.
+If you are going to use a string constant, you have to understand that
+the string is in essence scanned @emph{twice}; the first time when
+@code{awk} reads your program, and the second time when it goes to
+match the string on the left-hand side of the operator with the pattern
+on the right.  This is true of any string valued expression (such as
+@code{identifier_regexp} above), not just string constants.
+
+@cindex regexp constants, difference between slashes and quotes
+What difference does it make if the string is
+scanned twice? The answer has to do with escape sequences, and particularly
+with backslashes.  To get a backslash into a regular expression inside a
+string, you have to type two backslashes.
+
+For example, @code{/\*/} is a regexp constant for a literal @samp{*}.
+Only one backslash is needed.  To do the same thing with a string,
+you would have to type @code{"\\*"}.  The first backslash escapes the
+second one, so that the string actually contains the
+two characters @samp{\} and @samp{*}.
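+
+The following sketch, with made-up input, shows the two forms side by
+side; both rules match the literal @samp{*}:
+
+@example
+$ echo 'a*b' | awk '$0 ~ /\*/  @{ print "regexp constant matched" @}
+>                  $0 ~ "\\*" @{ print "string constant matched" @}'
+@print{} regexp constant matched
+@print{} string constant matched
+@end example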
+
+@cindex common mistakes
+@cindex mistakes, common
+@cindex errors, common
+Given that you can use both regexp and string constants to describe
+regular expressions, which should you use?  The answer is ``regexp
+constants,'' for several reasons.
+
+@enumerate 1
+@item
+String constants are more complicated to write, and
+more difficult to read. Using regexp constants makes your programs
+less error-prone.  Not understanding the difference between the two
+kinds of constants is a common source of errors.
+
+@item
+It is also more efficient to use regexp constants: @code{awk} can note
+that you have supplied a regexp and store it internally in a form that
+makes pattern matching more efficient.  When using a string constant,
+@code{awk} must first convert the string into this internal form, and
+then perform the pattern matching.
+
+@item
+Using regexp constants is better style; it shows clearly that you
+intend a regexp match.
+@end enumerate
+
+@node Reading Files, Printing, Regexp, Top
+@chapter Reading Input Files
+
+@cindex reading files
+@cindex input
+@cindex standard input
+@vindex FILENAME
+In the typical @code{awk} program, all input is read either from the
+standard input (by default the keyboard, but often a pipe from another
+command) or from files whose names you specify on the @code{awk} command
+line.  If you specify input files, @code{awk} reads them in order, reading
+all the data from one before going on to the next.  The name of the current
+input file can be found in the built-in variable @code{FILENAME}
+(@pxref{Built-in Variables}).
+
+The input is read in units called @dfn{records}, and processed by the
+rules of your program one record at a time.
+By default, each record is one line.  Each
+record is automatically split into chunks called @dfn{fields}.
+This makes it more convenient for programs to work on the parts of a record.
+
+On rare occasions you will need to use the @code{getline} command.
+The @code{getline} command is valuable, both because it
+can do explicit input from any number of files, and because the files
+used with it do not have to be named on the @code{awk} command line
+(@pxref{Getline, ,Explicit Input with @code{getline}}).
+
+@menu
+* Records::                     Controlling how data is split into records.
+* Fields::                      An introduction to fields.
+* Non-Constant Fields::         Non-constant Field Numbers.
+* Changing Fields::             Changing the Contents of a Field.
+* Field Separators::            The field separator and how to change it.
+* Constant Size::               Reading constant width data.
+* Multiple Line::               Reading multi-line records.
+* Getline::                     Reading files under explicit program control
+                                using the @code{getline} function.
+@end menu
+
+@node Records, Fields, Reading Files, Reading Files
+@section How Input is Split into Records
+
+@cindex record separator, @code{RS}
+@cindex changing the record separator
+@cindex record, definition of
+@vindex RS
+The @code{awk} utility divides the input for your @code{awk}
+program into records and fields.
+Records are separated by a character called the @dfn{record separator}.
+By default, the record separator is the newline character.
+This is why records are, by default, single lines.
+You can use a different character for the record separator by
+assigning the character to the built-in variable @code{RS}.
+
+You can change the value of @code{RS} in the @code{awk} program,
+like any other variable, with the
+assignment operator, @samp{=} (@pxref{Assignment Ops, ,Assignment Expressions}).
+The new record-separator character should be enclosed in quotation marks,
+which indicate
+a string constant.  Often the right time to do this is at the beginning
+of execution, before any input has been processed, so that the very
+first record will be read with the proper separator.  To do this, use
+the special @code{BEGIN} pattern
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}).  For
+example:
+
+@example
+awk 'BEGIN @{ RS = "/" @} ; @{ print $0 @}' BBS-list
+@end example
+
+@noindent
+changes the value of @code{RS} to @code{"/"}, before reading any input.
+This is a string whose first character is a slash; as a result, records
+are separated by slashes.  Then the input file is read, and the second
+rule in the @code{awk} program (the action with no pattern) prints each
+record.  Since each @code{print} statement adds a newline at the end of
+its output, the effect of this @code{awk} program is to copy the input
+with each slash changed to a newline.  Here are the results of running
+the program on @file{BBS-list}:
+
+@example
+@group
+$ awk 'BEGIN @{ RS = "/" @} ; @{ print $0 @}' BBS-list
+@print{} aardvark     555-5553     1200
+@print{} 300          B
+@print{} alpo-net     555-3412     2400
+@print{} 1200
+@print{} 300     A
+@print{} barfly       555-7685     1200
+@print{} 300          A
+@print{} bites        555-1675     2400
+@print{} 1200
+@print{} 300     A
+@print{} camelot      555-0542     300               C
+@print{} core         555-2912     1200
+@print{} 300          C
+@print{} fooey        555-1234     2400
+@print{} 1200
+@print{} 300     B
+@print{} foot         555-6699     1200
+@print{} 300          B
+@print{} macfoo       555-6480     1200
+@print{} 300          A
+@print{} sdace        555-3430     2400
+@print{} 1200
+@print{} 300     A
+@print{} sabafoo      555-2127     1200
+@print{} 300          C
+@print{}
+@end group
+@end example
+
+@noindent
+Note that the entry for the @samp{camelot} BBS is not split.
+In the original data file
+(@pxref{Sample Data Files,  , Data Files for the Examples}),
+the line looks like this:
+
+@example
+camelot      555-0542     300               C
+@end example
+
+@noindent
+It only has one baud rate; there are no slashes in the record.
+
+Another way to change the record separator is on the command line,
+using the variable-assignment feature
+(@pxref{Other Arguments, ,Other Command Line Arguments}).
+
+@example
+awk '@{ print $0 @}' RS="/" BBS-list
+@end example
+
+@noindent
+This sets @code{RS} to @samp{/} before processing @file{BBS-list}.
+
+Using an unusual character such as @samp{/} for the record separator
+produces correct behavior in the vast majority of cases.  However,
+the following (extreme) pipeline prints a surprising @samp{1}.  There
+is one field, consisting of a newline.  The value of the built-in
+variable @code{NF} is the number of fields in the current record.
+
+@example
+$ echo | awk 'BEGIN @{ RS = "a" @} ; @{ print NF @}'
+@print{} 1
+@end example
+
+@cindex dark corner
+@noindent
+Reaching the end of an input file terminates the current input record,
+even if the last character in the file is not the character in @code{RS}
+(d.c.).
+
+@cindex empty string
+The empty string, @code{""} (a string of no characters), has a special meaning
+as the value of @code{RS}: it means that records are separated
+by one or more blank lines, and nothing else.
+@xref{Multiple Line, ,Multiple-Line Records}, for more details.
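+
+As a small sketch of this behavior (the input is made up, and the shell's
+@code{printf} command is used only to supply it), the two groups of lines
+below are read as two records:
+
+@example
+$ printf 'a\nb\n\nc\n' | awk 'BEGIN @{ RS = "" @} ; @{ print NR, NF @}'
+@print{} 1 2
+@print{} 2 1
+@end example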
+
+If you change the value of @code{RS} in the middle of an @code{awk} run,
+the new value is used to delimit subsequent records, but the record
+currently being processed, and records already processed, are not
+affected.
+
+@vindex RT
+@cindex record terminator, @code{RT}
+@cindex terminator, record
+@cindex differences between @code{gawk} and @code{awk}
+After the end of the record has been determined, @code{gawk}
+sets the variable @code{RT} to the text in the input that matched
+@code{RS}.
+
+@cindex regular expressions as record separators
+The value of @code{RS} is in fact not limited to a one-character
+string.  It can be any regular expression
+(@pxref{Regexp, ,Regular Expressions}).
+In general, each record
+ends at the next string that matches the regular expression; the next
+record starts at the end of the matching string.  This general rule is
+actually at work in the usual case, where @code{RS} contains just a
+newline: a record ends at the beginning of the next matching string (the
+next newline in the input) and the following record starts just after
+the end of this string (at the first character of the following line).
+The newline, since it matches @code{RS}, is not part of either record.
+
+When @code{RS} is a single character, @code{RT} will
+contain the same single character. However, when @code{RS} is a
+regular expression, then @code{RT} becomes more useful; it contains
+the actual input text that matched the regular expression.
+
+The following example illustrates both of these features.
+It sets @code{RS} equal to a regular expression that
+matches either a newline, or a series of one or more upper-case letters
+with optional leading and/or trailing white space
+(@pxref{Regexp, , Regular Expressions}).
+
+@example
+$ echo record 1 AAAA record 2 BBBB record 3 |
+> gawk 'BEGIN @{ RS = "\n|( *[[:upper:]]+ *)" @}
+>             @{ print "Record =", $0, "and RT =", RT @}'
+@print{} Record = record 1 and RT =  AAAA 
+@print{} Record = record 2 and RT =  BBBB 
+@print{} Record = record 3 and RT = 
+@print{}
+@end example
+
+@noindent
+The output ends with an extra blank line.  This is because the final
+value of @code{RT} is a newline, and the @code{print} statement then
+supplies its own terminating newline.
+
+@xref{Simple Sed, ,A Simple Stream Editor}, for a more useful example
+of @code{RS} as a regexp and @code{RT}.
+
+@cindex differences between @code{gawk} and @code{awk}
+The use of @code{RS} as a regular expression and the @code{RT}
+variable are @code{gawk} extensions; they are not available in
+compatibility mode
+(@pxref{Options, ,Command Line Options}).
+In compatibility mode, only the first character of the value of
+@code{RS} is used to determine the end of the record.
+
+@cindex number of records, @code{NR}, @code{FNR}
+@vindex NR
+@vindex FNR
+The @code{awk} utility keeps track of the number of records that have
+been read so far from the current input file.  This value is stored in a
+built-in variable called @code{FNR}.  It is reset to zero when a new
+file is started.  Another built-in variable, @code{NR}, is the total
+number of input records read so far from all data files.  It starts at zero
+but is never automatically reset to zero.
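+
+The difference between the two counters is easiest to see with more than
+one input file.  The following is a minimal sketch; the temporary
+directory and the file names @file{one} and @file{two} are made up for
+the illustration and are not part of the sample data used elsewhere.

```shell
# Create two small throwaway input files (names are illustrative only).
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one"
printf 'c\n'    > "$tmpdir/two"

# Print FNR, NR, and the record itself for every line of both files.
awk '{ print FNR, NR, $0 }' "$tmpdir/one" "$tmpdir/two"

rm -r "$tmpdir"
```

+The first column (@code{FNR}) should start over at one for the second
+file, while the second column (@code{NR}) keeps counting.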
+
+@node Fields, Non-Constant Fields, Records, Reading Files
+@section Examining Fields
+
+@cindex examining fields
+@cindex fields
+@cindex accessing fields
+When @code{awk} reads an input record, the record is
+automatically separated or @dfn{parsed} by the interpreter into chunks
+called @dfn{fields}.  By default, fields are separated by whitespace,
+like words in a line.
+Whitespace in @code{awk} means any string of one or more spaces, tabs,
+or newlines; other characters such as formfeed, vertical tab, and so on, that are
+considered whitespace by other languages are @emph{not} considered
+whitespace by @code{awk}.
+
+The purpose of fields is to make it more convenient for you to refer to
+these pieces of the record.  You don't have to use them---you can
+operate on the whole record if you wish---but fields are what make
+simple @code{awk} programs so powerful.
+
+@cindex @code{$} (field operator)
+@cindex field operator @code{$}
+To refer to a field in an @code{awk} program, you use a dollar-sign,
+@samp{$}, followed by the number of the field you want.  Thus, @code{$1}
+refers to the first field, @code{$2} to the second, and so on.  For
+example, suppose the following is a line of input:
+
+@example
+This seems like a pretty nice example.
+@end example
+
+@noindent
+Here the first field, or @code{$1}, is @samp{This}; the second field, or
+@code{$2}, is @samp{seems}; and so on.  Note that the last field,
+@code{$7}, is @samp{example.}.  Because there is no space between the
+@samp{e} and the @samp{.}, the period is considered part of the seventh
+field.
+
+@vindex NF
+@cindex number of fields, @code{NF}
+@code{NF} is a built-in variable whose value
+is the number of fields in the current record.
+@code{awk} updates the value of @code{NF} automatically, each time
+a record is read.
+
+No matter how many fields there are, the last field in a record can be
+represented by @code{$NF}.  So, in the example above, @code{$NF} would
+be the same as @code{$7}, which is @samp{example.}.  Why this works is
+explained below (@pxref{Non-Constant Fields, ,Non-constant Field Numbers}).
+If you try to reference a field beyond the last one, such as @code{$8}
+when the record has only seven fields, you get the empty string.
+@c the empty string acts like 0 in some contexts, but I don't want to
+@c get into that here....
+
+@code{$0}, which looks like a reference to the ``zeroth'' field, is
+a special case: it represents the whole input record.  @code{$0} is
+used when you are not interested in fields.
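+
+As a quick check of the points above, the following sketch runs the
+sample sentence through @code{awk} and prints @code{NF}, @code{$NF},
+and the out-of-range field @code{$8}.  The square brackets are there
+only to make the empty string visible.

```shell
echo 'This seems like a pretty nice example.' |
awk '{
    print NF            # number of fields in this record
    print $NF           # the last field
    print "[" $8 "]"    # a field beyond the last one is empty
    print NF            # the reference above did not change NF
}'
```

+This should print @samp{7}, @samp{example.}, @samp{[]}, and @samp{7}
+again, one per line.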
+
+Here are some more examples:
+
+@example
+@group
+$ awk '$1 ~ /foo/ @{ print $0 @}' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sabafoo      555-2127     1200/300          C
+@end group
+@end example
+
+@noindent
+This example prints each record in the file @file{BBS-list} whose first
+field contains the string @samp{foo}.  The operator @samp{~} is called a
+@dfn{matching operator}
+(@pxref{Regexp Usage, , How to Use Regular Expressions});
+it tests whether a string (here, the field @code{$1}) matches a given regular
+expression.
+
+By contrast, the following example
+looks for @samp{foo} in @emph{the entire record} and prints the first
+field and the last field for each input record containing a
+match.
+
+@example
+@group
+$ awk '/foo/ @{ print $1, $NF @}' BBS-list
+@print{} fooey B
+@print{} foot B
+@print{} macfoo A
+@print{} sabafoo C
+@end group
+@end example
+
+@node Non-Constant Fields, Changing Fields, Fields, Reading Files
+@section Non-constant Field Numbers
+
+The number of a field does not need to be a constant.  Any expression in
+the @code{awk} language can be used after a @samp{$} to refer to a
+field.  The value of the expression specifies the field number.  If the
+value is a string, rather than a number, it is converted to a number.
+Consider this example:
+
+@example
+awk '@{ print $NR @}'
+@end example
+
+@noindent
+Recall that @code{NR} is the number of records read so far: one in the
+first record, two in the second, etc.  So this example prints the first
+field of the first record, the second field of the second record, and so
+on.  For the twentieth record, field number 20 is printed; most likely,
+the record has fewer than 20 fields, so this prints a blank line.
+
+Here is another example of using expressions as field numbers:
+
+@example
+awk '@{ print $(2*2) @}' BBS-list
+@end example
+
+@code{awk} must evaluate the expression @samp{(2*2)} and use
+its value as the number of the field to print.  The @samp{*} sign
+represents multiplication, so the expression @samp{2*2} evaluates to four.
+The parentheses are used so that the multiplication is done before the
+@samp{$} operation; they are necessary whenever there is a binary
+operator in the field-number expression.  This example, then, prints the
+hours of operation (the fourth field) for every line of the file
+@file{BBS-list}.  (All of the @code{awk} operators are listed, in
+order of decreasing precedence, in
+@ref{Precedence,  , Operator Precedence (How Operators Nest)}.)
+
+If the field number you compute is zero, you get the entire record.
+Thus, @code{$(2-2)} has the same value as @code{$0}.  Negative field
+numbers are not allowed; trying to reference one will usually terminate
+your running @code{awk} program.  (The POSIX standard does not define
+what happens when you reference a negative field number.  @code{gawk}
+will notice this and terminate your program.  Other @code{awk}
+implementations may behave differently.)
+
+As mentioned in @ref{Fields, ,Examining Fields},
+the number of fields in the current record is stored in the built-in
+variable @code{NF} (also @pxref{Built-in Variables}).  The expression
+@code{$NF} is not a special feature: it is the direct consequence of
+evaluating @code{NF} and using its value as a field number.
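+
+The need for parentheses is easy to overlook when @code{NF} is involved.
+The following sketch, using made-up numeric input, contrasts
+@code{$(NF-1)} with @code{$NF-1}:

```shell
echo '10 20 30' |
awk '{
    print $(NF-1)   # field number NF-1: the next-to-last field
    print $NF-1     # parsed as ($NF) - 1: the last field, minus one
}'
```

+The first statement should print @samp{20} and the second @samp{29},
+because @samp{$} binds more tightly than subtraction.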
+
+@node Changing Fields, Field Separators, Non-Constant Fields, Reading Files
+@section Changing the Contents of a Field
+
+@cindex field, changing contents of
+@cindex changing contents of a field
+@cindex assignment to fields
+You can change the contents of a field as seen by @code{awk} within an
+@code{awk} program; this changes what @code{awk} perceives as the
+current input record.  (The actual input is untouched; @code{awk} @emph{never}
+modifies the input file.)
+
+Consider this example and its output:
+
+@example
+@group
+$ awk '@{ $3 = $2 - 10; print $2, $3 @}' inventory-shipped
+@print{} 13 3
+@print{} 15 5
+@print{} 15 5
+@dots{}
+@end group
+@end example
+
+@noindent
+The @samp{-} sign represents subtraction, so this program reassigns
+field three, @code{$3}, to be the value of field two minus ten,
+@samp{$2 - 10}.  (@xref{Arithmetic Ops, ,Arithmetic Operators}.)
+Then field two, and the new value for field three, are printed.  
+
+In order for this to work, the text in field @code{$2} must make sense
+as a number; the string of characters must be converted to a number in
+order for the computer to do arithmetic on it.  The number resulting
+from the subtraction is converted back to a string of characters which
+then becomes field three.
+@xref{Conversion, ,Conversion of Strings and Numbers}.
+
+When you change the value of a field (as perceived by @code{awk}), the
+text of the input record is recalculated to contain the new field where
+the old one was.  Therefore, @code{$0} changes to reflect the altered
+field.  Thus, the following program
+prints a copy of the input file, with 10 subtracted from the second
+field of each line:
+
+@example
+@group
+$ awk '@{ $2 = $2 - 10; print $0 @}' inventory-shipped
+@print{} Jan 3 25 15 115
+@print{} Feb 5 32 24 226
+@print{} Mar 5 24 34 228
+@dots{}
+@end group
+@end example
+
+You can also assign contents to fields that are out of range.  For
+example:
+
+@example
+$ awk '@{ $6 = ($5 + $4 + $3 + $2)
+>        print $6 @}' inventory-shipped
+@print{} 168
+@print{} 297
+@print{} 301
+@dots{}
+@end example
+
+@noindent
+We've just created @code{$6}, whose value is the sum of fields
+@code{$2}, @code{$3}, @code{$4}, and @code{$5}.  The @samp{+} sign
+represents addition.  For the file @file{inventory-shipped}, @code{$6}
+represents the total number of parcels shipped for a particular month.
+
+Creating a new field changes @code{awk}'s internal copy of the current
+input record---the value of @code{$0}.  Thus, if you do @samp{print $0}
+after adding a field, the record printed includes the new field, with
+the appropriate number of field separators between it and the previously
+existing fields.
+
+This recomputation both affects and is affected by
+@code{NF} (the number of fields; @pxref{Fields, ,Examining Fields}).
+It is also affected by a feature that has not been discussed yet,
+the @dfn{output field separator}, @code{OFS},
+which is used to separate the fields (@pxref{Output Separators}).
+For example, the value of @code{NF} is set to the number of the highest
+field you create.
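+
+The role of @code{OFS} in the recomputation can be seen with a small
+sketch.  Assigning a field to itself changes nothing in the field, but
+it still causes @code{$0} to be rebuilt:

```shell
# Setting OFS alone leaves $0 untouched; assigning to any field rebuilds it.
echo 'a b c' | awk '{ OFS = "-"; print $0; $1 = $1; print $0 }'
```

+The first line of output should be @samp{a b c} and the second
+@samp{a-b-c}.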
+
+Note, however, that merely @emph{referencing} an out-of-range field
+does @emph{not} change the value of either @code{$0} or @code{NF}.
+Referencing an out-of-range field only produces an empty string.  For
+example:
+
+@example
+if ($(NF+1) != "")
+    print "can't happen"
+else
+    print "everything is normal"
+@end example
+
+@noindent
+should print @samp{everything is normal}, because @code{NF+1} is certain
+to be out of range.  (@xref{If Statement, ,The @code{if}-@code{else} Statement},
+for more information about @code{awk}'s @code{if-else} statements.
+@xref{Typing and Comparison, ,Variable Typing and Comparison Expressions}, for more information
+about the @samp{!=} operator.)
+
+It is important to note that making an assignment to an existing field
+will change the
+value of @code{$0}, but will not change the value of @code{NF},
+even when you assign the empty string to a field.  For example:
+
+@example
+@group
+$ echo a b c d | awk '@{ OFS = ":"; $2 = ""
+>                       print $0; print NF @}'
+@print{} a::c:d
+@print{} 4
+@end group
+@end example
+
+@noindent
+The field is still there; it just has an empty value.  You can tell
+because there are two colons in a row.
+
+This example shows what happens if you create a new field.
+
+@example
+$ echo a b c d | awk '@{ OFS = ":"; $2 = ""; $6 = "new"
+>                       print $0; print NF @}'
+@print{} a::c:d::new
+@print{} 6
+@end example
+
+@noindent
+The intervening field, @code{$5}, is created with an empty value
+(indicated by the second pair of adjacent colons),
+and @code{NF} is updated with the value six.
+
+@node Field Separators, Constant Size, Changing Fields, Reading Files
+@section Specifying How Fields are Separated
+
+This section is rather long; it describes one of the most fundamental
+operations in @code{awk}.
+
+@menu
+* Basic Field Splitting::        How fields are split with single characters
+                                 or simple strings.
+* Regexp Field Splitting::       Using regexps as the field separator.
+* Single Character Fields::      Making each character a separate field.
+* Command Line Field Separator:: Setting @code{FS} from the command line.
+* Field Splitting Summary::      Some final points and a summary table.
+@end menu
+
+@node Basic Field Splitting, Regexp Field Splitting, Field Separators, Field Separators
+@subsection The Basics of Field Separating
+@vindex FS
+@cindex fields, separating
+@cindex field separator, @code{FS}
+
+The @dfn{field separator}, which is either a single character or a regular
+expression, controls the way @code{awk} splits an input record into fields.
+@code{awk} scans the input record for character sequences that
+match the separator; the fields themselves are the text between the matches.
+
+In the examples below, we use the bullet symbol ``@bullet{}'' to represent
+spaces in the output.
+
+If the field separator is @samp{oo}, then the following line:
+
+@example
+moo goo gai pan
+@end example
+
+@noindent
+would be split into three fields: @samp{m}, @samp{@bullet{}g} and
+@samp{@bullet{}gai@bullet{}pan}.
+Note the leading spaces in the values of the second and third fields.
+
+@cindex common mistakes
+@cindex mistakes, common
+@cindex errors, common
+The field separator is represented by the built-in variable @code{FS}.
+Shell programmers take note!  @code{awk} does @emph{not} use the name @code{IFS}
+which is used by the POSIX compatible shells (such as the Bourne shell,
+@code{sh}, or the GNU Bourne-Again Shell, Bash).
+
+You can change the value of @code{FS} in the @code{awk} program with the
+assignment operator, @samp{=} (@pxref{Assignment Ops, ,Assignment Expressions}).
+Often the right time to do this is at the beginning of execution,
+before any input has been processed, so that the very first record
+will be read with the proper separator.  To do this, use the special
+@code{BEGIN} pattern
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}).
+For example, here we set the value of @code{FS} to the string
+@code{","}:
+
+@example
+awk 'BEGIN @{ FS = "," @} ; @{ print $2 @}'
+@end example
+
+@noindent
+Given the input line,
+
+@example
+John Q. Smith, 29 Oak St., Walamazoo, MI 42139
+@end example
+
+@noindent
+this @code{awk} program extracts and prints the string
+@samp{@bullet{}29@bullet{}Oak@bullet{}St.}.
+
+@cindex field separator, choice of
+@cindex regular expressions as field separators
+Sometimes your input data will contain separator characters that don't
+separate fields the way you thought they would.  For instance, the
+person's name in the example we just used might have a title or
+suffix attached, such as @samp{John Q. Smith, LXIX}.  From input
+containing such a name:
+
+@example
+John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
+@end example
+
+@noindent
+@c careful of an overfull hbox here!
+the above program would extract @samp{@bullet{}LXIX}, instead of
+@samp{@bullet{}29@bullet{}Oak@bullet{}St.}.
+If you were expecting the program to print the
+address, you would be surprised.  The moral is: choose your data layout and
+separator characters carefully to prevent such problems.
+
+@iftex
+As you know, normally,
+@end iftex
+@ifinfo
+Normally,
+@end ifinfo
+fields are separated by whitespace sequences
+(spaces and tabs), not by single spaces: two spaces in a row do not
+delimit an empty field.  The default value of the field separator @code{FS}
+is a string containing a single space, @w{@code{" "}}.  If this value were
+interpreted in the usual way, each space character would separate
+fields, so two spaces in a row would make an empty field between them.
+The reason this does not happen is that a single space as the value of
+@code{FS} is a special case: it is taken to specify the default manner
+of delimiting fields.
+
+If @code{FS} is any other single character, such as @code{","}, then
+each occurrence of that character separates two fields.  Two consecutive
+occurrences delimit an empty field.  If the character occurs at the
+beginning or the end of the line, that too delimits an empty field.  The
+space character is the only single character which does not follow these
+rules.
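+
+These rules can be checked with a short sketch.  The input line is
+invented for the purpose; the square brackets make the empty fields
+visible.

```shell
echo ',a,,b,' |
awk 'BEGIN { FS = "," }
     {
         print NF                    # leading, doubled, and trailing commas all count
         for (i = 1; i <= NF; i++)
             print i ": [" $i "]"
     }'
```

+This should report five fields, of which the first, third, and fifth
+are empty.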
+
+@node Regexp Field Splitting, Single Character Fields, Basic Field Splitting, Field Separators
+@subsection Using Regular Expressions to Separate Fields
+
+The previous
+@iftex
+subsection
+@end iftex
+@ifinfo
+node
+@end ifinfo
+discussed the use of single characters or simple strings as the
+value of @code{FS}.
+More generally, the value of @code{FS} may be a string containing any
+regular expression.  In this case, each match in the record for the regular
+expression separates fields.  For example, the assignment:
+
+@example
+FS = ", \t"
+@end example
+
+@noindent
+makes every area of an input line that consists of a comma followed by a
+space and a tab, into a field separator.  (@samp{\t}
+is an @dfn{escape sequence} that stands for a tab;
+@pxref{Escape Sequences},
+for the complete list of similar escape sequences.)
+
+For a less trivial example of a regular expression, suppose you want
+single spaces to separate fields the way single commas were used above.
+You can set @code{FS} to @w{@code{"[@ ]"}} (left bracket, space, right
+bracket).  This regular expression matches a single space and nothing else
+(@pxref{Regexp, ,Regular Expressions}).
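+
+As a small illustration, the following sketch counts the fields in a
+line containing two adjacent spaces, first with the default separator
+and then with the bracketed single space:

```shell
# Default FS: a run of spaces is one separator.
echo 'a  b' | awk '{ print NF }'

# FS = "[ ]": every single space separates fields.
echo 'a  b' | awk 'BEGIN { FS = "[ ]" } { print NF }'
```

+The first command should print @samp{2}; the second should print
+@samp{3}, since the two spaces now enclose an empty field.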
+
+There is an important difference between the two cases of @samp{FS = @w{" "}}
+(a single space) and @samp{FS = @w{"[ \t]+"}} (left bracket, space, backslash,
+``t'', right bracket, plus sign, which is a regular expression
+matching one or more spaces or tabs).  For both values of @code{FS}, fields
+are separated by runs of spaces and/or tabs.  However, when the value of
+@code{FS} is @w{@code{" "}}, @code{awk} will first strip leading and trailing
+whitespace from the record, and then decide where the fields are.  
+
+For example, the following pipeline prints @samp{b}:
+
+@example
+$ echo ' a b c d ' | awk '@{ print $2 @}'
+@print{} b
+@end example
+
+@noindent
+However, this pipeline prints @samp{a} (note the extra spaces around
+each letter):
+
+@example
+$ echo ' a  b  c  d ' | awk 'BEGIN @{ FS = "[ \t]+" @}
+>                                  @{ print $2 @}'
+@print{} a
+@end example
+
+@noindent
+@cindex null string
+@cindex empty string
+In this case, the first field is @dfn{null}, or empty.
+
+The stripping of leading and trailing whitespace also comes into
+play whenever @code{$0} is recomputed.  For instance, study this pipeline:
+
+@example
+$ echo '   a b c d' | awk '@{ print; $2 = $2; print @}'
+@print{}    a b c d
+@print{} a b c d
+@end example
+
+@noindent
+The first @code{print} statement prints the record as it was read,
+with leading whitespace intact.  The assignment to @code{$2} rebuilds
+@code{$0} by concatenating @code{$1} through @code{$NF} together,
+separated by the value of @code{OFS}.  Since the leading whitespace
+was ignored when finding @code{$1}, it is not part of the new @code{$0}.
+Finally, the last @code{print} statement prints the new @code{$0}.
+
+@node Single Character Fields, Command Line Field Separator, Regexp Field Splitting, Field Separators
+@subsection Making Each Character a Separate Field
+
+@cindex differences between @code{gawk} and @code{awk}
+@cindex single character fields
+There are times when you may want to examine each character
+of a record separately.  In @code{gawk}, this is easy to do: you
+simply assign the null string (@code{""}) to @code{FS}.  In this case,
+each individual character in the record will become a separate field.
+Here is an example:
+@c extra verbiage due to page boundaries
+
+@example
+echo a b | gawk 'BEGIN @{ FS = "" @}
+                 @{ 
+                     for (i = 1; i <= NF; i = i + 1)
+                         print "Field", i, "is", $i
+                 @}'
+@end example
+
+@noindent
+The output from this is:
+
+@example
+Field 1 is a
+Field 2 is
+Field 3 is b
+@end example
+
+@cindex dark corner
+Traditionally, the behavior for @code{FS} equal to @code{""} was not defined.
+In this case, Unix @code{awk} would simply treat the entire record
+as only having one field (d.c.).  In compatibility mode
+(@pxref{Options, ,Command Line Options}),
+if @code{FS} is the null string, then @code{gawk} will also
+behave this way.
+
+@node Command Line Field Separator, Field Splitting Summary, Single Character Fields, Field Separators
+@subsection Setting @code{FS} from the Command Line
+@cindex @code{-F} option
+@cindex field separator, on command line
+@cindex command line, setting @code{FS} on
+
+@code{FS} can be set on the command line.  You use the @samp{-F} option to
+do so.  For example:
+
+@example
+awk -F, '@var{program}' @var{input-files}
+@end example
+
+@noindent
+sets @code{FS} to be the @samp{,} character.  Notice that the option uses
+a capital @samp{F}.  Contrast this with @samp{-f}, which specifies a file
+containing an @code{awk} program.  Case is significant in command line options:
+the @samp{-F} and @samp{-f} options have nothing to do with each other.
+You can use both options at the same time to set the @code{FS} variable
+@emph{and} get an @code{awk} program from a file.
+
+The value used for the argument to @samp{-F} is processed in exactly the
+same way as assignments to the built-in variable @code{FS}.  This means that
+if the field separator contains special characters, they must be escaped
+appropriately.  For example, to use a @samp{\} as the field separator, you
+would have to type:
+
+@example
+# same as FS = "\\" 
+awk -F\\\\ '@dots{}' files @dots{}
+@end example
+
+@noindent
+Since @samp{\} is used for quoting in the shell, @code{awk} will see
+@samp{-F\\}.  Then @code{awk} processes the @samp{\\} for escape
+characters (@pxref{Escape Sequences}), finally yielding
+a single @samp{\} to be used for the field separator.
+
+@cindex historical features
+As a special case, in compatibility mode
+(@pxref{Options, ,Command Line Options}), if the
+argument to @samp{-F} is @samp{t}, then @code{FS} is set to the tab
+character.  This is because if you type @samp{-F\t} at the shell,
+without any quotes, the @samp{\} gets deleted, so @code{awk} figures that you
+really want your fields to be separated with tabs, and not @samp{t}s.
+Use @samp{-v FS="t"} on the command line if you really do want to separate
+your fields with @samp{t}s
+(@pxref{Options, ,Command Line Options}).
+
+For example, let's use an @code{awk} program file called @file{baud.awk}
+that contains the pattern @code{/300/}, and the action @samp{print $1}.
+Here is the program:
+
+@example
+/300/   @{ print $1 @}
+@end example
+
+Let's also set @code{FS} to be the @samp{-} character, and run the
+program on the file @file{BBS-list}.  The following command prints a
+list of the names of the bulletin boards that operate at 300 baud and
+the first three digits of their phone numbers:
+
+@c tweaked to make the tex output look better in @smallbook
+@example
+@group
+$ awk -F- -f baud.awk BBS-list
+@print{} aardvark     555
+@print{} alpo
+@print{} barfly       555
+@dots{}
+@end group
+@ignore
+@print{} bites        555
+@print{} camelot      555
+@print{} core         555
+@print{} fooey        555
+@print{} foot         555
+@print{} macfoo       555
+@print{} sdace        555
+@print{} sabafoo      555
+@end ignore
+@end example
+
+@noindent
+Note the second line of output.  In the original file
+(@pxref{Sample Data Files, ,Data Files for the Examples}),
+the second line looked like this:
+
+@example
+alpo-net     555-3412     2400/1200/300     A
+@end example
+
+The @samp{-} in the system's name was used as the field
+separator, instead of the intended @samp{-} in the phone number.
+This demonstrates why you have to be careful in
+choosing your field and record separators.
+
+On many Unix systems, each user has a separate entry in the system password
+file, one line per user.  The information in these lines is separated
+by colons.  The first field is the user's logon name, and the second is
+the user's encrypted password.  A password file entry might look like this:
+
+@example
+arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
+@end example
+
+The following program searches the system password file, and prints
+the entries for users who have no password:
+
+@example
+awk -F: '$2 == ""' /etc/passwd
+@end example
+
+@node Field Splitting Summary,  , Command Line Field Separator, Field Separators
+@subsection Field Splitting Summary
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+According to the POSIX standard, @code{awk} is supposed to behave
+as if each record is split into fields at the time that it is read.
+In particular, this means that you can change the value of @code{FS}
+after a record is read, and the value of the fields (i.e.@: how they were split)
+should reflect the old value of @code{FS}, not the new one.
+
+@cindex dark corner
+@cindex @code{sed} utility
+@cindex stream editor
+However, many implementations of @code{awk} do not work this way.  Instead,
+they defer splitting the fields until a field is actually
+referenced.  The fields will be split
+using the @emph{current} value of @code{FS}! (d.c.)
+This behavior can be difficult
+to diagnose. The following example illustrates the difference
+between the two methods.
+(The @code{sed}@footnote{The @code{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
+
+@example
+sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
+@end example
+
+@noindent
+will usually print
+
+@example
+root
+@end example
+
+@noindent
+on an incorrect implementation of @code{awk}, while @code{gawk}
+will print something like
+
+@example
+root:nSijPlPhZZwgE:0:0:Root:/:
+@end example
+
+The following table summarizes how fields are split, based on the
+value of @code{FS}. (@samp{==} means ``is equal to.'')
+
+@c @cartouche
+@table @code
+@item FS == " "
+Fields are separated by runs of whitespace.  Leading and trailing
+whitespace are ignored.  This is the default.
+
+@item FS == @var{any other single character}
+Fields are separated by each occurrence of the character.  Multiple
+successive occurrences delimit empty fields, as do leading and
+trailing occurrences.
+The character can even be a regexp metacharacter; it does not need
+to be escaped.
+
+@item FS == @var{regexp}
+Fields are separated by occurrences of characters that match @var{regexp}.
+Leading and trailing matches of @var{regexp} delimit empty fields.
+
+@item FS == ""
+Each individual character in the record becomes a separate field.
+@end table
+@c @end cartouche
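+
+The second case in the table says that a single-character separator
+need not be escaped even if it is a regexp metacharacter.  A minimal
+sketch, using @samp{|} as the separator on made-up input:

```shell
# A lone "|" in FS is taken literally, not as regexp alternation.
echo 'a|b|c' | awk 'BEGIN { FS = "|" } { print NF, $2 }'
```

+This should print @samp{3 b}.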
+
+@node Constant Size, Multiple Line, Field Separators, Reading Files
+@section Reading Fixed-width Data
+
+(This section discusses an advanced, experimental feature.  If you are
+a novice @code{awk} user, you may wish to skip it on the first reading.)
+
+@code{gawk} version 2.13 introduced a new facility for dealing with
+fixed-width fields with no distinctive field separator.  Data of this
+nature arises, for example, in the input for old FORTRAN programs where
+numbers are run together; or in the output of programs that did not
+anticipate the use of their output as input for other programs.
+
+An example of the latter is a table where all the columns are lined up by
+the use of a variable number of spaces and @emph{empty fields are just
+spaces}.  Clearly, @code{awk}'s normal field splitting based on @code{FS}
+will not work well in this case.  Although a portable @code{awk} program
+can use a series of @code{substr} calls on @code{$0}
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}),
+this is awkward and inefficient for a large number of fields.
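+
+For comparison, here is a sketch of the portable @code{substr}
+approach.  The three-column layout (widths 8, 4, and 3) and the data
+are invented for this illustration; @code{printf} in the shell is used
+only to generate correctly padded input.

```shell
# Generate two fixed-width lines: name (8 columns), count (4), code (3).
# The second line has an empty count column.
printf '%-8s%4s%-3s\n' brent 26 bsh hzang '' csh |
awk '{
    name  = substr($0, 1, 8)     # columns 1-8
    count = substr($0, 9, 4)     # columns 9-12
    code  = substr($0, 13, 3)    # columns 13-15
    sub(/ +$/, "", name)         # strip padding on the right
    sub(/^ +/, "", count)        # strip padding on the left
    print name ":" count ":" code
}'
```

+This should print @samp{brent:26:bsh} and @samp{hzang::csh}.  Every
+additional column needs another @code{substr} call with hand-computed
+offsets, which is the awkwardness referred to above.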
+
+The splitting of an input record into fixed-width fields is specified by
+assigning a string containing space-separated numbers to the built-in
+variable @code{FIELDWIDTHS}.  Each number specifies the width of the field
+@emph{including} columns between fields.  If you want to ignore the columns
+between fields, you can specify the width as a separate field that is
+subsequently ignored.
+
+The following data is the output of the Unix @code{w} utility.  It is useful
+to illustrate the use of @code{FIELDWIDTHS}.
+
+@example
+@group
+ 10:06pm  up 21 days, 14:04,  23 users
+User     tty       login@  idle   JCPU   PCPU  what
+hzuo     ttyV0     8:58pm            9      5  vi p24.tex 
+hzang    ttyV3     6:37pm    50                -csh 
+eklye    ttyV5     9:53pm            7      1  em thes.tex 
+dportein ttyV6     8:17pm  1:47                -csh 
+gierd    ttyD3    10:00pm     1                elm 
+dave     ttyD4     9:47pm            4      4  w 
+brent    ttyp0    26Jun91  4:46  26:46   4:41  bash 
+dave     ttyq4    26Jun9115days     46     46  wnewmail
+@end group 
+@end example
+
+The following program takes the above input, converts the idle time to
+number of seconds and prints out the first two fields and the calculated
+idle time.  (This program uses a number of @code{awk} features that
+haven't been introduced yet.)
+
+@example
+@group
+BEGIN  @{ FIELDWIDTHS = "9 6 10 6 7 7 35" @}
+NR > 2 @{
+    idle = $4
+    sub(/^  */, "", idle)   # strip leading spaces
+    if (idle == "")
+        idle = 0
+    if (idle ~ /:/) @{
+        split(idle, t, ":")
+        idle = t[1] * 60 + t[2]
+    @}
+    if (idle ~ /days/)
+        idle *= 24 * 60 * 60
+ 
+    print $1, $2, idle
+@}
+@end group
+@end example
+
+Here is the result of running the program on the data:
+
+@example
+hzuo      ttyV0  0
+hzang     ttyV3  50
+eklye     ttyV5  0
+dportein  ttyV6  107
+gierd     ttyD3  1
+dave      ttyD4  0
+brent     ttyp0  286
+dave      ttyq4  1296000
+@end example
+
+Another (possibly more practical) example of fixed-width input data
+would be the input from a deck of balloting cards.  In some parts of
+the United States, voters mark their choices by punching holes in computer
+cards.  These cards are then processed to count the votes for any particular
+candidate or on any particular issue.  Since a voter may choose not to
+vote on some issue, any column on the card may be empty.  An @code{awk}
+program for processing such data could use the @code{FIELDWIDTHS} feature
+to simplify reading the data.  (Of course, getting @code{gawk} to run on
+a system with card readers is another story!)
+
+@ignore
+Exercise: Write a ballot card reading program
+@end ignore
+
+Assigning a value to @code{FS} causes @code{gawk} to return to using
+@code{FS} for field splitting.  Use @samp{FS = FS} to make this happen,
+without having to know the current value of @code{FS}.
+
+This feature is still experimental, and may evolve over time.
+Note in particular that @code{gawk} does not attempt to verify
+that the value assigned to @code{FIELDWIDTHS} is sensible.
+
+@node Multiple Line, Getline, Constant Size, Reading Files
+@section Multiple-Line Records
+
+@cindex multiple line records
+@cindex input, multiple line records
+@cindex reading files, multiple line records
+@cindex records, multiple line
+In some data bases, a single line cannot conveniently hold all the
+information in one entry.  In such cases, you can use multi-line
+records.
+
+The first step in doing this is to choose your data format: when records
+are not defined as single lines, how do you want to define them?
+What should separate records?
+
+One technique is to use an unusual character or string to separate
+records.  For example, you could use the formfeed character (written
+@samp{\f} in @code{awk}, as in C) to separate them, making each record
+a page of the file.  To do this, just set the variable @code{RS} to
+@code{"\f"} (a string containing the formfeed character).  Any
+other character could equally well be used, as long as it won't be part
+of the data in a record.
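+
+For instance, here is a minimal sketch (the file name @file{report.txt}
+is hypothetical) that treats each formfeed-delimited page as one record:
+
+@example
+@group
+$ awk 'BEGIN @{ RS = "\f" @}
+>      @{ print "page", NR, "has", length($0), "characters" @}' report.txt
+@end group
+@end example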
+
+Another technique is to have blank lines separate records.  By a special
+dispensation, an empty string as the value of @code{RS} indicates that
+records are separated by one or more blank lines.  If you set @code{RS}
+to the empty string, a record always ends at the first blank line
+encountered.  And the next record doesn't start until the first non-blank
+line that follows---no matter how many blank lines appear in a row, they
+are considered one record-separator.
+
+@cindex leftmost longest match
+@cindex matching, leftmost longest
+You can achieve the same effect as @samp{RS = ""} by assigning the
+string @code{"\n\n+"} to @code{RS}. This regexp matches the newline
+at the end of the record, and one or more blank lines after the record.
+In addition, a regular expression always matches the longest possible
+sequence when there is a choice
+(@pxref{Leftmost Longest, ,How Much Text Matches?}).
+So the next record doesn't start until
+the first non-blank line that follows---no matter how many blank lines
+appear in a row, they are considered one record-separator.
+
+@cindex dark corner
+There is an important difference between @samp{RS = ""} and
+@samp{RS = "\n\n+"}. In the first case, leading newlines in the input
+data file are ignored, and if a file ends without extra blank lines
+after the last record, the final newline is removed from the record.
+In the second case, this special processing is not done (d.c.).
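+
+One way to see the difference is to feed both settings the same input.
+This is only a sketch: the sample data is made up and comes from the
+shell's @code{printf} utility, and both commands assume @code{gawk},
+since the second relies on a regexp value for @code{RS}.
+
+@example
+@group
+$ printf '\n\nA\nB\n\n\nC\n' | gawk 'BEGIN @{ RS = "" @}
+>      @{ printf "%d: <%s>\n", NR, $0 @}'
+$ printf '\n\nA\nB\n\n\nC\n' | gawk 'BEGIN @{ RS = "\n\n+" @}
+>      @{ printf "%d: <%s>\n", NR, $0 @}'
+@end group
+@end example
+
+@noindent
+With the second setting, the leading newlines should delimit an empty
+first record, and the last record should keep its final newline.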
+
+Now that the input is separated into records, the second step is to
+separate the fields in the record.  One way to do this is to divide each
+of the lines into fields in the normal manner.  This happens by default
+as the result of a special feature: when @code{RS} is set to the empty
+string, the newline character @emph{always} acts as a field separator.
+This is in addition to whatever field separations result from @code{FS}.
+
+The original motivation for this special exception was probably to provide
+useful behavior in the default case (i.e.@: @code{FS} is equal
+to @w{@code{" "}}).  This feature can be a problem if you really don't
+want the newline character to separate fields, since there is no way to
+prevent it.  However, you can work around this by using the @code{split}
+function to break up the record manually
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+
+Another way to separate fields is to
+put each field on a separate line: to do this, just set the
+variable @code{FS} to the string @code{"\n"}.  (This simple regular
+expression matches a single newline.)
+
+A practical example of a data file organized this way might be a mailing
+list, where each entry is separated by blank lines.  If we have a mailing
+list in a file named @file{addresses}, that looks like this:
+
+@example
+Jane Doe
+123 Main Street
+Anywhere, SE 12345-6789
+
+John Smith
+456 Tree-lined Avenue
+Smallville, MW 98765-4321
+
+@dots{}
+@end example
+
+@noindent
+A simple program to process this file would look like this:
+
+@example
+@group
+# addrs.awk --- simple mailing list program
+
+# Records are separated by blank lines.
+# Each line is one field.
+BEGIN @{ RS = "" ; FS = "\n" @}
+
+@{
+      print "Name is:", $1
+      print "Address is:", $2
+      print "City and State are:", $3
+      print ""
+@}
+@end group
+@end example
+
+Running the program produces the following output:
+
+@example
+@group
+$ awk -f addrs.awk addresses
+@print{} Name is: Jane Doe
+@print{} Address is: 123 Main Street
+@print{} City and State are: Anywhere, SE 12345-6789
+@print{} 
+@end group
+@group
+@print{} Name is: John Smith
+@print{} Address is: 456 Tree-lined Avenue
+@print{} City and State are: Smallville, MW 98765-4321
+@print{} 
+@dots{}
+@end group
+@end example
+
+@xref{Labels Program, ,Printing Mailing Labels}, for a more realistic
+program that deals with address lists.
+
+The following table summarizes how records are split, based on the
+value of @code{RS}. (@samp{==} means ``is equal to.'')
+
+@c @cartouche
+@table @code
+@item RS == "\n"
+Records are separated by the newline character (@samp{\n}).  In effect,
+every line in the data file is a separate record, including blank lines.
+This is the default.
+
+@item RS == @var{any single character}
+Records are separated by each occurrence of the character.  Multiple
+successive occurrences delimit empty records.
+
+@item RS == ""
+Records are separated by runs of blank lines.  The newline character
+always serves as a field separator, in addition to whatever value
+@code{FS} may have. Leading and trailing newlines in a file are ignored.
+
+@item RS == @var{regexp}
+Records are separated by occurrences of characters that match @var{regexp}.
+Leading and trailing matches of @var{regexp} delimit empty records.
+@end table
+@c @end cartouche
+
+@vindex RT
+In all cases, @code{gawk} sets @code{RT} to the input text that matched the
+value specified by @code{RS}.
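+
+As a small illustration (a sketch; the input string is made up), the
+following command prints each record along with the text that
+terminated it:
+
+@example
+@group
+$ echo 'a1b22c' | gawk 'BEGIN @{ RS = "[0-9]+" @}
+>      @{ printf "record <%s> RT <%s>\n", $0, RT @}'
+@end group
+@end example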
+
+@node Getline, , Multiple Line, Reading Files
+@section Explicit Input with @code{getline}
+
+@findex getline
+@cindex input, explicit
+@cindex explicit input
+@cindex input, @code{getline} command
+@cindex reading files, @code{getline} command
+So far we have been getting our input data from @code{awk}'s main
+input stream---either the standard input (usually your terminal, sometimes
+the output from another program) or from the
+files specified on the command line.  The @code{awk} language has a
+special built-in command called @code{getline} that
+can be used to read input under your explicit control.
+
+@menu
+* Getline Intro::            Introduction to the @code{getline} function.
+* Plain Getline::            Using @code{getline} with no arguments.
+* Getline/Variable::         Using @code{getline} into a variable.
+* Getline/File::             Using @code{getline} from a file.
+* Getline/Variable/File::       Using @code{getline} into a variable from a
+                                file.
+* Getline/Pipe::             Using @code{getline} from a pipe.
+* Getline/Variable/Pipe::       Using @code{getline} into a variable from a
+                                pipe.
+* Getline Summary::          Summary of @code{getline} variants.
+@end menu
+
+@node Getline Intro, Plain Getline, Getline, Getline
+@subsection Introduction to @code{getline}
+
+This command is used in several different ways, and should @emph{not} be
+used by beginners.  It is covered here because this is the chapter on input.
+The examples that follow the explanation of the @code{getline} command
+include material that has not been covered yet.  Therefore, come back
+and study the @code{getline} command @emph{after} you have reviewed the
+rest of this @value{DOCUMENT} and have a good knowledge of how @code{awk} works.
+
+@vindex ERRNO
+@cindex differences between @code{gawk} and @code{awk}
+@cindex @code{getline}, return values
+@code{getline} returns one if it finds a record, and zero if the end of the
+file is encountered.  If there is some error in getting a record, such
+as a file that cannot be opened, then @code{getline} returns @minus{}1.
+In this case, @code{gawk} sets the variable @code{ERRNO} to a string
+describing the error that occurred.
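+
+Here is a minimal sketch of checking all three return values.  The file
+name @file{data.txt} and the variable names are hypothetical, and
+@code{ERRNO} is specific to @code{gawk}.  The redirected form of
+@code{getline} used here is described later in this section.
+
+@example
+@group
+BEGIN @{
+    file = "data.txt"
+    # status is 1 for each record, 0 at end of file, -1 on error
+    while ((status = (getline line < file)) > 0)
+        n++
+    if (status < 0)
+        print "cannot read " file ": " ERRNO > "/dev/stderr"
+    else
+        print n + 0, "records read from", file
+    close(file)
+@}
+@end group
+@end example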
+
+In the following examples, @var{command} stands for a string value that
+represents a shell command.
+
+@node Plain Getline, Getline/Variable, Getline Intro, Getline
+@subsection Using @code{getline} with No Arguments
+
+The @code{getline} command can be used without arguments to read input
+from the current input file.  All it does in this case is read the next
+input record and split it up into fields.  This is useful if you've
+finished processing the current record, but you want to do some special
+processing @emph{right now} on the next record.  Here's an
+example:
+
+@example
+@group
+awk '@{
+     if ((t = index($0, "/*")) != 0) @{
+          # value will be "" if t is 1
+          tmp = substr($0, 1, t - 1)
+          u = index(substr($0, t + 2), "*/")
+          while (u == 0) @{
+               if (getline <= 0) @{
+                    m = "unexpected EOF or error"
+                    m = (m ": " ERRNO)
+                    print m > "/dev/stderr"
+                    exit
+               @}
+               t = -1
+               u = index($0, "*/")
+          @}
+@end group
+@group
+          # substr expression will be "" if */
+          # occurred at end of line
+          $0 = tmp substr($0, t + u + 3)
+     @}
+     print $0
+@}'
+@end group
+@end example
+
+This @code{awk} program deletes all C-style comments, @samp{/* @dots{}
+*/}, from the input.  By replacing the @samp{print $0} with other
+statements, you could perform more complicated processing on the
+decommented input, like searching for matches of a regular
+expression.  This program has a subtle problem---it does not work if one
+comment ends and another begins on the same line.
+
+@ignore
+Exercise,
+write a program that does handle multiple comments on the line.
+@end ignore
+
+This form of the @code{getline} command sets @code{NF} (the number of
+fields; @pxref{Fields, ,Examining Fields}), @code{NR} (the number of
+records read so far; @pxref{Records, ,How Input is Split into Records}),
+@code{FNR} (the number of records read from this input file), and the
+value of @code{$0}.
+
+@cindex dark corner
+@strong{Note:} the new value of @code{$0} is used in testing
+the patterns of any subsequent rules.  The original value
+of @code{$0} that triggered the rule which executed @code{getline}
+is lost (d.c.).
+By contrast, the @code{next} statement reads a new record
+but immediately begins processing it normally, starting with the first
+rule in the program.  @xref{Next Statement, ,The @code{next} Statement}.
+
+@node Getline/Variable, Getline/File, Plain Getline, Getline
+@subsection Using @code{getline} Into a Variable
+
+You can use @samp{getline @var{var}} to read the next record from
+@code{awk}'s input into the variable @var{var}.  No other processing is
+done.
+
+For example, suppose the next line is a comment, or a special string,
+and you want to read it, without triggering
+any rules.  This form of @code{getline} allows you to read that line
+and store it in a variable so that the main
+read-a-line-and-check-each-rule loop of @code{awk} never sees it.
+
+The following example swaps every two lines of input.  For example, given:
+
+@example
+wan
+tew
+free
+phore
+@end example
+
+@noindent
+it outputs:
+
+@example
+tew
+wan
+phore
+free
+@end example
+
+@noindent
+Here's the program:
+
+@example
+@group
+awk '@{
+     if ((getline tmp) > 0) @{
+          print tmp
+          print $0
+     @} else
+          print $0
+@}'
+@end group
+@end example
+
+The @code{getline} command used in this way sets only the variables
+@code{NR} and @code{FNR} (and of course, @var{var}).  The record is not
+split into fields, so the values of the fields (including @code{$0}) and
+the value of @code{NF} do not change.
+
+@node Getline/File, Getline/Variable/File, Getline/Variable, Getline
+@subsection Using @code{getline} from a File
+
+@cindex input redirection
+@cindex redirection of input
+Use @samp{getline < @var{file}} to read
+the next record from the file
+@var{file}.  Here @var{file} is a string-valued expression that
+specifies the file name.  @samp{< @var{file}} is called a @dfn{redirection}
+since it directs input to come from a different place.
+
+For example, the following
+program reads its input record from the file @file{secondary.input} when it
+encounters a first field with a value equal to 10 in the current input
+file.
+
+@example
+@group
+awk '@{
+    if ($1 == 10) @{
+         getline < "secondary.input"
+         print
+    @} else
+         print
+@}'
+@end group
+@end example
+
+Since the main input stream is not used, the values of @code{NR} and
+@code{FNR} are not changed.  But the record read is split into fields in
+the normal manner, so the values of @code{$0} and other fields are
+changed.  So is the value of @code{NF}.
+
+@node Getline/Variable/File, Getline/Pipe, Getline/File, Getline
+@subsection Using @code{getline} Into a Variable from a File
+
+Use @samp{getline @var{var} < @var{file}} to read input from
+the file
+@var{file} and put it in the variable @var{var}.  As above, @var{file}
+is a string-valued expression that specifies the file from which to read.
+
+In this version of @code{getline}, none of the built-in variables are
+changed, and the record is not split into fields.  The only variable
+changed is @var{var}.
+
+For example, the following program copies all the input files to the
+output, except for records that say @w{@samp{@@include @var{filename}}}.
+Such a record is replaced by the contents of the file
+@var{filename}.
+
+@example
+@group
+awk '@{
+     if (NF == 2 && $1 == "@@include") @{
+          while ((getline line < $2) > 0)
+               print line
+          close($2)
+     @} else
+          print
+@}'
+@end group
+@end example
+
+Note here how the name of the extra input file is not built into
+the program; it is taken directly from the data, from the second field on
+the @samp{@@include} line.
+
+The @code{close} function is called to ensure that if two identical
+@samp{@@include} lines appear in the input, the entire specified file is
+included twice.
+@xref{Close Files And Pipes, ,Closing Input and Output Files and Pipes}.
+
+One deficiency of this program is that it does not process nested
+@samp{@@include} statements
+(@samp{@@include} statements in included files)
+the way a true macro preprocessor would.
+@xref{Igawk Program, ,An Easy Way to Use Library Functions}, for a program
+that does handle nested @samp{@@include} statements.
+
+@node Getline/Pipe, Getline/Variable/Pipe, Getline/Variable/File, Getline
+@subsection Using @code{getline} from a Pipe
+
+@cindex input pipeline
+@cindex pipeline, input
+You can pipe the output of a command into @code{getline}, using
+@samp{@var{command} | getline}.  In
+this case, the string @var{command} is run as a shell command and its output
+is piped into @code{awk} to be used as input.  This form of @code{getline}
+reads one record at a time from the pipe.
+
+For example, the following program copies its input to its output, except for
+lines that begin with @samp{@@execute}, which are replaced by the output
+produced by running the rest of the line as a shell command:
+
+@example
+@group
+awk '@{
+     if ($1 == "@@execute") @{
+          tmp = substr($0, 10)
+          while ((tmp | getline) > 0)
+               print
+          close(tmp)
+     @} else
+          print
+@}'
+@end group
+@end example
+
+@noindent
+The @code{close} function is called to ensure that if two identical
+@samp{@@execute} lines appear in the input, the command is run for
+each one.
+@xref{Close Files And Pipes, ,Closing Input and Output Files and Pipes}.
+@c Exercise!!
+@c This example is unrealistic, since you could just use system
+
+Given the input:
+
+@example
+@group
+foo
+bar
+baz
+@@execute who
+bletch
+@end group
+@end example
+
+@noindent
+the program might produce:
+
+@example
+@group
+foo
+bar
+baz
+arnold     ttyv0   Jul 13 14:22
+miriam     ttyp0   Jul 13 14:23     (murphy:0)
+bill       ttyp1   Jul 13 14:23     (murphy:0)
+bletch
+@end group
+@end example
+
+@noindent
+Notice that this program ran the command @code{who} and printed the result.
+(If you try this program yourself, you will of course get different results,
+showing you who is logged in on your system.)
+
+This variation of @code{getline} splits the record into fields, sets the
+value of @code{NF} and recomputes the value of @code{$0}.  The values of
+@code{NR} and @code{FNR} are not changed.
+
+@node Getline/Variable/Pipe, Getline Summary, Getline/Pipe, Getline
+@subsection Using @code{getline} Into a Variable from a Pipe
+
+When you use @samp{@var{command} | getline @var{var}}, the
+output of the command @var{command} is sent through a pipe to
+@code{getline} and into the variable @var{var}.  For example, the
+following program reads the current date and time into the variable
+@code{current_time}, using the @code{date} utility, and then
+prints it.
+
+@example
+@group
+awk 'BEGIN @{
+     "date" | getline current_time
+     close("date")
+     print "Report printed on " current_time
+@}'
+@end group
+@end example
+
+In this version of @code{getline}, none of the built-in variables are
+changed, and the record is not split into fields.
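+
+When a command produces more than one line, the same form can be used
+in a loop.  The following sketch (the variable names are arbitrary)
+counts the lines printed by @code{who}:
+
+@example
+@group
+awk 'BEGIN @{
+     cmd = "who"
+     while ((cmd | getline line) > 0)
+          n++
+     close(cmd)
+     print n + 0, "lines of output from", cmd
+@}'
+@end group
+@end example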
+
+@node Getline Summary,  , Getline/Variable/Pipe, Getline
+@subsection Summary of @code{getline} Variants
+
+With all the forms of @code{getline}, even though @code{$0} and @code{NF}
+may be updated, the record will not be tested against all the patterns
+in the @code{awk} program, in the way that would happen if the record
+were read normally by the main processing loop of @code{awk}.  However,
+the new record is tested against any subsequent rules.
+
+@cindex differences between @code{gawk} and @code{awk}
+@cindex limitations
+@cindex implementation limits
+Many @code{awk} implementations limit the number of pipelines an @code{awk}
+program may have open to just one!  In @code{gawk}, there is no such limit.
+You can open as many pipelines as the underlying operating system will
+permit.
+
+The following table summarizes the six variants of @code{getline},
+listing which built-in variables are set by each one.
+
+@iftex
+@page
+@end iftex
+@c @cartouche
+@table @code
+@item getline
+sets @code{$0}, @code{NF}, @code{FNR}, and @code{NR}.
+
+@item getline @var{var}
+sets @var{var}, @code{FNR}, and @code{NR}.
+
+@item getline < @var{file}
+sets @code{$0} and @code{NF}.
+
+@item getline @var{var} < @var{file}
+sets @var{var}.
+
+@item @var{command} | getline
+sets @code{$0} and @code{NF}.
+
+@item @var{command} | getline @var{var}
+sets @var{var}.
+@end table
+@c @end cartouche
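+
+Here is a short sketch of the second variant in the table (the input
+data and the variable name @code{line} are made up).  The record
+counters advance, but the current record and its fields stay as they
+were:
+
+@example
+@group
+$ printf 'a b c\nd e\n' | awk 'NR == 1 @{
+>      getline line    # sets line, NR and FNR, but not $0 or NF
+>      print NR, NF, $0, "/", line
+> @}'
+@print{} 2 3 a b c / d e
+@end group
+@end example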
+
+@node Printing, Expressions, Reading Files, Top
+@chapter Printing Output
+
+@cindex printing
+@cindex output
+One of the most common actions is to @dfn{print}, or output,
+some or all of the input.  You use the @code{print} statement
+for simple output.  You use the @code{printf} statement
+for fancier formatting.  Both are described in this chapter.
+
+@menu
+* Print::                       The @code{print} statement.
+* Print Examples::              Simple examples of @code{print} statements.
+* Output Separators::           The output separators and how to change them.
+* OFMT::                        Controlling Numeric Output With @code{print}.
+* Printf::                      The @code{printf} statement.
+* Redirection::                 How to redirect output to multiple files and
+                                pipes.
+* Special Files::               File name interpretation in @code{gawk}.
+                                @code{gawk} allows access to inherited file
+                                descriptors.
+* Close Files And Pipes::       Closing Input and Output Files and Pipes.
+@end menu
+
+@node Print, Print Examples, Printing, Printing
+@section The @code{print} Statement
+@cindex @code{print} statement
+
+The @code{print} statement does output with simple, standardized
+formatting.  You specify only the strings or numbers to be printed, in a
+list separated by commas.  They are output, separated by single spaces,
+followed by a newline.  The statement looks like this:
+
+@example
+print @var{item1}, @var{item2}, @dots{}
+@end example
+
+@noindent
+The entire list of items may optionally be enclosed in parentheses.  The
+parentheses are necessary if any of the item expressions uses the @samp{>}
+relational operator; otherwise it could be confused with a redirection
+(@pxref{Redirection, ,Redirecting Output of @code{print} and @code{printf}}).
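+
+For example, in this small sketch the parentheses make @samp{>} a
+comparison rather than a redirection:
+
+@example
+@group
+$ awk 'BEGIN @{ a = 5; b = 3; print (a > b) @}'
+@print{} 1
+@end group
+@end example
+
+@noindent
+Without the parentheses, @samp{print a > b} would instead write the
+value of @code{a} to a file named @file{3}.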
+
+The items to be printed can be constant strings or numbers, fields of the
+current record (such as @code{$1}), variables, or any @code{awk}
+expressions.
+Numeric values are converted to strings, and then printed.
+
+The @code{print} statement is completely general for
+computing @emph{what} values to print. However, with two exceptions,
+you cannot specify @emph{how} to print them---how many
+columns, whether to use exponential notation or not, and so on.
+(For the exceptions, @pxref{Output Separators}, and
+@ref{OFMT, ,Controlling Numeric Output with @code{print}}.)
+For that, you need the @code{printf} statement
+(@pxref{Printf, ,Using @code{printf} Statements for Fancier Printing}).
+
+The simple statement @samp{print} with no items is equivalent to
+@samp{print $0}: it prints the entire current record.  To print a blank
+line, use @samp{print ""}, where @code{""} is the empty string.
+
+To print a fixed piece of text, use a string constant such as
+@w{@code{"Don't Panic"}} as one item.  If you forget to use the
+double-quote characters, your text will be taken as an @code{awk}
+expression, and you will probably get an error.  Keep in mind that a
+space is printed between any two items.
+
+Each @code{print} statement makes at least one line of output.  But it
+isn't limited to one line.  If an item value is a string that contains a
+newline, the newline is output along with the rest of the string.  A
+single @code{print} can make any number of lines this way.
+
+@node Print Examples, Output Separators, Print, Printing
+@section Examples of @code{print} Statements
+
+Here is an example of printing a string that contains embedded newlines
+(the @samp{\n} is an escape sequence, used to represent the newline
+character; see @ref{Escape Sequences}):
+
+@example
+@group
+$ awk 'BEGIN @{ print "line one\nline two\nline three" @}'
+@print{} line one
+@print{} line two
+@print{} line three
+@end group
+@end example
+
+Here is an example that prints the first two fields of each input record,
+with a space between them:
+
+@example
+@group
+$ awk '@{ print $1, $2 @}' inventory-shipped
+@print{} Jan 13
+@print{} Feb 15
+@print{} Mar 15
+@dots{}
+@end group
+@end example
+
+@cindex common mistakes
+@cindex mistakes, common
+@cindex errors, common
+A common mistake in using the @code{print} statement is to omit the comma
+between two items.  This often has the effect of making the items run
+together in the output, with no space.  The reason for this is that
+juxtaposing two string expressions in @code{awk} means to concatenate
+them.  Here is the same program, without the comma:
+
+@example
+@group
+$ awk '@{ print $1 $2 @}' inventory-shipped
+@print{} Jan13
+@print{} Feb15
+@print{} Mar15
+@dots{}
+@end group
+@end example
+
+To someone unfamiliar with the file @file{inventory-shipped}, neither
+example's output makes much sense.  A heading line at the beginning
+would make it clearer.  Let's add some headings to our table of months
+(@code{$1}) and green crates shipped (@code{$2}).  We do this using the
+@code{BEGIN} pattern
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns})
+to force the headings to be printed only once:
+
+@example
+awk 'BEGIN @{  print "Month Crates"
+              print "----- ------" @}
+           @{  print $1, $2 @}' inventory-shipped
+@end example
+
+@noindent
+Did you already guess what happens? When run, the program prints
+the following:
+
+@example
+@group
+Month Crates
+----- ------
+Jan 13
+Feb 15
+Mar 15
+@dots{}
+@end group
+@end example
+
+@noindent
+The headings and the table data don't line up!  We can fix this by printing
+some spaces between the two fields:
+
+@example
+awk 'BEGIN @{ print "Month Crates"
+             print "----- ------" @}
+           @{ print $1, "     ", $2 @}' inventory-shipped
+@end example
+
+You can imagine that this way of lining up columns can get pretty
+complicated when you have many columns to fix.  Counting spaces for two
+or three columns can be simple, but more than this and you can get
+lost quite easily.  This is why the @code{printf} statement was
+created (@pxref{Printf, ,Using @code{printf} Statements for Fancier Printing});
+one of its specialties is lining up columns of data.
+
+@cindex line continuation
+As a side point,
+you can continue either a @code{print} or @code{printf} statement simply
+by putting a newline after any comma
+(@pxref{Statements/Lines, ,@code{awk} Statements Versus Lines}).
+
+@node Output Separators, OFMT, Print Examples, Printing
+@section Output Separators
+
+@cindex output field separator, @code{OFS}
+@cindex output record separator, @code{ORS}
+@vindex OFS
+@vindex ORS
+As mentioned previously, a @code{print} statement contains a list
+of items, separated by commas.  In the output, the items are normally
+separated by single spaces.  This need not be the case; a
+single space is only the default.  You can specify any string of
+characters to use as the @dfn{output field separator} by setting the
+built-in variable @code{OFS}.  The initial value of this variable
+is the string @w{@code{" "}}, that is, a single space.
+
+The output from an entire @code{print} statement is called an
+@dfn{output record}.  Each @code{print} statement outputs one output
+record and then outputs a string called the @dfn{output record separator}.
+The built-in variable @code{ORS} specifies this string.  The initial
+value of @code{ORS} is the string @code{"\n"}, i.e.@: a newline
+character; thus, normally each @code{print} statement makes a separate line.
+
+You can change how output fields and records are separated by assigning
+new values to the variables @code{OFS} and/or @code{ORS}.  The usual
+place to do this is in the @code{BEGIN} rule
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}), so
+that it happens before any input is processed.  You may also do this
+with assignments on the command line, before the names of your input
+files, or using the @samp{-v} command line option
+(@pxref{Options, ,Command Line Options}).
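+
+For example, either of the following sketches sets @code{OFS} without
+using a @code{BEGIN} rule.  With the @file{BBS-list} file used elsewhere
+in this chapter, each should print lines such as
+@samp{aardvark;555-5553}.
+
+@example
+@group
+$ awk -v OFS=";" '@{ print $1, $2 @}' BBS-list
+$ awk '@{ print $1, $2 @}' OFS=";" BBS-list
+@end group
+@end example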
+
+@ignore
+Exercise,
+Rewrite the 
+@example
+awk 'BEGIN @{ print "Month Crates"
+             print "----- ------" @}
+           @{ print $1, "     ", $2 @}' inventory-shipped
+@end example
+program by using a new value of @code{OFS}.
+@end ignore
+
+The following example prints the first and second fields of each input
+record separated by a semicolon, with a blank line added after each
+line:
+
+@example
+@group
+$ awk 'BEGIN @{ OFS = ";"; ORS = "\n\n" @}
+>            @{ print $1, $2 @}' BBS-list
+@print{} aardvark;555-5553
+@print{} 
+@print{} alpo-net;555-3412
+@print{} 
+@print{} barfly;555-7685
+@dots{}
+@end group
+@end example
+
+If the value of @code{ORS} does not contain a newline, all your output
+will be run together on a single line, unless you output newlines some
+other way.
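+
+As a sketch of this effect, the following command prints the first
+field of every record on one line, separated by spaces.  The
+@code{END} rule supplies the final newline with @code{printf}, which
+does not use @code{ORS}:
+
+@example
+@group
+$ awk 'BEGIN @{ ORS = " " @}
+>      @{ print $1 @}
+>      END @{ printf "\n" @}' BBS-list
+@end group
+@end example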
+
+@node OFMT, Printf, Output Separators, Printing
+@section Controlling Numeric Output with @code{print}
+@vindex OFMT
+@cindex numeric output format
+@cindex format, numeric output
+@cindex output format specifier, @code{OFMT}
+When you use the @code{print} statement to print numeric values,
+@code{awk} internally converts the number to a string of characters,
+and prints that string.  @code{awk} uses the @code{sprintf} function
+to do this conversion
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+For now, it suffices to say that the @code{sprintf}
+function accepts a @dfn{format specification} that tells it how to format
+numbers (or strings), and that there are a number of different ways in which
+numbers can be formatted.  The different format specifications are discussed
+more fully in
+@ref{Control Letters, , Format-Control Letters}.
+
+The built-in variable @code{OFMT} contains the default format specification
+that @code{print} uses with @code{sprintf} when it wants to convert a
+number to a string for printing.
+The default value of @code{OFMT} is @code{"%.6g"}.
+By supplying different format specifications
+as the value of @code{OFMT}, you can change how @code{print} will print
+your numbers.  As a brief example:
+
+@example
+@group
+$ awk 'BEGIN @{
+>   OFMT = "%.0f"  # print numbers as integers (rounds)
+>   print 17.23 @}'
+@print{} 17
+@end group
+@end example
+
+@noindent
+@cindex dark corner
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+According to the POSIX standard, @code{awk}'s behavior will be undefined
+if @code{OFMT} contains anything but a floating point conversion specification
+(d.c.).
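+
+As a further sketch, the following sets @code{OFMT} to print two digits
+after the decimal point.  Values that are exact integers are printed as
+integers, so @code{OFMT} should affect only the first number here:
+
+@example
+@group
+$ awk 'BEGIN @{ OFMT = "%.2f"; print 3.14159, 17 @}'
+@print{} 3.14 17
+@end group
+@end example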
+
+@node Printf, Redirection, OFMT, Printing
+@section Using @code{printf} Statements for Fancier Printing
+@cindex formatted output
+@cindex output, formatted
+
+If you want more precise control over the output format than
+@code{print} gives you, use @code{printf}.  With @code{printf} you can
+specify the width to use for each item, and you can specify various
+formatting choices for numbers (such as what radix to use, whether to
+print an exponent, whether to print a sign, and how many digits to print
+after the decimal point).  You do this by supplying a string, called
+the @dfn{format string}, which controls how and where to print the other
+arguments.
+
+@menu
+* Basic Printf::                Syntax of the @code{printf} statement.
+* Control Letters::             Format-control letters.
+* Format Modifiers::            Format-specification modifiers.
+* Printf Examples::             Several examples.
+@end menu
+
+@node Basic Printf, Control Letters, Printf, Printf
+@subsection Introduction to the @code{printf} Statement
+
+@cindex @code{printf} statement, syntax of
+The @code{printf} statement looks like this:
+
+@example
+printf @var{format}, @var{item1}, @var{item2}, @dots{}
+@end example
+
+@noindent
+The entire list of arguments may optionally be enclosed in parentheses.  The
+parentheses are necessary if any of the item expressions use the @samp{>}
+relational operator; otherwise it could be confused with a redirection
+(@pxref{Redirection, ,Redirecting Output of @code{print} and @code{printf}}).
+
+@cindex format string
+The difference between @code{printf} and @code{print} is the @var{format}
+argument.  This is an expression whose value is taken as a string; it
+specifies how to output each of the other arguments.  It is called
+the @dfn{format string}.
+
+The format string is very similar to that in the ANSI C library function
+@code{printf}.  Most of @var{format} is text to be output verbatim.
+Scattered among this text are @dfn{format specifiers}, one per item.
+Each format specifier says to output the next item in the argument list 
+at that place in the format.
+
+The @code{printf} statement does not automatically append a newline to its
+output.  It outputs only what the format string specifies.  So if you want
+a newline, you must include one in the format string.  The output separator
+variables @code{OFS} and @code{ORS} have no effect on @code{printf}
+statements. For example:
+
+@example
+@group
+BEGIN @{
+   ORS = "\nOUCH!\n"; OFS = "!"
+   msg = "Don't Panic!"; printf "%s\n", msg
+@}
+@end group
+@end example
+
+This program still prints the familiar @samp{Don't Panic!} message.
+
+@node Control Letters, Format Modifiers, Basic Printf, Printf
+@subsection Format-Control Letters
+@cindex @code{printf}, format-control characters
+@cindex format specifier
+
+A format specifier starts with the character @samp{%} and ends with a
+@dfn{format-control letter}; it tells the @code{printf} statement how
+to output one item.  (If you actually want to output a @samp{%}, write
+@samp{%%}.)  The format-control letter specifies what kind of value to
+print.  The rest of the format specifier is made up of optional
+@dfn{modifiers} which are parameters to use, such as the field width.
+
+Here is a list of the format-control letters:
+
+@table @code
+@item c
+This prints a number as an ASCII character.  Thus, @samp{printf "%c",
+65} outputs the letter @samp{A}.  The output for a string value is
+the first character of the string.
+
+@iftex
+@page
+@end iftex
+@item d
+@itemx i
+These are equivalent. They both print a decimal integer.
+The @samp{%i} specification is for compatibility with ANSI C.
+
+@item e
+@itemx E
+This prints a number in scientific (exponential) notation.
+For example,
+
+@example
+printf "%4.3e\n", 1950
+@end example
+
+@noindent
+prints @samp{1.950e+03}, with a total of four significant figures of
+which three follow the decimal point.  The @samp{4.3} are modifiers,
+discussed below. @samp{%E} uses @samp{E} instead of @samp{e} in the output. 
+
+@item f
+This prints a number in floating point notation.
+For example,
+
+@example
+printf "%4.3f", 1950
+@end example
+
+@noindent
+prints @samp{1950.000}, with three digits following the decimal point.
+The @samp{4.3} are modifiers, discussed below; the @samp{4} is only a
+minimum field width, so the longer result is still printed in full.
+
+@item g
+@itemx G
+This prints a number in either scientific notation or floating point
+notation, whichever uses fewer characters. If the result is printed in
+scientific notation, @samp{%G} uses @samp{E} instead of @samp{e}.
+
+@item o
+This prints an unsigned octal integer.
+(In octal, or base-eight notation, the digits run from @samp{0} to @samp{7};
+the decimal number eight is represented as @samp{10} in octal.)
+
+@item s
+This prints a string.
+
+@item x
+@itemx X
+This prints an unsigned hexadecimal integer.
+(In hexadecimal, or base-16 notation, the digits are @samp{0} through @samp{9}
+and @samp{a} through @samp{f}.  The hexadecimal digit @samp{f} represents
+the decimal number 15.) @samp{%X} uses the letters @samp{A} through @samp{F}
+instead of @samp{a} through @samp{f}.
+
+@item %
+This isn't really a format-control letter, but it does have a meaning
+when used after a @samp{%}: the sequence @samp{%%} outputs one
+@samp{%}.  It does not consume an argument, and it ignores any
+modifiers.
+@end table
+
+@cindex dark corner
+When using the integer format-control letters for values that are outside
+the range of a C @code{long} integer, @code{gawk} will switch to the
+@samp{%g} format specifier. Other versions of @code{awk} may print
+invalid values, or do something else entirely (d.c.).
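+
+As a quick, informal illustration of several of these control letters
+(the argument values are arbitrary and chosen only for this sketch):
+
+@example
+@group
+$ awk 'BEGIN @{ printf "%c %d %o %x %X %e %%\n",
+>                     65, 42, 8, 255, 255, 1950 @}'
+@print{} A 42 10 ff FF 1.950000e+03 %
+@end group
+@end example
+
+@noindent
+Note that @samp{%e} with no precision modifier prints six digits after
+the decimal point.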
+
+@node Format Modifiers, Printf Examples, Control Letters, Printf
+@subsection Modifiers for @code{printf} Formats
+
+@cindex @code{printf}, modifiers
+@cindex modifiers (in format specifiers)
+A format specification can also include @dfn{modifiers} that can control
+how much of the item's value is printed and how much space it gets.  The
+modifiers come between the @samp{%} and the format-control letter.
+In the examples below, we use the bullet symbol ``@bullet{}'' to represent
+spaces in the output. Here are the possible modifiers, in the order in
+which they may appear:
+
+@table @code
+@item -
+The minus sign, used before the width modifier (see below),
+says to left-justify
+the argument within its specified width.  Normally the argument
+is printed right-justified in the specified width.  Thus,
+
+@example
+printf "%-4s", "foo"
+@end example
+
+@noindent
+prints @samp{foo@bullet{}}.
+
+@item @var{space}
+For numeric conversions, prefix positive values with a space, and
+negative values with a minus sign.
+
+@item +
+The plus sign, used before the width modifier (see below),
+says to always supply a sign for numeric conversions, even if the data
+to be formatted is positive. The @samp{+} overrides the space modifier.
+
+@item #
+Use an ``alternate form'' for certain control letters.
+For @samp{%o}, supply a leading zero.
+For @samp{%x} and @samp{%X}, supply a leading @samp{0x} or @samp{0X} for
+a non-zero result.
+For @samp{%e}, @samp{%E}, and @samp{%f}, the result will always contain a
+decimal point.
+For @samp{%g} and @samp{%G}, trailing zeros are not removed from the result.
+
+@cindex dark corner
+@item 0
+A leading @samp{0} (zero) acts as a flag that indicates output should be
+padded with zeros instead of spaces.
+This applies even to non-numeric output formats (d.c.).
+This flag only has an effect when the field width is wider than the
+value to be printed.
+
+@item @var{width}
+This is a number specifying the desired minimum width of a field.  Inserting any
+number between the @samp{%} sign and the format control character forces the
+field to be expanded to this width.  The default way to do this is to
+pad with spaces on the left.  For example,
+
+@example
+printf "%4s", "foo"
+@end example
+
+@noindent
+prints @samp{@bullet{}foo}.
+
+The value of @var{width} is a minimum width, not a maximum.  If the item
+value requires more than @var{width} characters, it can be as wide as
+necessary.  Thus,
+
+@example
+printf "%4s", "foobar"
+@end example
+
+@noindent
+prints @samp{foobar}.
+
+Preceding the @var{width} with a minus sign causes the output to be
+padded with spaces on the right, instead of on the left.
+
+@item .@var{prec}
+This is a number that specifies the precision to use when printing.
+For the @samp{e}, @samp{E}, and @samp{f} formats, this specifies the
+number of digits you want printed to the right of the decimal point.
+For the @samp{g} and @samp{G} formats, it specifies the maximum number
+of significant digits.  For the @samp{d}, @samp{o}, @samp{i}, @samp{u},
+@samp{x}, and @samp{X} formats, it specifies the minimum number of
+digits to print.  For a string, it specifies the maximum number of
+characters from the string that should be printed.  Thus,
+
+@example
+printf "%.4s", "foobar"
+@end example
+
+@noindent
+prints @samp{foob}.
+@end table
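+
+The following sketch combines several of the flags described above; the
+values are arbitrary, and the @samp{|} characters merely delimit the
+individual results:
+
+@example
+@group
+$ awk 'BEGIN @{ printf "%+d|% d|%05d|%#o|%#x|%.2f\n",
+>                     5, 5, 42, 8, 255, 3.14159 @}'
+@print{} +5| 5|00042|010|0xff|3.14
+@end group
+@end example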
+
+The C library @code{printf}'s dynamic @var{width} and @var{prec}
+capability (for example, @code{"%*.*s"}) is supported.  Instead of
+supplying explicit @var{width} and/or @var{prec} values in the format
+string, you pass them in the argument list.  For example:
+
+@example
+w = 5
+p = 3
+s = "abcdefg"
+printf "%*.*s\n", w, p, s
+@end example
+
+@noindent
+is exactly equivalent to
+
+@example
+s = "abcdefg"
+printf "%5.3s\n", s
+@end example
+
+@noindent
+Both programs output @samp{@w{@bullet{}@bullet{}abc}}.
+
+Earlier versions of @code{awk} did not support this capability.
+If you must use such a version, you may simulate this feature by using
+concatenation to build up the format string, like so:
+
+@example
+w = 5
+p = 3
+s = "abcdefg"
+printf "%" w "." p "s\n", s
+@end example
+
+@noindent
+This is not particularly easy to read, but it does work.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+C programmers may be used to supplying additional @samp{l} and @samp{h}
+flags in @code{printf} format strings. These are not valid in @code{awk}.
+Most @code{awk} implementations silently ignore these flags.
+If @samp{--lint} is provided on the command line
+(@pxref{Options, ,Command Line Options}),
+@code{gawk} will warn about their use. If @samp{--posix} is supplied,
+their use is a fatal error.
+
+@node Printf Examples,  , Format Modifiers, Printf
+@subsection Examples Using @code{printf}
+
+Here is how to use @code{printf} to make an aligned table:
+
+@example
+awk '@{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@end example
+
+@noindent
+prints the names of bulletin boards (@code{$1}) of the file
+@file{BBS-list} as a string of 10 characters, left justified.  It also
+prints the phone numbers (@code{$2}) afterward on the line.  This
+produces an aligned two-column table of names and phone numbers:
+
+@example
+@group
+$ awk '@{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@print{} aardvark   555-5553
+@print{} alpo-net   555-3412
+@print{} barfly     555-7685
+@print{} bites      555-1675
+@print{} camelot    555-0542
+@print{} core       555-2912
+@print{} fooey      555-1234
+@print{} foot       555-6699
+@print{} macfoo     555-6480
+@print{} sdace      555-3430
+@print{} sabafoo    555-2127
+@end group
+@end example
+
+Did you notice that we did not specify that the phone numbers be printed
+as numbers?  They had to be printed as strings because the numbers are
+separated by a dash.
+If we had tried to print the phone numbers as numbers, all we would have
+gotten would have been the first three digits, @samp{555}.
+This would have been pretty confusing.
+
+We did not specify a width for the phone numbers because they are the
+last things on their lines.  We don't need to put spaces after them.
+
+We could make our table look even nicer by adding headings to the tops
+of the columns.  To do this, we use the @code{BEGIN} pattern
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns})
+to force the header to be printed only once, at the beginning of
+the @code{awk} program:
+
+@example
+@group
+awk 'BEGIN @{ print "Name      Number"
+             print "----      ------" @}
+     @{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@end group
+@end example
+
+Did you notice that we mixed @code{print} and @code{printf} statements in
+the above example?  We could have used just @code{printf} statements to get
+the same results:
+
+@example
+@group
+awk 'BEGIN @{ printf "%-10s %s\n", "Name", "Number"
+             printf "%-10s %s\n", "----", "------" @}
+     @{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@end group
+@end example
+
+@noindent
+By printing each column heading with the same format specification
+used for the elements of the column, we have made sure that the headings
+are aligned just like the columns.
+
+The fact that the same format specification is used three times can be
+emphasized by storing it in a variable, like this:
+
+@example
+@group
+awk 'BEGIN @{ format = "%-10s %s\n"
+             printf format, "Name", "Number"
+             printf format, "----", "------" @}
+     @{ printf format, $1, $2 @}' BBS-list
+@end group
+@end example
+
+@c !!! exercise
+See if you can use the @code{printf} statement to line up the headings and
+table data for our @file{inventory-shipped} example covered earlier in the
+section on the @code{print} statement
+(@pxref{Print, ,The @code{print} Statement}).
+
+@node Redirection, Special Files, Printf, Printing
+@section Redirecting Output of @code{print} and @code{printf}
+
+@cindex output redirection
+@cindex redirection of output
+So far we have been dealing only with output that prints to the standard
+output, usually your terminal.  Both @code{print} and @code{printf} can
+also send their output to other places.
+This is called @dfn{redirection}.
+
+A redirection appears after the @code{print} or @code{printf} statement.
+Redirections in @code{awk} are written just like redirections in shell
+commands, except that they are written inside the @code{awk} program.
+
+There are three forms of output redirection: output to a file,
+output appended to a file, and output through a pipe to another
+command.
+They are all shown for the @code{print} statement, but they work
+identically for @code{printf}.
+
+@table @code
+@item print @var{items} > @var{output-file}
+This type of redirection prints the items into the output file
+@var{output-file}.  The file name @var{output-file} can be any
+expression.  Its value is changed to a string and then used as a
+file name (@pxref{Expressions}).
+
+When this type of redirection is used, the @var{output-file} is erased
+before the first output is written to it.  Subsequent writes
+to the same @var{output-file} do not
+erase @var{output-file}, but append to it.  If @var{output-file} does
+not exist, then it is created.
+
+For example, here is how an @code{awk} program can write a list of
+BBS names to a file @file{name-list} and a list of phone numbers to a
+file @file{phone-list}.  Each output file contains one name or number
+per line.
+
+@example
+@group
+$ awk '@{ print $2 > "phone-list"
+>        print $1 > "name-list" @}' BBS-list
+@end group
+@group
+$ cat phone-list
+@print{} 555-5553
+@print{} 555-3412
+@dots{}
+@end group
+@group
+$ cat name-list
+@print{} aardvark
+@print{} alpo-net
+@dots{}
+@end group
+@end example
+
+@item print @var{items} >> @var{output-file}
+This type of redirection prints the items into the output file
+@var{output-file}.  The difference between this and the
+single-@samp{>} redirection is that the old contents (if any) of
+@var{output-file} are not erased.  Instead, the @code{awk} output is
+appended to the file.
+If @var{output-file} does not exist, then it is created.
+
+@cindex pipes for output
+@cindex output, piping
+@item print @var{items} | @var{command}
+It is also possible to send output to another program through a pipe
+instead of into a
+file.   This type of redirection opens a pipe to @var{command} and writes
+the values of @var{items} through this pipe, to another process created
+to execute @var{command}.
+
+The redirection argument @var{command} is actually an @code{awk}
+expression.  Its value is converted to a string, whose contents give the
+shell command to be run.
+
+For example, this produces two files, one unsorted list of BBS names
+and one list sorted in reverse alphabetical order:
+
+@example
+awk '@{ print $1 > "names.unsorted"
+       command = "sort -r > names.sorted"
+       print $1 | command @}' BBS-list
+@end example
+
+Here the unsorted list is written with an ordinary redirection while
+the sorted list is written by piping through the @code{sort} utility.
+
+This example uses redirection to mail a message to a mailing
+list @samp{bug-system}.  This might be useful when trouble is encountered
+in an @code{awk} script run periodically for system maintenance.
+
+@example
+report = "mail bug-system"
+print "Awk script failed:", $0 | report
+m = ("at record number " FNR " of " FILENAME)
+print m | report
+close(report)
+@end example
+
+The message is built using string concatenation and saved in the variable
+@code{m}.  It is then sent down the pipeline to the @code{mail} program.
+
+We call the @code{close} function here because it's a good idea to close
+the pipe as soon as all the intended output has been sent to it.
+@xref{Close Files And Pipes, ,Closing Input and Output Files and Pipes},
+for more information
+on this.  This example also illustrates the use of a variable to represent
+a @var{file} or @var{command}: it is not necessary to always
+use a string constant.  Using a variable is generally a good idea,
+since @code{awk} requires you to spell the string value identically
+every time.
+@end table
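+
+Because the target of a redirection can be any expression, the output
+file may also be chosen separately for each record.  As a small sketch
+(the @file{.list} file names are invented for this illustration), the
+following command separates the BBS names into files according to their
+first letter:
+
+@example
+@group
+$ awk '@{ out = substr($1, 1, 1) ".list"
+>        print $1 > out @}' BBS-list
+$ cat a.list
+@print{} aardvark
+@print{} alpo-net
+@end group
+@end example
+
+@noindent
+Storing the computed name in a variable first avoids any question about
+how @samp{>} groups with the concatenation.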
+
+Redirecting output using @samp{>}, @samp{>>}, or @samp{|} asks the system
+to open a file or pipe only if the particular @var{file} or @var{command}
+you've specified has not already been written to by your program, or if
+it has been closed since it was last written to.
+
+@cindex differences between @code{gawk} and @code{awk}
+@cindex limitations
+@cindex implementation limits
+Many @code{awk} implementations limit the number of pipelines an @code{awk}
+program may have open to just one!  In @code{gawk}, there is no such limit.
+You can open as many pipelines as the underlying operating system will
+permit.
+
+@node Special Files, Close Files And Pipes , Redirection, Printing
+@section Special File Names in @code{gawk}
+@cindex standard input
+@cindex standard output
+@cindex standard error output
+@cindex file descriptors
+
+Running programs conventionally have three input and output streams
+already available to them for reading and writing.  These are known as
+the @dfn{standard input}, @dfn{standard output}, and @dfn{standard error
+output}.  These streams are, by default, connected to your terminal, but
+they are often redirected with the shell, via the @samp{<}, @samp{<<},
+@samp{>}, @samp{>>}, @samp{>&} and @samp{|} operators.  Standard error
+is typically used for writing error messages; the reason we have two separate
+streams, standard output and standard error, is so that they can be
+redirected separately.
+
+@cindex differences between @code{gawk} and @code{awk}
+In other implementations of @code{awk}, the only way to write an error
+message to standard error in an @code{awk} program is as follows:
+
+@example
+print "Serious error detected!" | "cat 1>&2"
+@end example
+
+@noindent
+This works by opening a pipeline to a shell command which can access the
+standard error stream which it inherits from the @code{awk} process.
+This is far from elegant, and is also inefficient, since it requires a
+separate process.  So people writing @code{awk} programs often
+neglect to do this.  Instead, they send the error messages to the
+terminal, like this:
+
+@example
+@group
+print "Serious error detected!" > "/dev/tty"
+@end group
+@end example
+
+@noindent
+This usually has the same effect, but not always: although the
+standard error stream is usually the terminal, it can be redirected, and
+when that happens, writing to the terminal is not correct.  In fact, if
+@code{awk} is run from a background job, it may not have a terminal at all.
+Then opening @file{/dev/tty} will fail.
+
+@code{gawk} provides special file names for accessing the three standard
+streams.  When you redirect input or output in @code{gawk}, if the file name
+matches one of these special names, then @code{gawk} directly uses the
+stream it stands for.
+
+@cindex @file{/dev/stdin}
+@cindex @file{/dev/stdout}
+@cindex @file{/dev/stderr}
+@cindex @file{/dev/fd}
+@c @cartouche
+@table @file
+@item /dev/stdin
+The standard input (file descriptor 0).
+
+@item /dev/stdout
+The standard output (file descriptor 1).
+
+@item /dev/stderr
+The standard error output (file descriptor 2).
+
+@item /dev/fd/@var{N}
+The file associated with file descriptor @var{N}.  Such a file must have
+been opened by the program initiating the @code{awk} execution (typically
+the shell).  Unless you take special pains in the shell from which
+you invoke @code{gawk}, only descriptors 0, 1 and 2 are available.
+@end table
+@c @end cartouche
+
+The file names @file{/dev/stdin}, @file{/dev/stdout}, and @file{/dev/stderr}
+are aliases for @file{/dev/fd/0}, @file{/dev/fd/1}, and @file{/dev/fd/2},
+respectively, but they are more self-explanatory.
+
+The proper way to write an error message in a @code{gawk} program
+is to use @file{/dev/stderr}, like this:
+
+@example
+print "Serious error detected!" > "/dev/stderr"
+@end example
+
+@code{gawk} also provides special file names that give access to information
+about the running @code{gawk} process.  Each of these ``files'' provides
+a single record of information.  To read them more than once, you must
+first close them with the @code{close} function
+(@pxref{Close Files And Pipes, ,Closing Input and Output Files and Pipes}).
+The filenames are:
+
+@cindex process information
+@cindex @file{/dev/pid}
+@cindex @file{/dev/pgrpid}
+@cindex @file{/dev/ppid}
+@cindex @file{/dev/user}
+@c @cartouche
+@table @file
+@item /dev/pid
+Reading this file returns the process ID of the current process,
+in decimal, terminated with a newline.
+
+@item  /dev/ppid
+Reading this file returns the parent process ID of the current process,
+in decimal, terminated with a newline.
+
+@item  /dev/pgrpid
+Reading this file returns the process group ID of the current process,
+in decimal, terminated with a newline.
+
+@item /dev/user
+Reading this file returns a single record terminated with a newline.
+The fields are separated with spaces.  The fields represent the
+following information:
+
+@table @code
+@item $1
+The return value of the @code{getuid} system call
+(the real user ID number).
+
+@item $2
+The return value of the @code{geteuid} system call
+(the effective user ID number).
+
+@item $3
+The return value of the @code{getgid} system call
+(the real group ID number).
+
+@item $4
+The return value of the @code{getegid} system call
+(the effective group ID number).
+@end table
+
+If there are any additional fields, they are the group IDs returned by
+the @code{getgroups} system call.
+(Multiple groups may not be supported on all systems.)
+@end table
+@c @end cartouche
+
+These special file names may be used on the command line as data
+files, as well as for I/O redirections within an @code{awk} program.
+They may not be used as source files with the @samp{-f} option.
+
+Recognition of these special file names is disabled if @code{gawk} is in
+compatibility mode (@pxref{Options, ,Command Line Options}).
+
+@strong{Caution}:  Unless your system actually has a @file{/dev/fd} directory
+(or any of the other above listed special files),
+the interpretation of these file names is done by @code{gawk} itself.
+For example, using @samp{/dev/fd/4} for output will actually write on
+file descriptor 4, and not on a new file descriptor that was @code{dup}'ed
+from file descriptor 4.  Most of the time this does not matter; however, it
+is important to @emph{not} close any of the files related to file descriptors
+0, 1, and 2.  If you do close one of these files, unpredictable behavior
+will result.
+
+The special files that provide process-related information may disappear
+in a future version of @code{gawk}.
+@xref{Future Extensions, ,Probable Future Extensions}.
+
+@node Close Files And Pipes, , Special Files, Printing
+@section Closing Input and Output Files and Pipes
+@cindex closing input files and pipes
+@cindex closing output files and pipes
+@findex close
+
+If the same file name or the same shell command is used with
+@code{getline}
+(@pxref{Getline, ,Explicit Input with @code{getline}})
+more than once during the execution of an @code{awk}
+program, the file is opened (or the command is executed) only the first time.
+At that time, the first record of input is read from that file or command.
+The next time the same file or command is used in @code{getline}, another
+record is read from it, and so on.
+
+Similarly, when a file or pipe is opened for output, the file name or command
+associated with
+it is remembered by @code{awk} and subsequent writes to the same file or
+command are appended to the previous writes.  The file or pipe stays
+open until @code{awk} exits.
+
+This implies that if you want to start reading the same file again from
+the beginning, or if you want to rerun a shell command (rather than
+reading more output from the command), you must take special steps.
+What you must do is use the @code{close} function, as follows:
+
+@example
+close(@var{filename})
+@end example
+
+@noindent
+or
+
+@example
+close(@var{command})
+@end example
+
+The argument @var{filename} or @var{command} can be any expression.  Its
+value must @emph{exactly} match the string that was used to open the file or
+start the command (spaces and other ``irrelevant'' characters
+included). For example, if you open a pipe with this:
+
+@example
+"sort -r names" | getline foo
+@end example
+
+@noindent
+then you must close it with this:
+
+@example
+close("sort -r names")
+@end example
+
+Once this function call is executed, the next @code{getline} from that
+file or command, or the next @code{print} or @code{printf} to that
+file or command, will reopen the file or rerun the command.
+
+Because the expression that you use to close a file or pipeline must
+exactly match the expression used to open the file or run the command,
+it is good practice to use a variable to store the file name or command.
+The previous example would become
+
+@example
+sortcom = "sort -r names"
+sortcom | getline foo
+@dots{}
+close(sortcom)
+@end example
+
+@noindent
+This helps avoid hard-to-find typographical errors in your @code{awk}
+programs.
+
+Here are some reasons why you might need to close an output file:
+
+@itemize @bullet
+@item
+To write a file and read it back later on in the same @code{awk}
+program.  Close the file when you are finished writing it; then
+you can start reading it with @code{getline}.
+
+@item
+To write numerous files, successively, in the same @code{awk}
+program.  If you don't close the files, eventually you may exceed a
+system limit on the number of open files in one process.  So close
+each one when you are finished writing it.
+
+@item
+To make a command finish.  When you redirect output through a pipe,
+the command reading the pipe normally continues to try to read input
+as long as the pipe is open.  Often this means the command cannot
+really do its work until the pipe is closed.  For example, if you
+redirect output to the @code{mail} program, the message is not
+actually sent until the pipe is closed.
+
+@item
+To run the same program a second time, with the same arguments.
+This is not the same thing as giving more input to the first run!
+
+For example, suppose you pipe output to the @code{mail} program.  If you
+output several lines redirected to this pipe without closing it, they make
+a single message of several lines.  By contrast, if you close the pipe
+after each line of output, then each line makes a separate message.
+@end itemize
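+
+The first of these reasons can be sketched as follows.  The file name
+@file{scratch.tmp} is hypothetical; any writable file name would do:
+
+@example
+@group
+BEGIN @{
+    tmp = "scratch.tmp"
+    print "first line" > tmp
+    print "second line" > tmp
+    close(tmp)       # finish writing before reading
+    while ((getline line < tmp) > 0)
+        n++
+    close(tmp)
+    print n, "lines read back"
+@}
+@end group
+@end example
+
+@noindent
+This program prints @samp{2 lines read back}.  Without the first call
+to @code{close}, the output might not yet have been flushed to the file
+when @code{getline} tried to read it.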
+
+@vindex ERRNO
+@cindex differences between @code{gawk} and @code{awk}
+@code{close} returns a value of zero if the close succeeded.
+Otherwise, the value will be non-zero.
+In this case, @code{gawk} sets the variable @code{ERRNO} to a string
+describing the error that occurred.
+
+@cindex differences between @code{gawk} and @code{awk}
+@cindex portability issues
+If you use more files than the system allows you to have open,
+@code{gawk} will attempt to multiplex the available open files among
+your data files.  @code{gawk}'s ability to do this depends upon the
+facilities of your operating system: it may not always work.  It is
+therefore both good practice and good portability advice to always
+use @code{close} on your files when you are done with them.
+
+@node Expressions, Patterns and Actions, Printing, Top
+@chapter Expressions
+@cindex expression
+
+Expressions are the basic building blocks of @code{awk} patterns
+and actions.  An expression evaluates to a value, which you can print, test,
+store in a variable or pass to a function.  Additionally, an expression
+can assign a new value to a variable or a field, with an assignment operator.
+
+An expression can serve as a pattern or action statement on its own.
+Most other kinds of
+statements contain one or more expressions which specify data on which to
+operate.  As in other languages, expressions in @code{awk} include
+variables, array references, constants, and function calls, as well as
+combinations of these with various operators.
+
+@menu
+* Constants::                   String, numeric, and regexp constants.
+* Using Constant Regexps::      When and how to use a regexp constant.
+* Variables::                   Variables give names to values for later use.
+* Conversion::                  The conversion of strings to numbers and vice
+                                versa.
+* Arithmetic Ops::              Arithmetic operations (@samp{+}, @samp{-},
+                                etc.)
+* Concatenation::               Concatenating strings.
+* Assignment Ops::              Changing the value of a variable or a field.
+* Increment Ops::               Incrementing the numeric value of a variable.
+* Truth Values::                What is ``true'' and what is ``false''.
+* Typing and Comparison::       How variables acquire types, and how this
+                                affects comparison of numbers and strings with
+                                @samp{<}, etc.
+* Boolean Ops::                 Combining comparison expressions using boolean
+                                operators @samp{||} (``or''), @samp{&&}
+                                (``and'') and @samp{!} (``not'').
+* Conditional Exp::             Conditional expressions select between two
+                                subexpressions under control of a third
+                                subexpression.
+* Function Calls::              A function call is an expression.
+* Precedence::                  How various operators nest.
+@end menu
+
+@node Constants, Using Constant Regexps, Expressions, Expressions
+@section Constant Expressions
+@cindex constants, types of
+@cindex string constants
+
+The simplest type of expression is the @dfn{constant}, which always has
+the same value.  There are three types of constants: numeric constants,
+string constants, and regular expression constants.
+
+@menu
+* Scalar Constants::            Numeric and string constants.
+* Regexp Constants::            Regular Expression constants.
+@end menu
+
+@node Scalar Constants, Regexp Constants, Constants, Constants
+@subsection Numeric and String Constants
+
+@cindex numeric constant
+@cindex numeric value
+A @dfn{numeric constant} stands for a number.  This number can be an
+integer, a decimal fraction, or a number in scientific (exponential)
+notation.@footnote{The internal representation uses double-precision
+floating point numbers. If you don't know what that means, then don't
+worry about it.} Here are some examples of numeric constants, which all
+have the same value:
+
+@example
+105
+1.05e+2
+1050e-1
+@end example
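+
+One informal way to confirm that these constants are equal is to compare
+them directly; each comparison yields one when it is true:
+
+@example
+@group
+$ awk 'BEGIN @{ print (105 == 1.05e+2), (105 == 1050e-1) @}'
+@print{} 1 1
+@end group
+@end example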
+
+A string constant consists of a sequence of characters enclosed in
+double-quote marks.  For example:
+
+@example
+"parrot"
+@end example
+
+@noindent
+@cindex differences between @code{gawk} and @code{awk}
+represents the string whose contents are @samp{parrot}.  Strings in
+@code{gawk} can be of any length and they can contain any of the possible
+eight-bit characters, including ASCII NUL (character code zero).
+Other @code{awk}
+implementations may have difficulty with some character codes.
+
+@node Regexp Constants,  , Scalar Constants, Constants
+@subsection Regular Expression Constants
+
+@cindex @code{~} operator
+@cindex @code{!~} operator
+A regexp constant is a regular expression description enclosed in
+slashes, such as @code{@w{/^beginning and end$/}}.  Most regexps used in
+@code{awk} programs are constant, but the @samp{~} and @samp{!~}
+matching operators can also match computed or ``dynamic'' regexps
+(which are just ordinary strings or variables that contain a regexp).
+
+@node Using Constant Regexps, Variables, Constants, Expressions
+@section Using Regular Expression Constants
+
+When used on the right-hand side of the @samp{~} or @samp{!~}
+operators, a regexp constant merely stands for the regexp that is to be
+matched.
+
+@cindex dark corner
+Regexp constants (such as @code{/foo/}) may be used like simple expressions.
+When a
+regexp constant appears by itself, it has the same meaning as if it appeared
+in a pattern, i.e.@: @samp{($0 ~ /foo/)} (d.c.)
+(@pxref{Expression Patterns, ,Expressions as Patterns}).
+This means that the two code segments,
+
+@example
+if ($0 ~ /barfly/ || $0 ~ /camelot/)
+    print "found"
+@end example
+
+@noindent
+and
+
+@example
+if (/barfly/ || /camelot/)
+    print "found"
+@end example
+
+@noindent
+are exactly equivalent.
+
+One rather bizarre consequence of this rule is that the following
+boolean expression is valid, but does not do what the user probably
+intended:
+
+@example
+# note that /foo/ is on the left of the ~
+if (/foo/ ~ $1) print "found foo"
+@end example
+
+@noindent
+This code is ``obviously'' testing @code{$1} for a match against the regexp
+@code{/foo/}.  But in fact, the expression @samp{/foo/ ~ $1} actually means
+@samp{($0 ~ /foo/) ~ $1}.  In other words, first match the input record
+against the regexp @code{/foo/}.  The result will be either zero or one,
+depending upon the success or failure of the match.  Then match that result
+against the first field in the record.
+
+Since it is unlikely that you would ever really wish to make this kind of
+test, @code{gawk} will issue a warning when it sees this construct in
+a program.
+
+Another consequence of this rule is that the assignment statement
+
+@example
+matches = /foo/
+@end example
+
+@noindent
+will assign either zero or one to the variable @code{matches}, depending
+upon the contents of the current input record.
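+
+As a minimal sketch of this rule (the two sample input lines and the shell
+variable @code{out} are illustrative assumptions), the following stores the
+value of a bare regexp constant for each record and prints it:
+
```shell
# Each record is implicitly matched against /foo/; the result (1 or 0) is stored.
out=$(printf 'foo bar\nbaz\n' | awk '{ matches = /foo/; print matches }')
echo "$out"
```
+
+@noindent
+The first record contains @samp{foo} and the second does not, so the
+program should print one and then zero.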
+
+This feature of the language was never well documented until the
+POSIX specification.
+
+@cindex differences between @code{gawk} and @code{awk}
+@cindex dark corner
+Constant regular expressions are also used as the first argument for
+the @code{gensub}, @code{sub} and @code{gsub} functions, and as the
+second argument of the @code{match} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+Modern implementations of @code{awk}, including @code{gawk}, allow
+the third argument of @code{split} to be a regexp constant, while some
+older implementations do not (d.c.).
+
+This can lead to confusion when attempting to use regexp constants
+as arguments to user-defined functions
+(@pxref{User-defined, , User-defined Functions}).
+For example:
+
+@example
+function mysub(pat, repl, str, global)
+@{
+    if (global)
+        gsub(pat, repl, str)
+    else
+        sub(pat, repl, str)
+    return str
+@}
+
+@{
+    @dots{}
+    text = "hi! hi yourself!"
+    mysub(/hi/, "howdy", text, 1)
+    @dots{}
+@}
+@end example
+
+In this example, the programmer wishes to pass a regexp constant to the
+user-defined function @code{mysub}, which will in turn pass it on to
+either @code{sub} or @code{gsub}.  However, what really happens is that
+the @code{pat} parameter will be either one or zero, depending upon whether
+or not @code{$0} matches @code{/hi/}.
+
+As it is unlikely that you would ever really wish to pass a truth value
+in this way, @code{gawk} will issue a warning when it sees a regexp
+constant used as a parameter to a user-defined function.
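+
+One possible way around this, sketched here on the assumption that a
+dynamic regexp is acceptable for the task, is to pass the pattern as an
+ordinary string.  The @code{BEGIN} rule and the shell variable @code{out}
+are illustrative additions; @code{mysub} is the function shown above.
+
```shell
out=$(awk '
function mysub(pat, repl, str, global)
{
    if (global)
        gsub(pat, repl, str)
    else
        sub(pat, repl, str)
    return str
}
BEGIN {
    text = "hi! hi yourself!"
    # "hi" is a string used as a dynamic regexp, not a regexp constant,
    # so it reaches gsub() unchanged instead of becoming a truth value.
    print mysub("hi", "howdy", text, 1)
}')
echo "$out"
```
+
+@noindent
+Here the substitution is applied to every occurrence of @samp{hi}, and the
+changed string is obtained from the return value of @code{mysub}.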
+
+@node Variables, Conversion, Using Constant Regexps, Expressions
+@section Variables
+
+Variables are ways of storing values at one point in your program for
+use later in another part of your program.  You can manipulate them
+entirely within your program text, and you can also assign values to
+them on the @code{awk} command line.
+
+@menu
+* Using Variables::             Using variables in your programs.
+* Assignment Options::          Setting variables on the command line and a
+                                summary of command line syntax. This is an
+                                advanced method of input.
+@end menu
+
+@node Using Variables, Assignment Options, Variables, Variables
+@subsection Using Variables in a Program
+
+@cindex variables, user-defined
+@cindex user-defined variables
+Variables let you give names to values and refer to them later.  You have
+already seen variables in many of the examples.  The name of a variable
+must be a sequence of letters, digits and underscores, but it may not begin
+with a digit.  Case is significant in variable names; @code{a} and @code{A}
+are distinct variables.
+
+A variable name is a valid expression by itself; it represents the
+variable's current value.  Variables are given new values with
+@dfn{assignment operators}, @dfn{increment operators} and
+@dfn{decrement operators}.
+@xref{Assignment Ops, ,Assignment Expressions}.
+
+A few variables have special built-in meanings, such as @code{FS}, the
+field separator, and @code{NF}, the number of fields in the current
+input record.  @xref{Built-in Variables}, for a list of them.  These
+built-in variables can be used and assigned just like all other
+variables, but their values are also used or changed automatically by
+@code{awk}.  All built-in variable names are entirely upper-case.
+
+Variables in @code{awk} can be assigned either numeric or string
+values.  By default, variables are initialized to the empty string, which
+is zero if converted to a number.  There is no need to
+``initialize'' each variable explicitly in @code{awk},
+the way you would in C and in most other traditional languages.
+
+@node Assignment Options,  , Using Variables, Variables
+@subsection Assigning Variables on the Command Line
+
+You can set any @code{awk} variable by including a @dfn{variable assignment}
+among the arguments on the command line when you invoke @code{awk}
+(@pxref{Other Arguments, ,Other Command Line Arguments}).  Such an assignment has
+this form:
+
+@example
+@var{variable}=@var{text}
+@end example
+
+@noindent
+With it, you can set a variable either at the beginning of the
+@code{awk} run or in between input files.
+
+If you precede the assignment with the @samp{-v} option, like this:
+
+@example
+-v @var{variable}=@var{text}
+@end example
+
+@noindent
+then the variable is set at the very beginning, before even the
+@code{BEGIN} rules are run.  The @samp{-v} option and its assignment
+must precede all the file name arguments, as well as the program text.
+(@xref{Options, ,Command Line Options}, for more information about
+the @samp{-v} option.)
+
+Otherwise, the variable assignment is performed at a time determined by
+its position among the input file arguments: after the processing of the
+preceding input file argument.  For example:
+
+@example
+awk '@{ print $n @}' n=4 inventory-shipped n=2 BBS-list
+@end example
+
+@noindent
+prints the value of field number @code{n} for all input records.  Before
+the first file is read, the command line sets the variable @code{n}
+equal to four.  This causes the fourth field to be printed in lines from
+the file @file{inventory-shipped}.  After the first file has finished,
+but before the second file is started, @code{n} is set to two, so that the
+second field is printed in lines from @file{BBS-list}.
+
+@example
+@group
+$ awk '@{ print $n @}' n=4 inventory-shipped n=2 BBS-list
+@print{} 15
+@print{} 24
+@dots{}
+@print{} 555-5553
+@print{} 555-3412
+@dots{}
+@end group
+@end example
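+
+The difference in timing between @samp{-v} and an assignment placed among
+the file arguments can be sketched as follows.  The variable names
+@code{early} and @code{late}, the scratch file and its one-line contents
+are all hypothetical.
+
```shell
tmp=$(mktemp)                 # scratch input file with made-up data
printf 'a b\n' > "$tmp"
out=$(awk -v early=1 '
BEGIN { print "BEGIN:", early, "[" late "]" }   # late has not been assigned yet
      { print "main:", early, late }            # late was set before this file was read
' late=2 "$tmp")
rm -f "$tmp"
echo "$out"
```
+
+@noindent
+The @code{BEGIN} rule sees @code{early} but finds @code{late} still empty;
+by the time the first record is read, both are set.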
+
+Command line arguments are made available for explicit examination by
+the @code{awk} program in an array named @code{ARGV}
+(@pxref{ARGC and ARGV, ,Using @code{ARGC} and @code{ARGV}}).
+
+@cindex dark corner
+@code{awk} processes the values of command line assignments for escape
+sequences (d.c.) (@pxref{Escape Sequences}).
+
+@node Conversion, Arithmetic Ops, Variables, Expressions
+@section Conversion of Strings and Numbers
+
+@cindex conversion of strings and numbers
+Strings are converted to numbers, and numbers to strings, if the context
+of the @code{awk} program demands it.  For example, if the value of
+either @code{foo} or @code{bar} in the expression @samp{foo + bar}
+happens to be a string, it is converted to a number before the addition
+is performed.  If numeric values appear in string concatenation, they
+are converted to strings.  Consider this:
+
+@example
+two = 2; three = 3
+print (two three) + 4
+@end example
+
+@noindent
+This prints the (numeric) value 27.  The numeric values of
+the variables @code{two} and @code{three} are converted to strings and
+concatenated together, and the resulting string is converted back to the
+number 23, to which four is then added.
+
+@cindex null string
+@cindex empty string
+@cindex type conversion
+If, for some reason, you need to force a number to be converted to a
+string, concatenate the empty string, @code{""}, with that number.
+To force a string to be converted to a number, add zero to that string.
+
+A string is converted to a number by interpreting any numeric prefix
+of the string as numerals:
+@code{"2.5"} converts to 2.5, @code{"1e3"} converts to 1000, and @code{"25fix"}
+has a numeric value of 25.
+Strings that can't be interpreted as valid numbers are converted to
+zero.
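+
+The conversions described above can be tried out with a short sketch.
+The variable @code{x} and the shell variable @code{out} are illustrative
+names only.
+
```shell
out=$(awk 'BEGIN {
    two = 2; three = 3
    x = (two three) + 4     # "2" and "3" concatenate to "23", then 23 + 4
    print x
    print "2.5" + 0         # adding zero forces conversion to a number
    print "25fix" + 0       # only the numeric prefix 25 is used
    print "abc" + 0         # no numeric prefix at all
}')
echo "$out"
```
+
+@noindent
+The four lines printed should be 27, 2.5, 25 and 0.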
+
+@vindex CONVFMT
+The exact manner in which numbers are converted into strings is controlled
+by the @code{awk} built-in variable @code{CONVFMT} (@pxref{Built-in Variables}).
+Numbers are converted using the @code{sprintf} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation})
+with @code{CONVFMT} as the format
+specifier.
+
+@code{CONVFMT}'s default value is @code{"%.6g"}, which prints a value with
+at most six significant digits.  For some applications you will want to
+change it to specify more precision.  Double precision on most modern
+machines gives you 16 or 17 decimal digits of precision.
+
+Strange results can happen if you set @code{CONVFMT} to a string that doesn't
+tell @code{sprintf} how to format floating point numbers in a useful way.
+For example, if you forget the @samp{%} in the format, all numbers will be
+converted to the same constant string.
+
+@cindex dark corner
+As a special case, if a number is an integer, then the result of converting
+it to a string is @emph{always} an integer, no matter what the value of
+@code{CONVFMT} may be.  Given the following code fragment:
+
+@example
+CONVFMT = "%2.2f"
+a = 12
+b = a ""
+@end example
+
+@noindent
+@code{b} has the value @code{"12"}, not @code{"12.00"} (d.c.).
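+
+To see the special case next to the ordinary one, the fragment can be
+extended with a non-integer value.  This is only a sketch; the variables
+@code{c} and @code{d} and the shell variable @code{out} are additions made
+for illustration.
+
```shell
out=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12;   b = a ""     # integer value: CONVFMT is not applied
    c = 12.5; d = c ""     # non-integer value: converted using CONVFMT
    print b, d
}')
echo "$out"
```
+
+@noindent
+With an implementation that follows the rule above, this should print
+@samp{12 12.50}.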
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@vindex OFMT
+Prior to the POSIX standard, @code{awk} specified that the value
+of @code{OFMT} was used for converting numbers to strings.  @code{OFMT}
+specifies the output format to use when printing numbers with @code{print}.
+@code{CONVFMT} was introduced in order to separate the semantics of
+conversion from the semantics of printing.  Both @code{CONVFMT} and
+@code{OFMT} have the same default value: @code{"%.6g"}.  In the vast majority
+of cases, old @code{awk} programs will not change their behavior.
+However, this use of @code{OFMT} is something to keep in mind if you must
+port your program to other implementations of @code{awk}; we recommend
+that instead of changing your programs, you just port @code{gawk} itself!
+@xref{Print, ,The @code{print} Statement},
+for more information on the @code{print} statement.
+
+@node Arithmetic Ops, Concatenation, Conversion, Expressions
+@section Arithmetic Operators
+@cindex arithmetic operators
+@cindex operators, arithmetic
+@cindex addition
+@cindex subtraction
+@cindex multiplication
+@cindex division
+@cindex remainder
+@cindex quotient
+@cindex exponentiation
+
+The @code{awk} language uses the common arithmetic operators when
+evaluating expressions.  All of these arithmetic operators follow normal
+precedence rules, and work as you would expect them to.
+
+Here is a file @file{grades} containing a list of student names and
+three test scores per student (it's a small class):
+
+@example
+Pat   100 97 58
+Sandy  84 72 93
+Chris  72 92 89
+@end example
+
+@noindent
+This program takes the file @file{grades}, and prints the average
+of the scores.
+
+@example
+$ awk '@{ sum = $2 + $3 + $4 ; avg = sum / 3
+>        print $1, avg @}' grades
+@print{} Pat 85
+@print{} Sandy 83
+@print{} Chris 84.3333
+@end example
+
+This table lists the arithmetic operators in @code{awk}, in order from
+highest precedence to lowest:
+
+@c sigh. this seems necessary
+@iftex
+@page
+@end iftex
+@c @cartouche
+@table @code
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@item @var{x} ^ @var{y}
+@itemx @var{x} ** @var{y}
+Exponentiation: @var{x} raised to the @var{y} power.  @samp{2 ^ 3} has
+the value eight.  The character sequence @samp{**} is equivalent to
+@samp{^}.  (The POSIX standard only specifies the use of @samp{^}
+for exponentiation.)
+
+@item - @var{x}
+Negation.
+
+@item + @var{x}
+Unary plus.  The expression is converted to a number.
+
+@item @var{x} * @var{y}
+Multiplication.
+
+@item @var{x} / @var{y}
+Division.  Since all numbers in @code{awk} are
+real numbers, the result is not rounded to an integer: @samp{3 / 4}
+has the value 0.75.
+
+@item @var{x} % @var{y}
+@cindex differences between @code{gawk} and @code{awk}
+Remainder.  The quotient is rounded toward zero to an integer,
+multiplied by @var{y} and this result is subtracted from @var{x}.
+This operation is sometimes known as ``trunc-mod.''  The following
+relation always holds:
+
+@example
+b * int(a / b) + (a % b) == a
+@end example
+
+One possibly undesirable effect of this definition of remainder is that
+@code{@var{x} % @var{y}} is negative if @var{x} is negative.  Thus,
+
+@example
+-17 % 8 = -1
+@end example
+
+In other @code{awk} implementations, the signedness of the remainder
+may be machine dependent.
+@c !!! what does posix say?
+
+@item @var{x} + @var{y}
+Addition.
+
+@item @var{x} - @var{y}
+Subtraction.
+@end table
+@c @end cartouche
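+
+The remainder relation given in the table can be checked with a small
+sketch.  The variables @code{r} and @code{holds} and the shell variable
+@code{out} are illustrative, and the result shown assumes an
+implementation that truncates toward zero as described above.
+
```shell
out=$(awk 'BEGIN {
    a = -17; b = 8
    r = a % b                              # trunc-mod: sign follows a
    holds = (b * int(a / b) + r == a)      # the relation from the table
    print r, holds
}')
echo "$out"
```
+
+@noindent
+With such an implementation the remainder is @minus{}1 and the relation
+holds, so the second value printed is one.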
+
+For maximum portability, do not use the @samp{**} operator.
+
+Unary plus and minus have the same precedence,
+the multiplication operators all have the same precedence, and
+addition and subtraction have the same precedence.
+
+@node Concatenation, Assignment Ops, Arithmetic Ops, Expressions
+@section String Concatenation
+
+@cindex string operators
+@cindex operators, string
+@cindex concatenation
+There is only one string operation: concatenation.  It does not have a
+specific operator to represent it.  Instead, concatenation is performed by
+writing expressions next to one another, with no operator.  For example:
+
+@example
+@group
+$ awk '@{ print "Field number one: " $1 @}' BBS-list
+@print{} Field number one: aardvark
+@print{} Field number one: alpo-net
+@dots{}
+@end group
+@end example
+
+Without the space in the string constant after the @samp{:}, the line
+would run together.  For example:
+
+@example
+@group
+$ awk '@{ print "Field number one:" $1 @}' BBS-list
+@print{} Field number one:aardvark
+@print{} Field number one:alpo-net
+@dots{}
+@end group
+@end example
+
+Since string concatenation does not have an explicit operator, it is
+often necessary to ensure that it happens where you want it to by
+using parentheses to enclose
+the items to be concatenated.  For example, the
+following code fragment does not concatenate @code{file} and @code{name}
+as you might expect:
+
+@example
+file = "file"
+name = "name"
+print "something meaningful" > file name
+@end example
+
+@noindent
+It is necessary to use the following:
+
+@example
+print "something meaningful" > (file name)
+@end example
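+
+A runnable sketch of the parenthesized form follows.  It works in a
+scratch directory (an assumption made so that no stray files are left
+behind) and reads the output file back to show which name was used.
+
```shell
dir=$(mktemp -d)         # scratch directory for the output file
out=$(cd "$dir" && awk 'BEGIN {
    file = "file"
    name = "name"
    print "something meaningful" > (file name)   # writes to the file "filename"
    close(file name)
}' && cat filename)
rm -rf "$dir"
echo "$out"
```
+
+@noindent
+The text ends up in a file called @file{filename}, which is the
+concatenation of the two variables.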
+
+We recommend that you use parentheses around concatenation in all but the
+most common contexts (such as on the right-hand side of @samp{=}).
+
+@node Assignment Ops, Increment Ops, Concatenation, Expressions
+@section Assignment Expressions
+@cindex assignment operators
+@cindex operators, assignment
+@cindex expression, assignment
+
+An @dfn{assignment} is an expression that stores a new value into a
+variable.  For example, let's assign the value one to the variable
+@code{z}:
+
+@example
+z = 1
+@end example
+
+After this expression is executed, the variable @code{z} has the value one.
+Whatever old value @code{z} had before the assignment is forgotten.
+
+Assignments can store string values also.  For example, this would store
+the value @code{"this food is good"} in the variable @code{message}:
+
+@example
+thing = "food"
+predicate = "good"
+message = "this " thing " is " predicate
+@end example
+
+@noindent
+(This also illustrates string concatenation.)
+
+The @samp{=} sign is called an @dfn{assignment operator}.  It is the
+simplest assignment operator because the value of the right-hand
+operand is stored unchanged.
+
+@cindex side effect
+Most operators (addition, concatenation, and so on) have no effect
+except to compute a value.  If you ignore the value, you might as well
+not use the operator.  An assignment operator is different; it does
+produce a value, but even if you ignore the value, the assignment still
+makes itself felt through the alteration of the variable.  We call this
+a @dfn{side effect}.
+
+@cindex lvalue
+@cindex rvalue
+The left-hand operand of an assignment need not be a variable
+(@pxref{Variables}); it can also be a field
+(@pxref{Changing Fields, ,Changing the Contents of a Field}) or
+an array element (@pxref{Arrays, ,Arrays in @code{awk}}).
+These are all called @dfn{lvalues},
+which means they can appear on the left-hand side of an assignment operator.
+The right-hand operand may be any expression; it produces the new value
+which the assignment stores in the specified variable, field or array
+element.  (Such values are called @dfn{rvalues}.)
+
+@cindex types of variables
+It is important to note that variables do @emph{not} have permanent types.
+The type of a variable is simply the type of whatever value it happens
+to hold at the moment.  In the following program fragment, the variable
+@code{foo} has a numeric value at first, and a string value later on:
+
+@example
+foo = 1
+print foo
+foo = "bar"
+print foo
+@end example
+
+@noindent
+When the second assignment gives @code{foo} a string value, the fact that
+it previously had a numeric value is forgotten.
+
+String values that do not begin with a number (optionally preceded by
+whitespace or a sign) have a numeric value of
+zero. After executing this code, the value of @code{foo} is five:
+
+@example
+foo = "a string"
+foo = foo + 5
+@end example
+
+@noindent
+(Note that using a variable as a number and then later as a string can
+be confusing and is poor programming style.  The above examples illustrate how
+@code{awk} works, @emph{not} how you should write your own programs!)
+
+An assignment is an expression, so it has a value: the same value that
+is assigned.  Thus, @samp{z = 1} as an expression has the value one.
+One consequence of this is that you can write multiple assignments together:
+
+@example
+x = y = z = 0
+@end example
+
+@noindent
+stores the value zero in all three variables.  It does this because the
+value of @samp{z = 0}, which is zero, is stored into @code{y}, and then
+the value of @samp{y = z = 0}, which is zero, is stored into @code{x}.
+
+You can use an assignment anywhere an expression is called for.  For
+example, it is valid to write @samp{x != (y = 1)} to set @code{y} to one
+and then test whether @code{x} equals one.  But this style tends to make
+programs hard to read; except in a one-shot program, you should
+not use such nesting of assignments.
+
+Aside from @samp{=}, there are several other assignment operators that
+do arithmetic with the old value of the variable.  For example, the
+operator @samp{+=} computes a new value by adding the right-hand value
+to the old value of the variable.  Thus, the following assignment adds
+five to the value of @code{foo}:
+
+@example
+foo += 5
+@end example
+
+@noindent
+This is equivalent to the following:
+
+@example
+foo = foo + 5
+@end example
+
+@noindent
+Use whichever one makes the meaning of your program clearer.
+
+There are situations where using @samp{+=} (or any assignment operator)
+is @emph{not} the same as simply repeating the left-hand operand in the
+right-hand expression.  For example:
+
+@cindex Rankin, Pat
+@example
+@group
+# Thanks to Pat Rankin for this example
+BEGIN  @{
+    foo[rand()] += 5
+    for (x in foo)
+       print x, foo[x]
+
+    bar[rand()] = bar[rand()] + 5
+    for (x in bar)
+       print x, bar[x]
+@}
+@end group
+@end example
+
+@noindent
+The indices of @code{bar} are guaranteed to be different, because
+@code{rand} will return different values each time it is called.
+(Arrays and the @code{rand} function haven't been covered yet.
+@xref{Arrays, ,Arrays in @code{awk}},
+and see @ref{Numeric Functions, ,Numeric Built-in Functions}, for more information.)
+This example illustrates an important fact about the assignment
+operators: the left-hand expression is only evaluated @emph{once}.
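+
+One way to observe this, given here as a sketch, is to count the elements
+that each statement creates instead of printing the random indices.  The
+counters @code{nfoo} and @code{nbar} and the shell variable @code{out} are
+illustrative names.
+
```shell
out=$(awk 'BEGIN {
    foo[rand()] += 5                 # rand() is called once
    bar[rand()] = bar[rand()] + 5    # rand() is called twice
    for (x in foo) nfoo++            # count the elements of foo
    for (x in bar) nbar++            # count the elements of bar
    print nfoo, nbar
}')
echo "$out"
```
+
+@noindent
+Merely referring to @code{bar[rand()]} on the right-hand side creates an
+element, so @code{bar} should end up with two elements while @code{foo}
+has only one.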
+
+It is also up to the implementation as to which expression is evaluated
+first, the left-hand one or the right-hand one.
+Consider this example:
+
+@example
+i = 1
+a[i += 2] = i + 1
+@end example
+
+@noindent
+The value of @code{a[3]} could be either two or four.
+
+Here is a table of the arithmetic assignment operators.  In each
+case, the right-hand operand is an expression whose value is converted
+to a number.
+
+@c @cartouche
+@table @code
+@item @var{lvalue} += @var{increment}
+Adds @var{increment} to the value of @var{lvalue} to make the new value
+of @var{lvalue}.
+
+@item @var{lvalue} -= @var{decrement}
+Subtracts @var{decrement} from the value of @var{lvalue}.
+
+@item @var{lvalue} *= @var{coefficient}
+Multiplies the value of @var{lvalue} by @var{coefficient}.
+
+@item @var{lvalue} /= @var{divisor}
+Divides the value of @var{lvalue} by @var{divisor}.
+
+@item @var{lvalue} %= @var{modulus}
+Sets @var{lvalue} to its remainder by @var{modulus}.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@item @var{lvalue} ^= @var{power}
+@itemx @var{lvalue} **= @var{power}
+Raises @var{lvalue} to the power @var{power}.
+(Only the @samp{^=} operator is specified by POSIX.)
+@end table
+@c @end cartouche
+
+For maximum portability, do not use the @samp{**=} operator.
+
+@node Increment Ops, Truth Values, Assignment Ops, Expressions
+@section Increment and Decrement Operators
+
+@cindex increment operators
+@cindex operators, increment
+@dfn{Increment} and @dfn{decrement operators} increase or decrease the value of
+a variable by one.  You could do the same thing with an assignment operator, so
+the increment operators add no power to the @code{awk} language; but they
+are convenient abbreviations for very common operations.
+
+The operator to add one is written @samp{++}.  It can be used to increment
+a variable either before or after taking its value.
+
+To pre-increment a variable @var{v}, write @samp{++@var{v}}.  This adds
+one to the value of @var{v} and that new value is also the value of this
+expression.  The assignment expression @samp{@var{v} += 1} is completely
+equivalent.
+
+Writing the @samp{++} after the variable specifies post-increment.  This
+increments the variable value just the same; the difference is that the
+value of the increment expression itself is the variable's @emph{old}
+value.  Thus, if @code{foo} has the value four, then the expression @samp{foo++}
+has the value four, but it changes the value of @code{foo} to five.
+
+The post-increment @samp{foo++} is nearly equivalent to writing @samp{(foo
++= 1) - 1}.  It is not perfectly equivalent because all numbers in
+@code{awk} are floating point: in floating point, @samp{foo + 1 - 1} does
+not necessarily equal @code{foo}.  But the difference is minute as
+long as you stick to numbers that are fairly small (less than 10e12).
+
+Any lvalue can be incremented.  Fields and array elements are incremented
+just like variables.  (Use @samp{$(i++)} when you wish to do a field reference
+and a variable increment at the same time.  The parentheses are necessary
+because of the precedence of the field reference operator, @samp{$}.)
+
+@cindex decrement operators
+@cindex operators, decrement
+The decrement operator @samp{--} works just like @samp{++} except that
+it subtracts one instead of adding.  Like @samp{++}, it can be used before
+the lvalue to pre-decrement or after it to post-decrement.
+
+Here is a summary of increment and decrement expressions.
+
+@c @cartouche
+@table @code
+@item ++@var{lvalue}
+This expression increments @var{lvalue} and the new value becomes the
+value of the expression.
+
+@item @var{lvalue}++
+This expression increments @var{lvalue}, but
+the value of the expression is the @emph{old} value of @var{lvalue}.
+
+@item --@var{lvalue}
+Like @samp{++@var{lvalue}}, but instead of adding, it subtracts.  It
+decrements @var{lvalue} and delivers the value that results.
+
+@item @var{lvalue}--
+Like @samp{@var{lvalue}++}, but instead of adding, it subtracts.  It
+decrements @var{lvalue}.  The value of the expression is the @emph{old}
+value of @var{lvalue}.
+@end table
+@c @end cartouche
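+
+The difference between the prefix and postfix forms can be sketched in a
+few lines.  The variables @code{old} and @code{new} and the shell variable
+@code{out} are illustrative.
+
```shell
out=$(awk 'BEGIN {
    foo = 4
    old = foo++     # post-increment: old gets 4, foo becomes 5
    new = ++foo     # pre-increment: foo becomes 6, new gets 6
    print old, new, foo
}')
echo "$out"
```
+
+@noindent
+This prints @samp{4 6 6}: the postfix form yields the old value, the
+prefix form the new one.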
+
+@node Truth Values, Typing and Comparison, Increment Ops, Expressions
+@section True and False in @code{awk}
+@cindex truth values
+@cindex logical true
+@cindex logical false
+
+Many programming languages have a special representation for the concepts
+of ``true'' and ``false.''  Such languages usually use the special
+constants @code{true} and @code{false}, or perhaps their upper-case
+equivalents.
+
+@cindex null string
+@cindex empty string
+@code{awk} is different.  It borrows a very simple concept of true and
+false from C.  In @code{awk}, any non-zero numeric value, @emph{or} any
+non-empty string value is true.  Any other value (zero or the null
+string, @code{""}) is false.  The following program will print @samp{A strange
+truth value} three times:
+
+@example
+BEGIN @{
+   if (3.1415927)
+       print "A strange truth value"
+   if ("Four Score And Seven Years Ago")
+       print "A strange truth value"
+   if (j = 57)
+       print "A strange truth value"
+@}
+@end example
+
+@cindex dark corner
+There is a surprising consequence of the ``non-zero or non-null'' rule:
+The string constant @code{"0"} is actually true, since it is non-null (d.c.).
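+
+The following sketch contrasts the three cases, using the conditional
+operator to turn each truth value into a word.  The variables @code{s},
+@code{n} and @code{e} and the shell variable @code{out} are illustrative.
+
```shell
out=$(awk 'BEGIN {
    s = "0" ? "true" : "false"    # non-empty string constant
    n = 0   ? "true" : "false"    # numeric zero
    e = ""  ? "true" : "false"    # empty string
    print s, n, e
}')
echo "$out"
```
+
+@noindent
+Under the rule above, only the string constant @code{"0"} counts as true.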
+
+@node Typing and Comparison, Boolean Ops, Truth Values, Expressions
+@section Variable Typing and Comparison Expressions
+@cindex comparison expressions
+@cindex expression, comparison
+@cindex expression, matching
+@cindex relational operators
+@cindex operators, relational
+@cindex regexp match/non-match operators
+@cindex variable typing
+@cindex types of variables
+
+@c 2e: consider splitting this section into subsections
+
+Unlike variables in many other programming languages, @code{awk} variables
+do not have a fixed type. Instead, they can be either a number or a string,
+depending upon the value that is assigned to them.
+
+@cindex numeric string
+The 1992 POSIX standard introduced
+the concept of a @dfn{numeric string}, which is simply a string that looks
+like a number, for example, @code{@w{" +2"}}.  This concept is used
+for determining the type of a variable.
+
+The type of the variable is important, since the types of two variables
+determine how they are compared.
+
+In @code{gawk}, variable typing follows these rules.
+
+@enumerate 1
+@item
+A numeric literal or the result of a numeric operation has the @var{numeric}
+attribute.
+
+@item
+A string literal or the result of a string operation has the @var{string}
+attribute.
+
+@item
+Fields, @code{getline} input, @code{FILENAME}, @code{ARGV} elements,
+@code{ENVIRON} elements and the
+elements of an array created by @code{split} that are numeric strings
+have the @var{strnum} attribute.  Otherwise, they have the @var{string}
+attribute.
+Uninitialized variables also have the @var{strnum} attribute.
+
+@item
+Attributes propagate across assignments, but are not changed by
+any use.
+@c  (Although a use may cause the entity to acquire an additional
+@c value such that it has both a numeric and string value -- this leaves the
+@c attribute unchanged.)
+@c This is important but not relevant
+@end enumerate
+
+The last rule is particularly important. In the following program,
+@code{a} has numeric type, even though it is later used in a string
+operation.
+
+@example
+BEGIN @{
+         a = 12.345
+         b = a " is a cute number"
+         print b
+@}
+@end example
+
+When two operands are compared, either string comparison or numeric comparison
+may be used, depending on the attributes of the operands, according to the
+following, symmetric, matrix:
+
+@c thanks to Karl Berry, kb@cs.umb.edu, for major help with TeX tables
+@tex
+\centerline{
+\vbox{\bigskip % space above the table (about 1 linespace)
+% Because we have vertical rules, we can't let TeX insert interline space
+% in its usual way.
+\offinterlineskip
+%
+% Define the table template. & separates columns, and \cr ends the
+% template (and each row). # is replaced by the text of that entry on
+% each row. The template for the first column breaks down like this:
+%   \strut -- a way to make each line have the height and depth
+%             of a normal line of type, since we turned off interline spacing.
+%   \hfil -- infinite glue; has the effect of right-justifying in this case.
+%   #     -- replaced by the text (for instance, `STRNUM', in the last row).
+%   \quad -- about the width of an `M'. Just separates the columns.
+% 
+% The second column (\vrule#) is what generates the vertical rule that
+% spans table rows.
+% 
+% The doubled && before the next entry means `repeat the following
+% template as many times as necessary on each line' -- in our case, twice.
+% 
+% The template itself, \quad#\hfil, left-justifies with a little space before.
+% 
+\halign{\strut\hfil#\quad&\vrule#&&\quad#\hfil\cr
+	&&STRING	&NUMERIC	&STRNUM\cr
+% The \omit tells TeX to skip inserting the template for this column on
+% this particular row. In this case, we only want a little extra space
+% to separate the heading row from the rule below it.  the depth 2pt --
+% `\vrule depth 2pt' is that little space.
+\omit	&depth 2pt\cr
+% This is the horizontal rule below the heading. Since it has nothing to
+% do with the columns of the table, we use \noalign to get it in there.
+\noalign{\hrule}
+% Like above, this time a little more space.
+\omit	&depth 4pt\cr
+% The remaining rows have nothing special about them.
+STRING	&&string	&string		&string\cr
+NUMERIC	&&string	&numeric	&numeric\cr
+STRNUM  &&string	&numeric	&numeric\cr
+}}}
+@end tex
+@ifinfo
+@display
+	+----------------------------------------------
+	|	STRING		NUMERIC		STRNUM
+--------+----------------------------------------------
+	|
+STRING	|	string		string		string
+	|
+NUMERIC	|	string		numeric		numeric
+	|
+STRNUM	|	string		numeric		numeric
+--------+----------------------------------------------
+@end display
+@end ifinfo
+
+The basic idea is that user input that looks numeric, and @emph{only}
+user input, should be treated as numeric, even though it is actually
+made of characters, and is therefore also a string.
+
+@dfn{Comparison expressions} compare strings or numbers for
+relationships such as equality.  They are written using @dfn{relational
+operators}, which are a superset of those in C.  Here is a table of
+them:
+
+@cindex relational operators
+@cindex operators, relational
+@cindex @code{<} operator
+@cindex @code{<=} operator
+@cindex @code{>} operator
+@cindex @code{>=} operator
+@cindex @code{==} operator
+@cindex @code{!=} operator
+@cindex @code{~} operator
+@cindex @code{!~} operator
+@cindex @code{in} operator
+@c @cartouche
+@table @code
+@item @var{x} < @var{y}
+True if @var{x} is less than @var{y}.
+
+@item @var{x} <= @var{y}
+True if @var{x} is less than or equal to @var{y}.
+
+@item @var{x} > @var{y}
+True if @var{x} is greater than @var{y}.
+
+@item @var{x} >= @var{y}
+True if @var{x} is greater than or equal to @var{y}.
+
+@item @var{x} == @var{y}
+True if @var{x} is equal to @var{y}.
+
+@item @var{x} != @var{y}
+True if @var{x} is not equal to @var{y}.
+
+@item @var{x} ~ @var{y}
+True if the string @var{x} matches the regexp denoted by @var{y}.
+
+@item @var{x} !~ @var{y}
+True if the string @var{x} does not match the regexp denoted by @var{y}.
+
+@item @var{subscript} in @var{array}
+True if the array @var{array} has an element with the subscript @var{subscript}.
+@end table
+@c @end cartouche
+
+Comparison expressions have the value one if true and zero if false.
+
+When comparing operands of mixed types, numeric operands are converted
+to strings using the value of @code{CONVFMT}
+(@pxref{Conversion, ,Conversion of Strings and Numbers}).
+
+Strings are compared
+by comparing the first character of each, then the second character of each,
+and so on.  Thus @code{"10"} is less than @code{"9"}.  If there are two
+strings where one is a prefix of the other, the shorter string is less than
+the longer one.  Thus @code{"abc"} is less than @code{"abcd"}.
+
+@cindex common mistakes
+@cindex mistakes, common
+@cindex errors, common
+It is very easy to accidentally mistype the @samp{==} operator, and
+leave off one of the @samp{=}s.  The result is still valid @code{awk}
+code, but the program will not do what you mean:
+
+@example
+if (a = b)   # oops! should be a == b
+   @dots{}
+else
+   @dots{}
+@end example
+
+@noindent
+Unless @code{b} happens to be zero or the null string, the @code{if}
+part of the test will always succeed.  Because the operators are
+so similar, this kind of error is very difficult to spot when
+scanning the source code.
+
+Here are some sample expressions, how @code{gawk} compares them, and what
+the result of the comparison is.
+
+@table @code
+@item 1.5 <= 2.0
+numeric comparison (true)
+
+@item "abc" >= "xyz"
+string comparison (false)
+
+@item 1.5 != " +2"
+string comparison (true)
+
+@item "1e2" < "3"
+string comparison (true)
+
+@item a = 2; b = "2"
+@itemx a == b
+string comparison (true)
+
+@item a = 2; b = " +2"
+@itemx a == b
+string comparison (false)
+@end table
+
+In this example,
+
+@example
+@group
+$ echo 1e2 3 | awk '@{ print ($1 < $2) ? "true" : "false" @}'
+@print{} false
+@end group
+@end example
+
+@noindent
+the result is @samp{false} since both @code{$1} and @code{$2} are numeric
+strings and thus both have the @var{strnum} attribute,
+dictating a numeric comparison.
+
+The purpose of the comparison rules and the use of numeric strings is
+to attempt to produce the behavior that is ``least surprising,'' while
+still ``doing the right thing.''
+
+@cindex comparisons, string vs. regexp
+@cindex string comparison vs. regexp comparison
+@cindex regexp comparison vs. string comparison
+String comparisons and regular expression comparisons are very different.
+For example,
+
+@example
+x == "foo"
+@end example
+
+@noindent
+has the value of one, or is true, if the variable @code{x}
+is precisely @samp{foo}.  By contrast, 
+
+@example
+x ~ /foo/
+@end example
+
+@noindent
+has the value one if @code{x} contains @samp{foo}, such as
+@code{"Oh, what a fool am I!"}.
+
+The right hand operand of the @samp{~} and @samp{!~} operators may be
+either a regexp constant (@code{/@dots{}/}), or an ordinary
+expression, in which case the value of the expression as a string is used as a
+dynamic regexp (@pxref{Regexp Usage, ,How to Use Regular Expressions}; also
+@pxref{Computed Regexps, ,Using Dynamic Regexps}).
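+
+As a small sketch of the second case, the variable @code{re} below (a
+name invented for this illustration) holds a string that is used as a
+dynamic regexp:
+
+@example
+@group
+$ echo 'Oh, what a fool am I!' |
+> awk '@{ re = "fo+l"; if ($0 ~ re) print "match" @}'
+@print{} match
+@end group
+@end example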
+
+@cindex regexp as expression
+In recent implementations of @code{awk}, a constant regular
+expression in slashes by itself is also an expression.  The regexp
+@code{/@var{regexp}/} is an abbreviation for this comparison expression:
+
+@example
+$0 ~ /@var{regexp}/
+@end example
+
+One special place where @code{/foo/} is @emph{not} an abbreviation for
+@samp{$0 ~ /foo/} is when it is the right-hand operand of @samp{~} or
+@samp{!~}!
+@xref{Using Constant Regexps, ,Using Regular Expression Constants},
+where this is discussed in more detail.
+
+@c This paragraph has been here since day 1, and has always bothered
+@c me, especially since the expression doesn't really make a lot of
+@c sense. So, just take it out.
+@ignore
+In some contexts it may be necessary to write parentheses around the
+regexp to avoid confusing the @code{gawk} parser.  For example,
+@samp{(/x/ - /y/) > threshold} is not allowed, but @samp{((/x/) - (/y/))
+> threshold} parses properly.
+@end ignore
+
+@node Boolean Ops, Conditional Exp, Typing and Comparison, Expressions
+@section Boolean Expressions
+@cindex expression, boolean
+@cindex boolean expressions
+@cindex operators, boolean
+@cindex boolean operators
+@cindex logical operations
+@cindex operations, logical
+@cindex short-circuit operators
+@cindex operators, short-circuit
+@cindex and operator
+@cindex or operator
+@cindex not operator
+@cindex @code{&&} operator
+@cindex @code{||} operator
+@cindex @code{!} operator
+
+A @dfn{boolean expression} is a combination of comparison expressions or
+matching expressions, using the boolean operators ``or''
+(@samp{||}), ``and'' (@samp{&&}), and ``not'' (@samp{!}), along with
+parentheses to control nesting.  The truth value of the boolean expression is
+computed by combining the truth values of the component expressions.
+Boolean expressions are also referred to as @dfn{logical expressions}.
+The terms are equivalent.
+
+Boolean expressions can be used wherever comparison and matching
+expressions can be used.  They can be used in @code{if}, @code{while},
+@code{do} and @code{for} statements
+(@pxref{Statements, ,Control Statements in Actions}).
+They have numeric values (one if true, zero if false), which come into play
+if the result of the boolean expression is stored in a variable, or
+used in arithmetic.
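+
+For instance, in this sketch the result of a boolean expression is
+saved in a variable (@code{both} is just an illustrative name) and then
+used in arithmetic:
+
+@example
+@group
+$ echo "fooey 2400" |
+> awk '@{ both = ($0 ~ /2400/ && $0 ~ /foo/); print both, both + 1 @}'
+@print{} 1 2
+@end group
+@end example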
+
+In addition, every boolean expression is also a valid pattern, so
+you can use one as a pattern to control the execution of rules.
+
+Here are descriptions of the three boolean operators, with examples.
+
+@c @cartouche
+@table @code
+@item @var{boolean1} && @var{boolean2}
+True if both @var{boolean1} and @var{boolean2} are true.  For example,
+the following statement prints the current input record if it contains
+both @samp{2400} and @samp{foo}.
+
+@example
+if ($0 ~ /2400/ && $0 ~ /foo/) print
+@end example
+
+The subexpression @var{boolean2} is evaluated only if @var{boolean1}
+is true.  This can make a difference when @var{boolean2} contains
+expressions that have side effects: in the case of @samp{$0 ~ /foo/ &&
+($2 == bar++)}, the variable @code{bar} is not incremented if there is
+no @samp{foo} in the record.
+
+@item @var{boolean1} || @var{boolean2}
+True if at least one of @var{boolean1} or @var{boolean2} is true.
+For example, the following statement prints all records in the input
+that contain @emph{either} @samp{2400} or
+@samp{foo}, or both.
+
+@example
+if ($0 ~ /2400/ || $0 ~ /foo/) print
+@end example
+
+The subexpression @var{boolean2} is evaluated only if @var{boolean1}
+is false.  This can make a difference when @var{boolean2} contains
+expressions that have side effects.
+
+@item ! @var{boolean}
+True if @var{boolean} is false.  For example, the following program prints
+all records in the input file @file{BBS-list} that do @emph{not} contain the
+string @samp{foo}.
+
+@c A better example would be `if (! (subscript in array)) ...' but we
+@c haven't done anything with arrays or `in' yet. Sigh.
+@example
+awk '@{ if (! ($0 ~ /foo/)) print @}' BBS-list
+@end example
+@end table
+@c @end cartouche
+
+The @samp{&&} and @samp{||} operators are called @dfn{short-circuit}
+operators because of the way they work.  Evaluation of the full expression
+is ``short-circuited'' if the result can be determined part way through
+its evaluation.
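+
+The following sketch, with an illustrative counter @code{n}, shows the
+short-circuit at work.  Neither increment is ever executed, because in
+each case the left-hand operand already decides the result:
+
+@example
+@group
+$ awk 'BEGIN @{ n = 0
+>              if (0 && n++) print "not reached"
+>              if (1 || n++) print "n is still", n @}'
+@print{} n is still 0
+@end group
+@end example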
+
+@cindex line continuation
+You can continue a statement that uses @samp{&&} or @samp{||} simply
+by putting a newline after them.  But you cannot put a newline in front
+of either of these operators without using backslash continuation
+(@pxref{Statements/Lines, ,@code{awk} Statements Versus Lines}).
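+
+For example, assuming the sample file @file{BBS-list} used throughout
+this @value{DOCUMENT}, a pattern can be split after the @samp{&&}
+without any backslash:
+
+@example
+@group
+$ awk '/2400/ &&
+>      /foo/' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@end group
+@end example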
+
+The actual value of an expression using the @samp{!} operator will be
+either one or zero, depending upon the truth value of the expression it
+is applied to.
+
+The @samp{!} operator is often useful for changing the sense of a flag
+variable from false to true and back again. For example, the following
+program is one way to print lines in between special bracketing lines:
+
+@example
+$1 == "START"   @{ interested = ! interested @}
+interested == 1 @{ print @}
+$1 == "END"     @{ interested = ! interested @}
+@end example
+
+@noindent
+The variable @code{interested}, like all @code{awk} variables, starts
+out initialized to zero, which is also false.  When a line is seen whose
+first field is @samp{START}, the value of @code{interested} is toggled
+to true, using @samp{!}. The next rule prints lines as long as
+@code{interested} is true.  When a line is seen whose first field is
+@samp{END}, @code{interested} is toggled back to false.
+@ignore
+We should discuss using `next' in the two rules that toggle the
+variable, to avoid printing the bracketing lines, but that's more
+distraction than really needed.
+@end ignore
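+
+Here is one way to try the program, using a few made-up input lines.
+Note that, as written, the bracketing lines themselves are printed
+too:
+
+@example
+@group
+$ printf 'a\nSTART\nb\nEND\nc\n' | awk '
+> $1 == "START"   @{ interested = ! interested @}
+> interested == 1 @{ print @}
+> $1 == "END"     @{ interested = ! interested @}'
+@print{} START
+@print{} b
+@print{} END
+@end group
+@end example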
+
+@node Conditional Exp, Function Calls, Boolean Ops, Expressions
+@section Conditional Expressions
+@cindex conditional expression
+@cindex expression, conditional
+
+A @dfn{conditional expression} is a special kind of expression with
+three operands.  It allows you to use one expression's value to select
+one of two other expressions.
+
+The conditional expression is the same as in the C language:
+
+@example
+@var{selector} ? @var{if-true-exp} : @var{if-false-exp}
+@end example
+
+@noindent
+There are three subexpressions.  The first, @var{selector}, is always
+computed first.  If it is ``true'' (not zero and not null) then
+@var{if-true-exp} is computed next and its value becomes the value of
+the whole expression.  Otherwise, @var{if-false-exp} is computed next
+and its value becomes the value of the whole expression.
+
+For example, this expression produces the absolute value of @code{x}:
+
+@example
+x > 0 ? x : -x
+@end example
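+
+As a short sketch with an illustrative value, the same expression can
+be tried directly from the command line:
+
+@example
+@group
+$ awk 'BEGIN @{ x = -3; print (x > 0 ? x : -x) @}'
+@print{} 3
+@end group
+@end example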
+
+Each time the conditional expression is computed, exactly one of
+@var{if-true-exp} and @var{if-false-exp} is computed; the other is ignored.
+This is important when the expressions contain side effects.  For example,
+this conditional expression examines element @code{i} of either array
+@code{a} or array @code{b}, and increments @code{i}.
+
+@example
+x == y ? a[i++] : b[i++]
+@end example
+
+@noindent
+This is guaranteed to increment @code{i} exactly once, because each time
+only one of the two increment expressions is executed,
+and the other is not.
+@xref{Arrays, ,Arrays in @code{awk}}, 
+for more information about arrays.
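+
+The following sketch, with made-up array contents, shows that only the
+selected branch is evaluated: @code{i} ends up as two, not three.
+
+@example
+@group
+$ awk 'BEGIN @{ a[1] = "A"; b[1] = "B"; i = 1; x = 1; y = 2
+>              v = (x == y ? a[i++] : b[i++]); print v, i @}'
+@print{} B 2
+@end group
+@end example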
+
+@cindex differences between @code{gawk} and @code{awk}
+@cindex line continuation
+As a minor @code{gawk} extension,
+you can continue a statement that uses @samp{?:} simply
+by putting a newline after either character.
+However, you cannot put a newline in front
+of either character without using backslash continuation
+(@pxref{Statements/Lines, ,@code{awk} Statements Versus Lines}).
+
+@node Function Calls, Precedence, Conditional Exp, Expressions
+@section Function Calls
+@cindex function call
+@cindex calling a function
+
+A @dfn{function} is a name for a particular calculation.  Because it has
+a name, you can ask for it by name at any point in the program.  For
+example, the function @code{sqrt} computes the square root of a number.
+
+A fixed set of functions are @dfn{built-in}, which means they are
+available in every @code{awk} program.  The @code{sqrt} function is one
+of these.  @xref{Built-in, ,Built-in Functions}, for a list of built-in
+functions and their descriptions.  In addition, you can define your own
+functions for use in your program.
+@xref{User-defined, ,User-defined Functions}, for how to do this.
+
+@cindex arguments in function call
+The way to use a function is with a @dfn{function call} expression,
+which consists of the function name followed immediately by a list of
+@dfn{arguments} in parentheses.  The arguments are expressions which
+provide the raw materials for the function's calculations.
+When there is more than one argument, they are separated by commas.  If
+there are no arguments, write just @samp{()} after the function name.
+Here are some examples:
+
+@example
+sqrt(x^2 + y^2)        @i{one argument}
+atan2(y, x)            @i{two arguments}
+rand()                 @i{no arguments}
+@end example
+
+@strong{Do not put any space between the function name and the
+open-parenthesis!}  A user-defined function name looks just like the name of
+a variable, and space would make the expression look like concatenation
+of a variable with an expression inside parentheses.  Space before the
+parenthesis is harmless with built-in functions, but to avoid mistakes
+with user-defined functions, it is best not to get into the habit of
+using space there.
+
+Each function expects a particular number of arguments.  For example, the
+@code{sqrt} function must be called with a single argument, the number
+to take the square root of:
+
+@example
+sqrt(@var{argument})
+@end example
+
+Some of the built-in functions allow you to omit the final argument.
+If you do so, they use a reasonable default.
+@xref{Built-in, ,Built-in Functions}, for full details.  If arguments
+are omitted in calls to user-defined functions, then those arguments are
+treated as local variables, initialized to the empty string
+(@pxref{User-defined, ,User-defined Functions}).
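+
+As a sketch of the last point, the hypothetical function @code{greet}
+below is declared with two parameters but called with one, so
+@code{suffix} starts out as the empty string:
+
+@example
+@group
+$ awk 'function greet(name, suffix) @{ return "hello " name suffix @}
+>      BEGIN @{ print greet("world") @}'
+@print{} hello world
+@end group
+@end example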
+
+Like every other expression, the function call has a value, which is
+computed by the function based on the arguments you give it.  In this
+example, the value of @samp{sqrt(@var{argument})} is the square root of
+@var{argument}.  A function can also have side effects, such as assigning
+values to certain variables or doing I/O.
+
+Here is a command to read numbers, one number per line, and print the
+square root of each one:
+
+@example
+@group
+$ awk '@{ print "The square root of", $1, "is", sqrt($1) @}'
+1
+@print{} The square root of 1 is 1
+3
+@print{} The square root of 3 is 1.73205
+5
+@print{} The square root of 5 is 2.23607
+@kbd{Control-d}
+@end group
+@end example
+
+@node Precedence,  , Function Calls, Expressions
+@section Operator Precedence (How Operators Nest)
+@cindex precedence
+@cindex operator precedence
+
+@dfn{Operator precedence} determines how operators are grouped, when
+different operators appear close by in one expression.  For example,
+@samp{*} has higher precedence than @samp{+}; thus, @samp{a + b * c}
+means to multiply @code{b} and @code{c}, and then add @code{a} to the
+product (i.e.@: @samp{a + (b * c)}).
+
+You can overrule the precedence of the operators by using parentheses.
+You can think of the precedence rules as saying where the
+parentheses are assumed to be if you do not write parentheses yourself.  In
+fact, it is wise to always use parentheses whenever you have an unusual
+combination of operators, because other people who read the program may
+not remember what the precedence is in this case.  You might forget,
+too; then you could make a mistake.  Explicit parentheses will help prevent
+any such mistake.
+
+When operators of equal precedence are used together, the leftmost
+operator groups first, except for the assignment, conditional and
+exponentiation operators, which group in the opposite order.
+Thus, @samp{a - b + c} groups as @samp{(a - b) + c}, and
+@samp{a = b = c} groups as @samp{a = (b = c)}.
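+
+A quick way to see both grouping directions, using arbitrary numbers:
+the exponentiation groups as @samp{2 ^ (3 ^ 2)}, while the subtraction
+and addition group as @samp{(10 - 4) + 3}.
+
+@example
+@group
+$ awk 'BEGIN @{ print 2 ^ 3 ^ 2, 10 - 4 + 3 @}'
+@print{} 512 9
+@end group
+@end example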
+
+The precedence of prefix unary operators does not matter as long as only
+unary operators are involved, because there is only one way to interpret
+them---innermost first.  Thus, @samp{$++i} means @samp{$(++i)} and
+@samp{++$x} means @samp{++($x)}.  However, when another operator follows
+the operand, then the precedence of the unary operators can matter.
+Thus, @samp{$x^2} means @samp{($x)^2}, but @samp{-x^2} means
+@samp{-(x^2)}, because @samp{-} has lower precedence than @samp{^}
+while @samp{$} has higher precedence.
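+
+The difference can be observed with a one-line sketch (the input value
+is arbitrary):
+
+@example
+@group
+$ echo 3 | awk '@{ x = $1; print -x^2, $1^2 @}'
+@print{} -9 9
+@end group
+@end example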
+
+Here is a table of @code{awk}'s operators, in order from highest
+precedence to lowest:
+
+@c use @code in the items, looks better in TeX w/o all the quotes
+@table @code
+@item (@dots{})
+Grouping.
+
+@item $
+Field.
+
+@item ++ --
+Increment, decrement.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@item ^ **
+Exponentiation.  These operators group right-to-left.
+(The @samp{**} operator is not specified by POSIX.)
+
+@item + - !
+Unary plus, minus, logical ``not''.
+
+@item * / %
+Multiplication, division, modulus.
+
+@item + -
+Addition, subtraction.
+
+@item @r{Concatenation}
+No special token is used to indicate concatenation.
+The operands are simply written side by side.
+
+@item < <= == !=
+@itemx > >= >> |
+Relational, and redirection.
+The relational operators and the redirections have the same precedence
+level.  Characters such as @samp{>} serve both as relationals and as
+redirections; the context distinguishes between the two meanings.
+
+Note that the I/O redirection operators in @code{print} and @code{printf}
+statements belong to the statement level, not to expressions.  The
+redirection does not produce an expression which could be the operand of
+another operator.  As a result, it does not make sense to use a
+redirection operator near another operator of lower precedence, without
+parentheses.  Such combinations, for example @samp{print foo > a ? b : c},
+result in syntax errors.
+The correct way to write this statement is @samp{print foo > (a ? b : c)}.
+
+@item ~ !~
+Matching, non-matching.
+
+@item in
+Array membership.
+
+@item &&
+Logical ``and''.
+
+@item ||
+Logical ``or''.
+
+@item ?:
+Conditional.  This operator groups right-to-left.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@item = += -= *=
+@itemx /= %= ^= **=
+Assignment.  These operators group right-to-left.
+(The @samp{**=} operator is not specified by POSIX.)
+@end table
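+
+As an illustration of the parenthesized redirection described above,
+this sketch sends the first field of odd-numbered and even-numbered
+records to two different files.  The output file names are invented
+for the example:
+
+@example
+awk '@{ print $1 > (NR % 2 ? "odd.out" : "even.out") @}' BBS-list
+@end example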
+
+@node Patterns and Actions, Statements, Expressions, Top
+@chapter Patterns and Actions
+@cindex pattern, definition of
+
+As you have already seen, each @code{awk} rule consists of
+a pattern with an associated action.  This chapter describes how
+you build patterns and actions.
+
+@menu
+* Pattern Overview::            What goes into a pattern.
+* Action Overview::             What goes into an action.
+@end menu
+
+@node Pattern Overview, Action Overview, Patterns and Actions, Patterns and Actions
+@section Pattern Elements
+
+Patterns in @code{awk} control the execution of rules: a rule is
+executed when its pattern matches the current input record.  This
+section explains all about how to write patterns.
+
+@menu
+* Kinds of Patterns::           A list of all kinds of patterns.
+* Regexp Patterns::             Using regexps as patterns.
+* Expression Patterns::         Any expression can be used as a pattern.
+* Ranges::                      Pairs of patterns specify record ranges.
+* BEGIN/END::                   Specifying initialization and cleanup rules.
+* Empty::                       The empty pattern, which matches every record.
+@end menu
+
+@node Kinds of Patterns, Regexp Patterns, Pattern Overview, Pattern Overview
+@subsection Kinds of Patterns
+@cindex patterns, types of
+
+Here is a summary of the types of patterns supported in @code{awk}.
+
+@table @code
+@item /@var{regular expression}/
+A regular expression as a pattern.  It matches when the text of the
+input record fits the regular expression.
+(@xref{Regexp, ,Regular Expressions}.)
+
+@item @var{expression}
+A single expression.  It matches when its value
+is non-zero (if a number) or non-null (if a string).
+(@xref{Expression Patterns, ,Expressions as Patterns}.)
+
+@item @var{pat1}, @var{pat2}
+A pair of patterns separated by a comma, specifying a range of records.
+The range includes both the initial record that matches @var{pat1}, and
+the final record that matches @var{pat2}.
+(@xref{Ranges, ,Specifying Record Ranges with Patterns}.)
+
+@item BEGIN
+@itemx END
+Special patterns for you to supply start-up or clean-up actions for your
+@code{awk} program.
+(@xref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}.)
+
+@item @var{empty}
+The empty pattern matches every input record.
+(@xref{Empty, ,The Empty Pattern}.)
+@end table
+
+@node Regexp Patterns, Expression Patterns, Kinds of Patterns, Pattern Overview
+@subsection Regular Expressions as Patterns
+
+We have been using regular expressions as patterns since our early examples.
+This kind of pattern is simply a regexp constant in the pattern part of
+a rule.  Its meaning is @samp{$0 ~ /@var{pattern}/}.
+The pattern matches when the input record matches the regexp.
+For example:
+
+@example
+/foo|bar|baz/  @{ buzzwords++ @}
+END            @{ print buzzwords, "buzzwords seen" @}
+@end example
+
+@node Expression Patterns, Ranges, Regexp Patterns, Pattern Overview
+@subsection Expressions as Patterns
+
+Any @code{awk} expression is valid as an @code{awk} pattern.
+Then the pattern matches if the expression's value is non-zero (if a
+number) or non-null (if a string).
+
+The expression is reevaluated each time the rule is tested against a new
+input record.  If the expression uses fields such as @code{$1}, the
+value depends directly on the new input record's text; otherwise, it
+depends only on what has happened so far in the execution of the
+@code{awk} program, but that may still be useful.
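+
+As a sketch of a pattern that does not look at the fields at all, the
+expression below depends only on how many records have been seen so
+far (@code{seen} is an illustrative variable name), so it selects the
+first two records:
+
+@example
+@group
+$ printf 'a\nb\nc\nd\n' | awk 'seen++ < 2'
+@print{} a
+@print{} b
+@end group
+@end example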
+
+A very common kind of expression used as a pattern is the comparison
+expression, using the comparison operators described in
+@ref{Typing and Comparison, ,Variable Typing and Comparison Expressions}.
+
+Regexp matching and non-matching are also very common expressions.
+The left operand of the @samp{~} and @samp{!~} operators is a string.
+The right operand is either a constant regular expression enclosed in
+slashes (@code{/@var{regexp}/}), or any expression, whose string value
+is used as a dynamic regular expression
+(@pxref{Computed Regexps, , Using Dynamic Regexps}).
+
+The following example prints the second field of each input record
+whose first field is precisely @samp{foo}.
+
+@example
+$ awk '$1 == "foo" @{ print $2 @}' BBS-list
+@end example
+
+@noindent
+(There is no output, since there is no BBS site named ``foo''.)
+Contrast this with the following regular expression match, which would
+accept any record with a first field that contains @samp{foo}:
+
+@example
+@group
+$ awk '$1 ~ /foo/ @{ print $2 @}' BBS-list
+@print{} 555-1234
+@print{} 555-6699
+@print{} 555-6480
+@print{} 555-2127
+@end group
+@end example
+
+Boolean expressions are also commonly used as patterns.
+Whether the pattern
+matches an input record depends on whether its subexpressions match.
+
+For example, the following command prints all records in
+@file{BBS-list} that contain both @samp{2400} and @samp{foo}.
+
+@example
+$ awk '/2400/ && /foo/' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@end example
+
+The following command prints all records in
+@file{BBS-list} that contain @emph{either} @samp{2400} or @samp{foo}, or
+both.
+
+@example
+@group
+$ awk '/2400/ || /foo/' BBS-list
+@print{} alpo-net     555-3412     2400/1200/300     A
+@print{} bites        555-1675     2400/1200/300     A
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sdace        555-3430     2400/1200/300     A
+@print{} sabafoo      555-2127     1200/300          C
+@end group
+@end example
+
+The following command prints all records in
+@file{BBS-list} that do @emph{not} contain the string @samp{foo}.
+
+@example
+@group
+$ awk '! /foo/' BBS-list
+@print{} aardvark     555-5553     1200/300          B
+@print{} alpo-net     555-3412     2400/1200/300     A
+@print{} barfly       555-7685     1200/300          A
+@print{} bites        555-1675     2400/1200/300     A
+@print{} camelot      555-0542     300               C
+@print{} core         555-2912     1200/300          C
+@print{} sdace        555-3430     2400/1200/300     A
+@end group
+@end example
+
+The subexpressions of a boolean operator in a pattern can be constant regular
+expressions, comparisons, or any other @code{awk} expressions.  Range
+patterns are not expressions, so they cannot appear inside boolean
+patterns.  Likewise, the special patterns @code{BEGIN} and @code{END},
+which never match any input record, are not expressions and cannot
+appear inside boolean patterns.
+
+A regexp constant as a pattern is also a special case of an expression
+pattern.  @code{/foo/} as an expression has the value one if @samp{foo}
+appears in the current input record; thus, as a pattern, @code{/foo/}
+matches any record containing @samp{foo}.
+
+@node Ranges, BEGIN/END, Expression Patterns, Pattern Overview
+@subsection Specifying Record Ranges with Patterns
+
+@cindex range pattern
+@cindex pattern, range
+@cindex matching ranges of lines
+A @dfn{range pattern} is made of two patterns separated by a comma, of
+the form @samp{@var{begpat}, @var{endpat}}.  It matches ranges of
+consecutive input records.  The first pattern, @var{begpat}, controls
+where the range begins, and the second one, @var{endpat}, controls where
+it ends.  For example,
+
+@example
+awk '$1 == "on", $1 == "off"'
+@end example
+
+@noindent
+prints every record between @samp{on}/@samp{off} pairs, inclusive.
+
+A range pattern starts out by matching @var{begpat}
+against every input record; when a record matches @var{begpat}, the
+range pattern becomes @dfn{turned on}.  The range pattern matches this
+record.  As long as it stays turned on, it automatically matches every
+input record read.  It also matches @var{endpat} against
+every input record; when that succeeds, the range pattern is turned
+off again for the following record.  Then it goes back to checking
+@var{begpat} against each record.
+
+The record that turns on the range pattern and the one that turns it
+off both match the range pattern.  If you don't want to operate on
+these records, you can write @code{if} statements in the rule's action
+to distinguish them from the records you are interested in.
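+
+One possible way to do so is sketched here with made-up input; the
+@code{if} statement filters out the two bracketing records:
+
+@example
+@group
+$ printf 'x\non\ny\noff\nz\n' |
+> awk '$1 == "on", $1 == "off" @{ if ($1 != "on" && $1 != "off") print @}'
+@print{} y
+@end group
+@end example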
+
+It is possible for a range pattern to be turned both on and off by the same
+record, if the record satisfies both conditions.  Then the action is
+executed for just that record.
+
+For example, suppose you have text between two identical markers (say
+the @samp{%} symbol) that you wish to ignore.  You might try to
+combine a range pattern that describes the delimited text with the
+@code{next} statement
+(not discussed yet, @pxref{Next Statement, , The @code{next} Statement}),
+which causes @code{awk} to skip any further processing of the current
+record and start over again with the next input record. Such a program
+would look like this:
+
+@example
+/^%$/,/^%$/    @{ next @}
+               @{ print @}
+@end example
+
+@noindent
+@cindex skipping lines between markers
+This program fails because the range pattern is both turned on and turned off
+by the first line with just a @samp{%} on it.  To accomplish this task, you
+must write the program this way, using a flag:
+
+@example
+/^%$/     @{ skip = ! skip; next @}
+skip == 1 @{ next @} # skip lines with `skip' set
+          @{ print @} # print everything else
+@end example
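+
+Here is a sample run of the flag-based approach on a few made-up
+lines, assuming a final rule that prints the records that are not
+skipped:
+
+@example
+@group
+$ printf 'a\n%%\nb\n%%\nc\n' | awk '
+> /^%$/     @{ skip = ! skip; next @}
+> skip == 1 @{ next @}
+>           @{ print @}'
+@print{} a
+@print{} c
+@end group
+@end example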
+
+Note that in a range pattern, the @samp{,} has the lowest precedence
+(is evaluated last) of all the operators.  Thus, for example, the
+following program attempts to combine a range pattern with another,
+simpler test.
+
+@example
+echo Yes | awk '/1/,/2/ || /Yes/'
+@end example
+
+The author of this program intended it to mean @samp{(/1/,/2/) || /Yes/}.
+However, @code{awk} interprets this as @samp{/1/, (/2/ || /Yes/)}.
+This cannot be changed or worked around; range patterns do not combine
+with other patterns.
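+
+The effect of this grouping can be seen with slightly longer input,
+made up for this illustration.  The @samp{Yes} record ends the range,
+so the final record is not printed:
+
+@example
+@group
+$ printf '1\nYes\n3\n' | awk '/1/,/2/ || /Yes/'
+@print{} 1
+@print{} Yes
+@end group
+@end example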
+
+@node BEGIN/END, Empty, Ranges, Pattern Overview
+@subsection The @code{BEGIN} and @code{END} Special Patterns
+
+@cindex @code{BEGIN} special pattern
+@cindex pattern, @code{BEGIN}
+@cindex @code{END} special pattern
+@cindex pattern, @code{END}
+@code{BEGIN} and @code{END} are special patterns.  They are not used to
+match input records.  Rather, they supply start-up or
+clean-up actions for your @code{awk} script.
+
+@menu
+* Using BEGIN/END::             How and why to use BEGIN/END rules.
+* I/O And BEGIN/END::           I/O issues in BEGIN/END rules.
+@end menu
+
+@node Using BEGIN/END, I/O And BEGIN/END, BEGIN/END, BEGIN/END
+@subsubsection Startup and Cleanup Actions
+
+A @code{BEGIN} rule is executed, once, before the first input record
+has been read.  An @code{END} rule is executed, once, after all the
+input has been read.  For example:
+
+@example
+@group
+$ awk '
+> BEGIN @{ print "Analysis of \"foo\"" @}
+> /foo/ @{ ++n @}
+> END   @{ print "\"foo\" appears " n " times." @}' BBS-list
+@print{} Analysis of "foo"
+@print{} "foo" appears 4 times.
+@end group
+@end example
+
+This program finds the number of records in the input file @file{BBS-list}
+that contain the string @samp{foo}.  The @code{BEGIN} rule prints a title
+for the report.  There is no need to use the @code{BEGIN} rule to
+initialize the counter @code{n} to zero, as @code{awk} does this
+automatically (@pxref{Variables}).
+
+The second rule increments the variable @code{n} every time a
+record containing the pattern @samp{foo} is read.  The @code{END} rule
+prints the value of @code{n} at the end of the run.
+
+The special patterns @code{BEGIN} and @code{END} cannot be used in ranges
+or with boolean operators (indeed, they cannot be used with any operators).
+
+An @code{awk} program may have multiple @code{BEGIN} and/or @code{END}
+rules.  They are executed in the order they appear, all the @code{BEGIN}
+rules at start-up and all the @code{END} rules at termination.
+@code{BEGIN} and @code{END} rules may be intermixed with other rules.
+This feature was added in the 1987 version of @code{awk}, and is included
+in the POSIX standard.  The original (1978) version of @code{awk}
+required you to put the @code{BEGIN} rule at the beginning of the
+program, and the @code{END} rule at the end, and only allowed one of
+each.  This is no longer required, but it is a good idea in terms of
+program organization and readability.
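+
+The ordering can be checked with a small sketch that intermixes the
+rules and reads no data (the messages are arbitrary):
+
+@example
+@group
+$ awk 'BEGIN @{ print "begin 1" @}
+>      END   @{ print "end 1" @}
+>      BEGIN @{ print "begin 2" @}
+>      END   @{ print "end 2" @}' /dev/null
+@print{} begin 1
+@print{} begin 2
+@print{} end 1
+@print{} end 2
+@end group
+@end example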
+
+Multiple @code{BEGIN} and @code{END} rules are useful for writing
+library functions, since each library file can have its own @code{BEGIN} and/or
+@code{END} rule to do its own initialization and/or cleanup.  Note that
+the order in which library files are named on the command line
+controls the order in which their @code{BEGIN} and @code{END} rules are
+executed.  Therefore you have to be careful to write such rules in
+library files so that the order in which they are executed doesn't matter.
+@xref{Options, ,Command Line Options}, for more information on
+using library functions.
+@xref{Library Functions, ,A Library of @code{awk} Functions},
+for a number of useful library functions.
+
+@cindex dark corner
+If an @code{awk} program only has a @code{BEGIN} rule, and no other
+rules, then the program exits after the @code{BEGIN} rule has been run.
+(The original version of @code{awk} used to keep reading and ignoring input
+until end of file was seen.)  However, if an @code{END} rule exists,
+then the input will be read, even if there are no other rules in
+the program.  This is necessary in case the @code{END} rule checks the
+@code{FNR} and @code{NR} variables (d.c.).
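+
+Both cases are sketched below with made-up input.  The first command
+finishes without waiting for any input; the second reads all of its
+input so that @code{NR} is correct in the @code{END} rule:
+
+@example
+@group
+$ awk 'BEGIN @{ print "done" @}'
+@print{} done
+$ printf 'a\nb\n' | awk 'END @{ print NR @}'
+@print{} 2
+@end group
+@end example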
+
+@code{BEGIN} and @code{END} rules must have actions; there is no default
+action for these rules since there is no current record when they run.
+
+@node I/O And BEGIN/END, , Using BEGIN/END, BEGIN/END
+@subsubsection Input/Output from @code{BEGIN} and @code{END} Rules
+
+@cindex I/O from @code{BEGIN} and @code{END}
+There are several (sometimes subtle) issues involved when doing I/O
+from a @code{BEGIN} or @code{END} rule.
+
+The first has to do with the value of @code{$0} in a @code{BEGIN}
+rule.  Since @code{BEGIN} rules are executed before any input is read,
+there simply is no input record, and therefore no fields, when
+executing @code{BEGIN} rules.  References to @code{$0} and the fields
+yield a null string or zero, depending upon the context.  One way
+to give @code{$0} a real value is to execute a @code{getline} command
+without a variable (@pxref{Getline, ,Explicit Input with @code{getline}}).
+Another way is to simply assign a value to it.
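+
+A minimal sketch of the assignment approach, using an arbitrary
+string; once @code{$0} has a value, the fields and @code{NF} are
+available as usual:
+
+@example
+@group
+$ awk 'BEGIN @{ $0 = "a b c"; print NF, $2 @}'
+@print{} 3 b
+@end group
+@end example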
+
+@cindex differences between @code{gawk} and @code{awk}
+The second point is similar to the first, but from the other direction.
+Inside an @code{END} rule, what is the value of @code{$0} and @code{NF}?
+Traditionally, due largely to implementation issues, @code{$0} and
+@code{NF} were @emph{undefined} inside an @code{END} rule.
+The POSIX standard specified that @code{NF} was available in an @code{END}
+rule, containing the number of fields from the last input record.
+Due most probably to an oversight, the standard does not say that @code{$0}
+is also preserved, although logically one would think that it should be.
+In fact, @code{gawk} does preserve the value of @code{$0} for use in
+@code{END} rules.  Be aware, however, that Unix @code{awk}, and possibly
+other implementations, do not.
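+
+With @code{gawk}, the behavior described above can be observed as
+follows (the input is made up; other @code{awk} implementations may
+print something different for @code{$0}):
+
+@example
+@group
+$ printf 'a b\nc d e\n' | gawk 'END @{ print NF; print $0 @}'
+@print{} 3
+@print{} c d e
+@end group
+@end example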
+
+The third point follows from the first two.  What is the meaning of
+@samp{print} inside a @code{BEGIN} or @code{END} rule?  The meaning is
+the same as always, @samp{print $0}.  If @code{$0} is the null string,
+then this prints an empty line.  Many long-time @code{awk} programmers
+use @samp{print} in @code{BEGIN} and @code{END} rules, to mean
+@samp{@w{print ""}}, relying on @code{$0} being null.  While you might
+generally get away with this in @code{BEGIN} rules, in @code{gawk} at
+least, it is a very bad idea in @code{END} rules.  It is also poor
+style, since if you want an empty line in the output, you
+should say so explicitly in your program.
+
+@node Empty,  , BEGIN/END, Pattern Overview
+@subsection The Empty Pattern
+
+@cindex empty pattern
+@cindex pattern, empty
+An empty (i.e.@: non-existent) pattern is considered to match @emph{every}
+input record.  For example, the program:
+
+@example
+awk '@{ print $1 @}' BBS-list
+@end example
+
+@noindent
+prints the first field of every record.
+
+@node Action Overview,  , Pattern Overview, Patterns and Actions
+@section Overview of Actions
+@cindex action, definition of
+@cindex curly braces
+@cindex action, curly braces
+@cindex action, separating statements
+
+An @code{awk} program or script consists of a series of
+rules and function definitions, interspersed.  (Functions are
+described later.  @xref{User-defined, ,User-defined Functions}.)
+
+A rule contains a pattern and an action, either of which (but not
+both) may be
+omitted.  The purpose of the @dfn{action} is to tell @code{awk} what to do
+once a match for the pattern is found.  Thus, in outline, an @code{awk}
+program generally looks like this:
+
+@example
+@r{[}@var{pattern}@r{]} @r{[}@{ @var{action} @}@r{]}
+@r{[}@var{pattern}@r{]} @r{[}@{ @var{action} @}@r{]}
+@dots{}
+function @var{name}(@var{args}) @{ @dots{} @}
+@dots{}
+@end example
+
+An action consists of one or more @code{awk} @dfn{statements}, enclosed
+in curly braces (@samp{@{} and @samp{@}}).  Each statement specifies one
+thing to be done.  The statements are separated by newlines or
+semicolons.
+
+The curly braces around an action must be used even if the action
+contains only one statement, or even if it contains no statements at
+all.  However, if you omit the action entirely, omit the curly braces as
+well.  An omitted action is equivalent to @samp{@{ print $0 @}}.
+
+@example
+/foo/  @{ @}  # match foo, do nothing - empty action
+/foo/       # match foo, print the record - omitted action
+@end example
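+
+The difference between an empty action and an omitted one can be seen
+from the shell.  This is only a sketch; the two-line input is invented
+for the demonstration:
+
```shell
# Empty action: the record matches, but nothing is done with it.
printf 'foo\nbar\n' | awk '/foo/ { }'

# Omitted action: equivalent to { print $0 }, so the matching record is printed.
printf 'foo\nbar\n' | awk '/foo/'
```
+
+Only the second command produces output (the line @samp{foo}).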
+
+Here are the kinds of statements supported in @code{awk}:
+
+@itemize @bullet
+@item
+Expressions, which can call functions or assign values to variables
+(@pxref{Expressions}).  Executing
+this kind of statement simply computes the value of the expression.
+This is useful when the expression has side effects
+(@pxref{Assignment Ops, ,Assignment Expressions}).
+
+@item
+Control statements, which specify the control flow of @code{awk}
+programs.  The @code{awk} language gives you C-like constructs
+(@code{if}, @code{for}, @code{while}, and @code{do}) as well as a few
+special ones (@pxref{Statements, ,Control Statements in Actions}).
+
+@item
+Compound statements, which consist of one or more statements enclosed in
+curly braces.  A compound statement is used in order to put several
+statements together in the body of an @code{if}, @code{while}, @code{do}
+or @code{for} statement.
+
+@item
+Input statements, using the @code{getline} command
+(@pxref{Getline, ,Explicit Input with @code{getline}}), the @code{next}
+statement (@pxref{Next Statement, ,The @code{next} Statement}),
+and the @code{nextfile} statement
+(@pxref{Nextfile Statement, ,The @code{nextfile} Statement}).
+
+@item
+Output statements, @code{print} and @code{printf}.
+@xref{Printing, ,Printing Output}.
+
+@item
+Deletion statements, for deleting array elements.
+@xref{Delete, ,The @code{delete} Statement}.
+@end itemize
+
+@iftex
+The next chapter covers control statements in detail.
+@end iftex
+
+@node Statements, Built-in Variables, Patterns and Actions, Top
+@chapter Control Statements in Actions
+@cindex control statement
+
+@dfn{Control statements} such as @code{if}, @code{while}, and so on
+control the flow of execution in @code{awk} programs.  Most of the
+control statements in @code{awk} are patterned on similar statements in
+C.
+
+All the control statements start with special keywords such as @code{if}
+and @code{while}, to distinguish them from simple expressions.
+
+@cindex compound statement
+@cindex statement, compound
+Many control statements contain other statements; for example, the
+@code{if} statement contains another statement which may or may not be
+executed.  The contained statement is called the @dfn{body}.  If you
+want to include more than one statement in the body, group them into a
+single @dfn{compound statement} with curly braces, separating them with
+newlines or semicolons.
+
+@menu
+* If Statement::                Conditionally execute some @code{awk}
+                                statements.
+* While Statement::             Loop until some condition is satisfied.
+* Do Statement::                Do specified action while looping until some
+                                condition is satisfied.
+* For Statement::               Another looping statement, that provides
+                                initialization and increment clauses.
+* Break Statement::             Immediately exit the innermost enclosing loop.
+* Continue Statement::          Skip to the end of the innermost enclosing
+                                loop.
+* Next Statement::              Stop processing the current input record.
+* Nextfile Statement::          Stop processing the current file.
+* Exit Statement::              Stop execution of @code{awk}.
+@end menu
+
+@node If Statement, While Statement, Statements, Statements
+@section The @code{if}-@code{else} Statement
+
+@cindex @code{if}-@code{else} statement
+The @code{if}-@code{else} statement is @code{awk}'s decision-making
+statement.  It looks like this:
+
+@example
+if (@var{condition}) @var{then-body} @r{[}else @var{else-body}@r{]}
+@end example
+
+@noindent
+The @var{condition} is an expression that controls what the rest of the
+statement will do.  If @var{condition} is true, @var{then-body} is
+executed; otherwise, @var{else-body} is executed.
+The @code{else} part of the statement is
+optional.  The condition is considered false if its value is zero or
+the null string, and true otherwise.
+
+Here is an example:
+
+@example
+if (x % 2 == 0)
+    print "x is even"
+else
+    print "x is odd"
+@end example
+
+In this example, if the expression @samp{x % 2 == 0} is true (that is,
+the value of @code{x} is evenly divisible by two), then the first @code{print}
+statement is executed, otherwise the second @code{print} statement is
+executed.
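+
+The rule for what counts as true can be checked in the same way.  The
+following sketch (the message strings are our own) tests a numeric zero,
+a null string, and a non-null string as conditions:
+
```shell
awk 'BEGIN {
    if (0)
        print "0 is true"
    else
        print "0 is false"

    if ("")
        print "null string is true"
    else
        print "null string is false"

    if ("a")
        print "non-null string is true"
    else
        print "non-null string is false"
}'
```
+
+Only the last condition is true, so the @code{else} branch is taken for
+the first two tests.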
+
+If the @code{else} appears on the same line as @var{then-body}, and
+@var{then-body} is not a compound statement (i.e.@: not surrounded by
+curly braces), then a semicolon must separate @var{then-body} from
+@code{else}.  To illustrate this, let's rewrite the previous example:
+
+@example
+if (x % 2 == 0) print "x is even"; else
+        print "x is odd"
+@end example
+
+@noindent
+If you forget the @samp{;}, @code{awk} won't be able to interpret the
+statement, and you will get a syntax error.
+
+We would not actually write this example this way, because a human
+reader might fail to see the @code{else} if it were not the first thing
+on its line.
+
+@node While Statement, Do Statement, If Statement, Statements
+@section The @code{while} Statement
+@cindex @code{while} statement
+@cindex loop
+@cindex body of a loop
+
+In programming, a @dfn{loop} means a part of a program that can
+be executed two or more times in succession.
+
+The @code{while} statement is the simplest looping statement in
+@code{awk}.  It repeatedly executes a statement as long as a condition is
+true.  It looks like this:
+
+@example
+while (@var{condition})
+  @var{body}
+@end example
+
+@noindent
+Here @var{body} is a statement that we call the @dfn{body} of the loop,
+and @var{condition} is an expression that controls how long the loop
+keeps running.
+
+The first thing the @code{while} statement does is test @var{condition}.
+If @var{condition} is true, it executes the statement @var{body}.
+@ifinfo
+(The @var{condition} is true when the value 
+is not zero and not a null string.)
+@end ifinfo
+After @var{body} has been executed,
+@var{condition} is tested again, and if it is still true, @var{body} is
+executed again.  This process repeats until @var{condition} is no longer
+true.  If @var{condition} is initially false, the body of the loop is
+never executed, and @code{awk} continues with the statement following
+the loop.
+
+This example prints the first three fields of each record, one per line.
+
+@example
+awk '@{ i = 1
+       while (i <= 3) @{
+           print $i
+           i++
+       @}
+@}' inventory-shipped
+@end example
+
+@noindent
+Here the body of the loop is a compound statement enclosed in braces,
+containing two statements.
+
+The loop works like this: first, the value of @code{i} is set to one.
+Then, the @code{while} tests whether @code{i} is less than or equal to
+three.  This is true when @code{i} equals one, so the @code{i}-th
+field is printed.  Then the @samp{i++} increments the value of @code{i}
+and the loop repeats.  The loop terminates when @code{i} reaches four.
+
+As you can see, a newline is not required between the condition and the
+body; but using one makes the program clearer unless the body is a
+compound statement or is very simple.  The newline after the open-brace
+that begins the compound statement is not required either, but the
+program would be harder to read without it.
+
+@node Do Statement, For Statement, While Statement, Statements
+@section The @code{do}-@code{while} Statement
+
+The @code{do} loop is a variation of the @code{while} looping statement.
+The @code{do} loop executes the @var{body} once, and then repeats @var{body}
+as long as @var{condition} is true.  It looks like this:
+
+@example
+do
+  @var{body}
+while (@var{condition})
+@end example
+
+Even if @var{condition} is false at the start, @var{body} is executed at
+least once (and only once, unless executing @var{body} makes
+@var{condition} true).  Contrast this with the corresponding
+@code{while} statement:
+
+@example
+while (@var{condition})
+  @var{body}
+@end example
+
+@noindent
+This statement does not execute @var{body} even once if @var{condition}
+is false to begin with.
+
+Here is an example of a @code{do} statement:
+
+@example
+awk '@{ i = 1
+       do @{
+          print $0
+          i++
+       @} while (i <= 10)
+@}'
+@end example
+
+@noindent
+This program prints each input record ten times.  It isn't a very
+realistic example, since in this case an ordinary @code{while} would do
+just as well.  But this reflects actual experience; there is only
+occasionally a real use for a @code{do} statement.
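+
+To see the one guaranteed execution of the body, consider this small
+sketch, in which the starting value of @code{i} is chosen (by us, for
+illustration) so that the condition is false from the outset:
+
```shell
awk 'BEGIN {
    i = 10
    do {
        print "do body ran with i =", i    # runs once despite the false condition
    } while (i < 5)

    while (i < 5)
        print "while body ran"             # never runs: condition is tested first

    print "done"
}'
```
+
+The @code{do} body prints its message once; the @code{while} body never
+runs.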
+
+@node For Statement, Break Statement, Do Statement, Statements
+@section The @code{for} Statement
+@cindex @code{for} statement
+
+The @code{for} statement makes it more convenient to count iterations of a
+loop.  The general form of the @code{for} statement looks like this:
+
+@example
+for (@var{initialization}; @var{condition}; @var{increment})
+  @var{body}
+@end example
+
+@noindent
+The @var{initialization}, @var{condition} and @var{increment} parts are
+arbitrary @code{awk} expressions, and @var{body} stands for any
+@code{awk} statement.
+
+The @code{for} statement starts by executing @var{initialization}.
+Then, as long
+as @var{condition} is true, it repeatedly executes @var{body} and then
+@var{increment}.  Typically @var{initialization} sets a variable to
+either zero or one, @var{increment} adds one to it, and @var{condition}
+compares it against the desired number of iterations.
+
+Here is an example of a @code{for} statement:
+
+@example
+@group
+awk '@{ for (i = 1; i <= 3; i++)
+          print $i
+@}' inventory-shipped
+@end group
+@end example
+
+@noindent
+This prints the first three fields of each input record, one field per
+line.
+
+You cannot set more than one variable in the
+@var{initialization} part unless you use a multiple assignment statement
+such as @samp{x = y = 0}, which is possible only if all the initial values
+are equal.  (But you can initialize additional variables by writing
+their assignments as separate statements preceding the @code{for} loop.)
+
+The same is true of the @var{increment} part; to increment additional
+variables, you must write separate statements at the end of the loop.
+The C compound expression, using C's comma operator, would be useful in
+this context, but it is not supported in @code{awk}.
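+
+A sketch of the workaround just described follows.  The second variable,
+@code{j}, is a name introduced here only for illustration; it is
+initialized before the loop and updated at the end of the body:
+
```shell
awk 'BEGIN {
    j = 10                       # extra variable: set before the loop
    for (i = 1; i <= 3; i++) {
        print i, j
        j--                      # extra variable: updated at the end of the body
    }
}'
```
+
+This prints the pairs @samp{1 10}, @samp{2 9} and @samp{3 8}, one per
+line.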
+
+Most often, @var{increment} is an increment expression, as in the
+example above.  But this is not required; it can be any expression
+whatever.  For example, this statement prints all the powers of two
+between one and 100:
+
+@example
+for (i = 1; i <= 100; i *= 2)
+  print i
+@end example
+
+Any of the three expressions in the parentheses following the @code{for} may
+be omitted if there is nothing to be done there.  Thus, @w{@samp{for (; x
+> 0;)}} is equivalent to @w{@samp{while (x > 0)}}.  If the
+@var{condition} is omitted, it is treated as true, effectively
+yielding an @dfn{infinite loop} (i.e.@: a loop that will never
+terminate).
+
+In most cases, a @code{for} loop is an abbreviation for a @code{while}
+loop, as shown here:
+
+@example
+@var{initialization}
+while (@var{condition}) @{
+  @var{body}
+  @var{increment}
+@}
+@end example
+
+@noindent
+The only exception is when the @code{continue} statement
+(@pxref{Continue Statement, ,The @code{continue} Statement}) is used
+inside the loop; changing a @code{for} statement to a @code{while}
+statement in this way can change the effect of the @code{continue}
+statement inside the loop.
+
+There is an alternate version of the @code{for} loop, for iterating over
+all the indices of an array:
+
+@example
+for (i in array)
+    @var{do something with} array[i]
+@end example
+
+@noindent
+@xref{Scanning an Array, ,Scanning All Elements of an Array},
+for more information on this version of the @code{for} loop.
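+
+As a brief sketch (the array name and its contents are assumptions made
+for the example), the following sums the elements of a small array.  A
+sum is used because it does not depend on the order in which the indices
+are visited:
+
```shell
awk 'BEGIN {
    n["a"] = 1; n["b"] = 2; n["c"] = 3
    for (k in n)          # k takes on each index of n in turn
        sum += n[k]
    print sum
}'
```
+
+The program prints @samp{6}.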
+
+The @code{awk} language has a @code{for} statement in addition to a
+@code{while} statement because often a @code{for} loop is both less work to
+type and more natural to think of.  Counting the number of iterations is
+very common in loops.  It can be easier to think of this counting as part
+of looping rather than as something to do inside the loop.
+
+The next section has more complicated examples of @code{for} loops.
+
+@node Break Statement, Continue Statement, For Statement, Statements
+@section The @code{break} Statement
+@cindex @code{break} statement
+@cindex loops, exiting
+
+The @code{break} statement jumps out of the innermost @code{for},
+@code{while}, or @code{do} loop that encloses it.  The
+following example finds the smallest divisor of any integer, and also
+identifies prime numbers:
+
+@example
+awk '# find smallest divisor of num
+     @{ num = $1
+       for (div = 2; div*div <= num; div++)
+         if (num % div == 0)
+           break
+       if (num % div == 0)
+         printf "Smallest divisor of %d is %d\n", num, div
+       else
+         printf "%d is prime\n", num
+     @}'
+@end example
+
+When the remainder is zero in the first @code{if} statement, @code{awk}
+immediately @dfn{breaks out} of the containing @code{for} loop.  This means
+that @code{awk} proceeds immediately to the statement following the loop
+and continues processing.  (This is very different from the @code{exit}
+statement which stops the entire @code{awk} program.  
+@xref{Exit Statement, ,The @code{exit} Statement}.)
+
+Here is another program equivalent to the previous one.  It illustrates how
+the @var{condition} of a @code{for} or @code{while} could just as well be
+replaced with a @code{break} inside an @code{if}:
+
+@example
+@group
+awk '# find smallest divisor of num
+     @{ num = $1
+       for (div = 2; ; div++) @{
+         if (num % div == 0) @{
+           printf "Smallest divisor of %d is %d\n", num, div
+           break
+         @}
+         if (div*div > num) @{
+           printf "%d is prime\n", num
+           break
+         @}
+       @}
+@}'
+@end group
+@end example
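+
+Since @code{break} leaves only the innermost enclosing loop, an outer
+loop keeps running.  The nested loops below are a sketch written for this
+point, not a program from elsewhere in this @value{DOCUMENT}:
+
```shell
awk 'BEGIN {
    for (i = 1; i <= 2; i++) {
        for (j = 1; j <= 3; j++) {
            if (j == 2)
                break            # leaves the j loop only; the i loop continues
            print i, j
        }
    }
}'
```
+
+The output is @samp{1 1} followed by @samp{2 1}: the inner loop is cut
+short each time, but the outer loop completes both iterations.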
+
+@cindex @code{break}, outside of loops
+@cindex historical features
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@cindex dark corner
+As described above, the @code{break} statement has no meaning when
+used outside the body of a loop.  However, although it was never documented,
+historical implementations of @code{awk} have treated the @code{break}
+statement outside of a loop as if it were a @code{next} statement
+(@pxref{Next Statement, ,The @code{next} Statement}).
+Recent versions of Unix @code{awk} no longer allow this usage. 
+@code{gawk} will support this use of @code{break} only if @samp{--traditional}
+has been specified on the command line
+(@pxref{Options, ,Command Line Options}).
+Otherwise, it will be treated as an error, since the POSIX standard
+specifies that @code{break} should only be used inside the body of a
+loop (d.c.).
+
+@node Continue Statement, Next Statement, Break Statement, Statements
+@section The @code{continue} Statement
+
+@cindex @code{continue} statement
+The @code{continue} statement, like @code{break}, is used only inside
+@code{for}, @code{while}, and @code{do} loops.  It skips
+over the rest of the loop body, causing the next cycle around the loop
+to begin immediately.  Contrast this with @code{break}, which jumps out
+of the loop altogether.
+
+@c The point of this program was to illustrate the use of continue with
+@c a while loop. But Karl Berry points out that that is done adequately
+@c below, and that this example is very un-awk-like. So for now, we'll
+@c omit it.
+@ignore
+In Texinfo source files, text that the author wishes to ignore can be
+enclosed between lines that start with @samp{@@ignore} and end with
+@samp{@@end ignore}.  Here is a program that strips out lines between
+@samp{@@ignore} and @samp{@@end ignore} pairs.
+
+@example
+BEGIN @{
+    while (getline > 0) @{
+       if (/^@@ignore/)
+           ignoring = 1
+       else if (/^@@end[ \t]+ignore/) @{
+           ignoring = 0
+           continue
+       @}
+       if (ignoring)
+           continue
+       print
+    @}
+@}
+@end example
+
+When an @samp{@@ignore} is seen, the @code{ignoring} flag is set to one (true).
+When @samp{@@end ignore} is seen, the flag is reset to zero (false). As long
+as the flag is true, the input record is not printed, because the
+@code{continue} restarts the @code{while} loop, skipping over the @code{print}
+statement.
+
+@c Exercise!!!
+@c How could this program be written to make better use of the awk language?
+@end ignore
+
+The @code{continue} statement in a @code{for} loop directs @code{awk} to
+skip the rest of the body of the loop, and resume execution with the
+increment-expression of the @code{for} statement.  The following program
+illustrates this fact:
+
+@example
+awk 'BEGIN @{
+     for (x = 0; x <= 20; x++) @{
+         if (x == 5)
+             continue
+         printf "%d ", x
+     @}
+     print ""
+@}'
+@end example
+
+@noindent
+This program prints all the numbers from zero to 20, except for five, for
+which the @code{printf} is skipped.  Since the increment @samp{x++}
+is not skipped, @code{x} does not remain stuck at five.  Contrast the
+@code{for} loop above with this @code{while} loop:
+
+@example
+awk 'BEGIN @{
+     x = 0
+     while (x <= 20) @{
+         if (x == 5)
+             continue
+         printf "%d ", x
+         x++
+     @}
+     print ""
+@}'
+@end example
+
+@noindent
+This program loops forever once @code{x} gets to five.
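+
+One way to repair the @code{while} version, sketched here as a
+suggestion rather than as the only possible fix, is to perform the
+increment before the @code{continue}:
+
```shell
awk 'BEGIN {
    x = 0
    while (x <= 20) {
        if (x == 5) {
            x++              # advance first, so the loop cannot get stuck at five
            continue
        }
        printf "%d ", x
        x++
    }
    print ""
}'
```
+
+With this change the loop terminates and prints the same numbers as the
+@code{for} version.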
+
+@cindex @code{continue}, outside of loops
+@cindex historical features
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@cindex dark corner
+As described above, the @code{continue} statement has no meaning when
+used outside the body of a loop.  However, although it was never documented,
+historical implementations of @code{awk} have treated the @code{continue}
+statement outside of a loop as if it were a @code{next} statement
+(@pxref{Next Statement, ,The @code{next} Statement}).
+Recent versions of Unix @code{awk} no longer allow this usage. 
+@code{gawk} will support this use of @code{continue} only if
+@samp{--traditional} has been specified on the command line
+(@pxref{Options, ,Command Line Options}).
+Otherwise, it will be treated as an error, since the POSIX standard
+specifies that @code{continue} should only be used inside the body of a
+loop (d.c.).
+
+@node Next Statement, Nextfile Statement, Continue Statement, Statements
+@section The @code{next} Statement
+@cindex @code{next} statement
+
+The @code{next} statement forces @code{awk} to immediately stop processing
+the current record and go on to the next record.  This means that no
+further rules are executed for the current record.  The rest of the
+current rule's action is not executed either.
+
+Contrast this with the effect of the @code{getline} command
+(@pxref{Getline, ,Explicit Input with @code{getline}}).  That too causes
+@code{awk} to read the next record immediately, but it does not alter the
+flow of control in any way.  So the rest of the current action executes
+with a new input record.
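+
+The contrast can be sketched with two one-line programs run on the same
+four records (the input and the message text are invented for the
+demonstration):
+
```shell
# next: record 2 is abandoned; the second rule never sees it.
printf '1\n2\n3\n4\n' | awk '$1 == 2 { next } { print "saw", $1 }'

# getline: record 3 replaces record 2, and the rest of the action
# and the second rule then run with the new record.
printf '1\n2\n3\n4\n' | awk '$1 == 2 { getline; print "now holding", $1 } { print "saw", $1 }'
```
+
+The first command reports records 1, 3 and 4.  The second reports the
+same records, but also prints @samp{now holding 3} from inside the first
+rule, showing that the action continued with the newly read record.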
+
+At the highest level, @code{awk} program execution is a loop that reads
+an input record and then tests each rule's pattern against it.  If you
+think of this loop as a @code{for} statement whose body contains the
+rules, then the @code{next} statement is analogous to a @code{continue}
+statement: it skips to the end of the body of this implicit loop, and
+executes the increment (which reads another record).
+
+For example, if your @code{awk} program works only on records with four
+fields, and you don't want it to fail when given bad input, you might
+use this rule near the beginning of the program:
+
+@example
+@group
+NF != 4 @{
+  err = sprintf("%s:%d: skipped: NF != 4\n", FILENAME, FNR)
+  print err > "/dev/stderr"
+  next
+@}
+@end group
+@end example
+
+@noindent
+so that the following rules will not see the bad record.  The error
+message is redirected to the standard error output stream, as error
+messages should be.  @xref{Special Files, ,Special File Names in @code{gawk}}.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+According to the POSIX standard, the behavior is undefined if
+the @code{next} statement is used in a @code{BEGIN} or @code{END} rule.
+@code{gawk} will treat it as a syntax error.
+Although POSIX permits it,
+some other @code{awk} implementations don't allow the @code{next}
+statement inside function bodies
+(@pxref{User-defined, ,User-defined Functions}).
+Just as any other @code{next} statement, a @code{next} inside a
+function body reads the next record and starts processing it with the
+first rule in the program.
+
+If the @code{next} statement causes the end of the input to be reached,
+then the code in any @code{END} rules will be executed.
+@xref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}.
+
+@node Nextfile Statement, Exit Statement, Next Statement, Statements
+@section The @code{nextfile} Statement
+@cindex @code{nextfile} statement
+@cindex differences between @code{gawk} and @code{awk}
+
+@code{gawk} provides the @code{nextfile} statement,
+which is similar to the @code{next} statement.
+However, instead of abandoning processing of the current record, the
+@code{nextfile} statement instructs @code{gawk} to stop processing the
+current data file.
+
+Upon execution of the @code{nextfile} statement, @code{FILENAME} is
+updated to the name of the next data file listed on the command line,
+@code{FNR} is reset to one, @code{ARGIND} is incremented, and processing
+starts over with the first rule in the program.  @xref{Built-in Variables}.
+
+If the @code{nextfile} statement causes the end of the input to be reached,
+then the code in any @code{END} rules will be executed.
+@xref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}.
+
+The @code{nextfile} statement is a @code{gawk} extension; it is not
+(currently) available in any other @code{awk} implementation.
+@xref{Nextfile Function, ,Implementing @code{nextfile} as a Function},
+for a user-defined function you can use to simulate the @code{nextfile}
+statement.
+
+The @code{nextfile} statement is useful if you have many data
+files to process, and you expect that you
+will not want to process every record in every file.
+Normally, in order to move on to
+the next data file, you would have to continue scanning the unwanted
+records.  The @code{nextfile} statement accomplishes this much more
+efficiently.
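+
+Here is a minimal sketch that prints only the first record of each file.
+The file names and contents are made up for the example, and it assumes
+that the @code{awk} on your system accepts @code{nextfile}:
+
```shell
dir=$(mktemp -d)
printf 'first of a\nsecond of a\n' > "$dir/a.txt"
printf 'first of b\nsecond of b\n' > "$dir/b.txt"

# Print one record per file, then skip straight to the next file.
(cd "$dir" && awk '{ print FILENAME, FNR, $0; nextfile }' a.txt b.txt)

rm -r "$dir"
```
+
+Each output line shows an @code{FNR} of one, since the record counter
+starts over with every file.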
+
+@cindex @code{next file} statement
+@strong{Caution:}  Versions of @code{gawk} prior to 3.0 used two
+words (@samp{next file}) for the @code{nextfile} statement.  This was
+changed in 3.0 to one word, since the treatment of @samp{file} was
+inconsistent. When it appeared after @code{next}, it was a keyword.
+Otherwise, it was a regular identifier.  The old usage is still
+accepted.  However, @code{gawk} will generate a warning message, and
+support for @code{next file} will be discontinued in a
+future version of @code{gawk}.
+
+@node Exit Statement,  , Nextfile Statement, Statements
+@section The @code{exit} Statement
+
+@cindex @code{exit} statement
+The @code{exit} statement causes @code{awk} to immediately stop
+executing the current rule and to stop processing input; any remaining input
+is ignored.  It looks like this:
+
+@example
+exit @r{[}@var{return code}@r{]}
+@end example
+
+If an @code{exit} statement is executed from a @code{BEGIN} rule, the
+program stops processing everything immediately.  No input records are
+read.  However, if an @code{END} rule is present, it is executed
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}).
+
+If @code{exit} is used as part of an @code{END} rule, it causes
+the program to stop immediately.
+
+An @code{exit} statement that is not part
+of a @code{BEGIN} or @code{END} rule stops the execution of any further
+automatic rules for the current record, skips reading any remaining input
+records, and executes
+the @code{END} rule if there is one.
+
+If you do not want the @code{END} rule to do its job in this case, you
+can set a variable to non-zero before the @code{exit} statement, and check
+that variable in the @code{END} rule.
+@xref{Assert Function, ,Assertions},
+for an example that does this.
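+
+The flag technique might look like the following sketch.  The variable
+name @code{aborted}, the input, and the messages are all assumptions
+chosen for illustration:
+
```shell
status=0
printf '1\n2\n3\n' | awk '
$1 == 2 { aborted = 1; exit 1 }     # set the flag, then leave the main loop
        { sum += $1 }
END     {
    if (aborted) {                  # the END rule still runs, so check the flag
        print "aborted, skipping summary"
        exit 1
    }
    print "sum:", sum
}' || status=$?
echo "exit status: $status"
```
+
+The @code{END} rule notices the flag and skips its normal summary, and
+the shell sees a non-zero exit status.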
+
+@cindex dark corner
+If an argument is supplied to @code{exit}, its value is used as the exit
+status code for the @code{awk} process.  If no argument is supplied,
+@code{exit} returns status zero (success).  In the case where an argument
+is supplied to a first @code{exit} statement, and then @code{exit} is
+called a second time with no argument, the previously supplied exit value
+is used (d.c.).
+
+For example, let's say you've discovered an error condition you really
+don't know how to handle.  Conventionally, programs report this by
+exiting with a non-zero status.  Your @code{awk} program can do this
+using an @code{exit} statement with a non-zero argument.  Here is an
+example:
+
+@example
+@group
+BEGIN @{
+       if (("date" | getline date_now) <= 0) @{
+         print "Can't get system date" > "/dev/stderr"
+         exit 1
+       @}
+       print "current date is", date_now
+       close("date")
+@}
+@end group
+@end example
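+
+The interaction between @code{exit} in a @code{BEGIN} rule, the
+@code{END} rule, and the exit status can be observed from the shell.
+This is a sketch only; the status value three and the message are
+arbitrary choices:
+
```shell
status=0
# exit in BEGIN skips all input, but the END rule is still executed.
awk 'BEGIN { exit 3 } END { print "END rule still runs" }' || status=$?
echo "exit status: $status"
```
+
+The message from the @code{END} rule appears, and the shell then reports
+an exit status of three.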
+
+@node Built-in Variables, Arrays, Statements, Top
+@chapter Built-in Variables
+@cindex built-in variables
+
+Most @code{awk} variables are available for you to use for your own
+purposes; they never change except when your program assigns values to
+them, and never affect anything except when your program examines them.
+However, a few variables in @code{awk} have special built-in meanings.
+Some of them @code{awk} examines automatically, so that they enable you
+to tell @code{awk} how to do certain things.  Others are set
+automatically by @code{awk}, so that they carry information from the
+internal workings of @code{awk} to your program.
+
+This chapter documents all the built-in variables of @code{gawk}.  Most
+of them are also documented in the chapters describing their areas of
+activity.
+
+@menu
+* User-modified::               Built-in variables that you change to control
+                                @code{awk}.
+* Auto-set::                    Built-in variables where @code{awk} gives you
+                                information.
+* ARGC and ARGV::               Ways to use @code{ARGC} and @code{ARGV}.
+@end menu
+
+@node User-modified, Auto-set, Built-in Variables, Built-in Variables
+@section Built-in Variables that Control @code{awk}
+@cindex built-in variables, user modifiable
+
+This is an alphabetical list of the variables which you can change to
+control how @code{awk} does certain things. Those variables that are
+specific to @code{gawk} are marked with an asterisk, @samp{*}.
+
+@table @code
+@vindex CONVFMT
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+@item CONVFMT
+This string controls conversion of numbers to
+strings (@pxref{Conversion, ,Conversion of Strings and Numbers}).
+It works by being passed, in effect, as the first argument to the
+@code{sprintf} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+Its default value is @code{"%.6g"}.
+@code{CONVFMT} was introduced by the POSIX standard.
+
+@vindex FIELDWIDTHS
+@item FIELDWIDTHS *
+This is a space-separated list of column widths that tells @code{gawk}
+how to split input with fixed, columnar boundaries.  It is an
+experimental feature.  Assigning to @code{FIELDWIDTHS}
+overrides the use of @code{FS} for field splitting.
+@xref{Constant Size, ,Reading Fixed-width Data}, for more information.
+
+If @code{gawk} is in compatibility mode
+(@pxref{Options, ,Command Line Options}), then @code{FIELDWIDTHS}
+has no special meaning, and field splitting operations are done based
+exclusively on the value of @code{FS}.
+
+@vindex FS
+@item FS
+@code{FS} is the input field separator
+(@pxref{Field Separators, ,Specifying How Fields are Separated}).
+The value is a single-character string or a multi-character regular
+expression that matches the separations between fields in an input
+record.  If the value is the null string (@code{""}), then each
+character in the record becomes a separate field.
+
+The default value is @w{@code{" "}}, a string consisting of a single
+space.  As a special exception, this value means that any
+sequence of spaces and tabs is a single separator.  It also causes
+spaces and tabs at the beginning and end of a record to be ignored.
+
+You can set the value of @code{FS} on the command line using the
+@samp{-F} option:
+
+@example
+awk -F, '@var{program}' @var{input-files}
+@end example
+
+If @code{gawk} is using @code{FIELDWIDTHS} for field-splitting,
+assigning a value to @code{FS} will cause @code{gawk} to return to
+the normal, @code{FS}-based, field splitting. An easy way to do this
+is to simply say @samp{FS = FS}, perhaps with an explanatory comment.
+
+@vindex IGNORECASE
+@item IGNORECASE *
+If @code{IGNORECASE} is non-zero or non-null, then all string comparisons
+and all regular expression matching are case-independent.  Thus, regexp
+matching with @samp{~} and @samp{!~}, and the @code{gensub},
+@code{gsub}, @code{index}, @code{match}, @code{split} and @code{sub}
+functions, record termination with @code{RS}, and field splitting with
+@code{FS} all ignore case when doing their particular regexp operations.
+@xref{Case-sensitivity, ,Case-sensitivity in Matching}.
+
+If @code{gawk} is in compatibility mode
+(@pxref{Options, ,Command Line Options}),
+then @code{IGNORECASE} has no special meaning, and string
+and regexp operations are always case-sensitive.
+
+@vindex OFMT
+@item OFMT
+This string controls conversion of numbers to
+strings (@pxref{Conversion, ,Conversion of Strings and Numbers}) for
+printing with the @code{print} statement.  It works by being passed, in
+effect, as the first argument to the @code{sprintf} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+Its default value is @code{"%.6g"}.  Earlier versions of @code{awk}
+also used @code{OFMT} to specify the format for converting numbers to
+strings in general expressions; this is now done by @code{CONVFMT}.
+
+@vindex OFS
+@item OFS
+This is the output field separator (@pxref{Output Separators}).  It is
+output between the fields printed by a @code{print} statement.  Its
+default value is @w{@code{" "}}, a string consisting of a single space.
+
+@vindex ORS
+@item ORS
+This is the output record separator.  It is output at the end of every
+@code{print} statement.  Its default value is @code{"\n"}.
+(@xref{Output Separators}.)
+
+@vindex RS
+@item RS
+This is @code{awk}'s input record separator.  Its default value is a string
+containing a single newline character, which means that an input record
+consists of a single line of text.
+It can also be the null string, in which case records are separated by
+runs of blank lines, or a regexp, in which case records are separated by
+matches of the regexp in the input text.
+(@xref{Records, ,How Input is Split into Records}.)
+
+@vindex SUBSEP
+@item SUBSEP
+@code{SUBSEP} is the subscript separator.  It has the default value of
+@code{"\034"}, and is used to separate the parts of the indices of a
+multi-dimensional array.  Thus, the expression @code{@w{foo["A", "B"]}}
+really accesses @code{foo["A\034B"]}
+(@pxref{Multi-dimensional, ,Multi-dimensional Arrays}).
+@end table
+
+@node Auto-set, ARGC and ARGV, User-modified, Built-in Variables
+@section Built-in Variables that Convey Information
+@cindex built-in variables, convey information
+
+This is an alphabetical list of the variables that are set
+automatically by @code{awk} on certain occasions in order to provide
+information to your program.  Those variables that are specific to
+@code{gawk} are marked with an asterisk, @samp{*}.
+
+@table @code
+@vindex ARGC
+@vindex ARGV
+@item ARGC
+@itemx ARGV
+The command-line arguments available to @code{awk} programs are stored in
+an array called @code{ARGV}.  @code{ARGC} is the number of command-line
+arguments present.  @xref{Other Arguments, ,Other Command Line Arguments}.
+Unlike most @code{awk} arrays,
+@code{ARGV} is indexed from zero to @code{ARGC} @minus{} 1.  For example:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        for (i = 0; i < ARGC; i++) 
+>            print ARGV[i] 
+>      @}' inventory-shipped BBS-list
+@print{} awk
+@print{} inventory-shipped
+@print{} BBS-list
+@end group
+@end example
+
+@noindent
+In this example, @code{ARGV[0]} contains @code{"awk"}, @code{ARGV[1]}
+contains @code{"inventory-shipped"}, and @code{ARGV[2]} contains
+@code{"BBS-list"}.  The value of @code{ARGC} is three, one more than the
+index of the last element in @code{ARGV}, since the elements are numbered
+from zero.
+
+The names @code{ARGC} and @code{ARGV}, as well as the convention of indexing
+the array from zero to @code{ARGC} @minus{} 1, are derived from the C language's
+method of accessing command line arguments.
+@xref{ARGC and ARGV, , Using @code{ARGC} and @code{ARGV}}, for information
+about how @code{awk} uses these variables.
+
+@vindex ARGIND
+@item ARGIND *
+The index in @code{ARGV} of the current file being processed.
+Every time @code{gawk} opens a new data file for processing, it sets
+@code{ARGIND} to the index in @code{ARGV} of the file name.
+When @code{gawk} is processing the input files, it is always
+true that @samp{FILENAME == ARGV[ARGIND]}.
+
+This variable is useful in file processing; it allows you to tell how far
+along you are in the list of data files, and to distinguish between
+successive instances of the same filename on the command line.
+
+While you can change the value of @code{ARGIND} within your @code{awk}
+program, @code{gawk} will automatically set it to a new value when the
+next file is opened.
+
+This variable is a @code{gawk} extension. In other @code{awk} implementations,
+or if @code{gawk} is in compatibility mode
+(@pxref{Options, ,Command Line Options}),
+it is not special.
+
+@vindex ENVIRON
+@item ENVIRON
+An associative array that contains the values of the environment.  The array
+indices are the environment variable names; the values are the values of
+the particular environment variables.  For example,
+@code{ENVIRON["HOME"]} might be @file{/home/arnold}.  Changing this array
+does not affect the environment passed on to any programs that
+@code{awk} may spawn via redirection or the @code{system} function.
+(In a future version of @code{gawk}, it may do so.)
+
+Some operating systems may not have environment variables.
+On such systems, the @code{ENVIRON} array is empty (except for
+@w{@code{ENVIRON["AWKPATH"]}}).
+
+@vindex ERRNO
+@item ERRNO *
+If a system error occurs during a redirection for @code{getline},
+during a read for @code{getline}, or during a @code{close} operation,
+then @code{ERRNO} will contain a string describing the error.
+
+This variable is a @code{gawk} extension. In other @code{awk} implementations,
+or if @code{gawk} is in compatibility mode
+(@pxref{Options, ,Command Line Options}),
+it is not special.
+
+@cindex dark corner
+@vindex FILENAME
+@item FILENAME
+This is the name of the file that @code{awk} is currently reading.
+When no data files are listed on the command line, @code{awk} reads
+from the standard input, and @code{FILENAME} is set to @code{"-"}.
+@code{FILENAME} is changed each time a new file is read
+(@pxref{Reading Files, ,Reading Input Files}).
+Inside a @code{BEGIN} rule, the value of @code{FILENAME} is
+@code{""}, since there are no input files being processed
+yet.@footnote{Some early implementations of Unix @code{awk} initialized
+@code{FILENAME} to @code{"-"}, even if there were data files to be
+processed. This behavior was incorrect, and should not be relied
+upon in your programs.} (d.c.)
+
+@vindex FNR
+@item FNR
+@code{FNR} is the current record number in the current file.  @code{FNR} is
+incremented each time a new record is read
+(@pxref{Getline, ,Explicit Input with @code{getline}}).  It is reinitialized
+to zero each time a new input file is started.
+
+@vindex NF
+@item NF
+@code{NF} is the number of fields in the current input record.
+@code{NF} is set each time a new record is read, when a new field is
+created, or when @code{$0} changes (@pxref{Fields, ,Examining Fields}).
+
+@vindex NR
+@item NR
+This is the number of input records @code{awk} has processed since
+the beginning of the program's execution
+(@pxref{Records, ,How Input is Split into Records}).
+@code{NR} is set each time a new record is read.
+
+@vindex RLENGTH
+@item RLENGTH
+@code{RLENGTH} is the length of the substring matched by the
+@code{match} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+@code{RLENGTH} is set by invoking the @code{match} function.  Its value
+is the length of the matched string, or @minus{}1 if no match was found.
+
+@vindex RSTART
+@item RSTART
+@code{RSTART} is the start-index in characters of the substring matched by the
+@code{match} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+@code{RSTART} is set by invoking the @code{match} function.  Its value
+is the position of the string where the matched substring starts, or zero
+if no match was found.
+
+@vindex RT
+@item RT *
+@code{RT} is set each time a record is read. It contains the input text
+that matched the text denoted by @code{RS}, the record separator.
+
+This variable is a @code{gawk} extension. In other @code{awk} implementations,
+or if @code{gawk} is in compatibility mode
+(@pxref{Options, ,Command Line Options}),
+it is not special.
+@end table
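+
+As a brief illustration of @code{RSTART} and @code{RLENGTH} from the
+table above, the following sketch calls @code{match} once on a string
+that matches and once on one that does not.  The sample string
+@code{"foobar"} and the two regexps are made up for this example:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        match("foobar", /o+/)    # matches "oo", starting at character 2
+>        print RSTART, RLENGTH
+>        match("foobar", /z/)     # no match at all
+>        print RSTART, RLENGTH
+>      @}'
+@print{} 2 2
+@print{} 0 -1
+@end group
+@end example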
+
+@cindex dark corner
+A side note about @code{NR} and @code{FNR}:
+@code{awk} simply increments both of these variables
+each time it reads a record, instead of setting them to the absolute
+value of the number of records read.  This means that your program can
+change these variables, and their new values will be incremented for
+each record (d.c.).  For example:
+
+@example
+@group
+$ echo '1
+> 2
+> 3
+> 4' | awk 'NR == 2 @{ NR = 17 @}
+> @{ print NR @}'
+@print{} 1
+@print{} 17
+@print{} 18
+@print{} 19
+@end group
+@end example
+
+@noindent
+Before @code{FNR} was added to the @code{awk} language
+(@pxref{V7/SVR3.1, ,Major Changes between V7 and SVR3.1}),
+many @code{awk} programs used this feature to track the number of
+records in a file by resetting @code{NR} to zero when @code{FILENAME}
+changed.
+
+@node ARGC and ARGV, , Auto-set, Built-in Variables
+@section Using @code{ARGC} and @code{ARGV}
+
+In @ref{Auto-set,  ,  Built-in Variables that Convey Information},
+you saw this program describing the information contained in @code{ARGC}
+and @code{ARGV}:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        for (i = 0; i < ARGC; i++) 
+>            print ARGV[i] 
+>      @}' inventory-shipped BBS-list
+@print{} awk
+@print{} inventory-shipped
+@print{} BBS-list
+@end group
+@end example
+
+@noindent
+In this example, @code{ARGV[0]} contains @code{"awk"}, @code{ARGV[1]}
+contains @code{"inventory-shipped"}, and @code{ARGV[2]} contains
+@code{"BBS-list"}.
+
+Notice that the @code{awk} program is not entered in @code{ARGV}.  The
+other special command line options, with their arguments, are also not
+entered.  But variable assignments on the command line @emph{are}
+treated as arguments, and do show up in the @code{ARGV} array.
+
+Your program can alter @code{ARGC} and the elements of @code{ARGV}.
+Each time @code{awk} reaches the end of an input file, it uses the next
+element of @code{ARGV} as the name of the next input file.  By storing a
+different string there, your program can change which files are read.
+You can use @code{"-"} to represent the standard input.  By storing
+additional elements and incrementing @code{ARGC} you can cause
+additional files to be read.
+
+If you decrease the value of @code{ARGC}, that eliminates input files
+from the end of the list.  By recording the old value of @code{ARGC}
+elsewhere, your program can treat the eliminated arguments as
+something other than file names.
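+
+The following is a minimal sketch of that technique, not a complete
+argument parser.  The variable name @code{nargs} and the arguments
+@samp{fast} and @samp{quiet} are invented for illustration, and
+@file{/dev/null} is assumed to exist and serves as an empty data file:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        nargs = ARGC      # remember the original count
+>        ARGC = 2          # now only ARGV[1] is treated as a data file
+>        for (i = 2; i < nargs; i++)
+>            print "keyword:", ARGV[i]
+>      @}
+>      END @{ print NR, "records read" @}' /dev/null fast quiet
+@print{} keyword: fast
+@print{} keyword: quiet
+@print{} 0 records read
+@end group
+@end example
+
+@noindent
+Because @code{ARGC} was reduced to two, @code{awk} never tries to open
+files named @file{fast} or @file{quiet}.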
+
+To eliminate a file from the middle of the list, store the null string
+(@code{""}) into @code{ARGV} in place of the file's name.  As a
+special feature, @code{awk} ignores file names that have been
+replaced with the null string.
+You may also use the @code{delete} statement to remove elements from
+@code{ARGV} (@pxref{Delete, ,The @code{delete} Statement}).
+
+All of these actions are typically done from the @code{BEGIN} rule,
+before actual processing of the input begins.
+@xref{Split Program, ,Splitting a Large File Into Pieces}, and see
+@ref{Tee Program, ,Duplicating Output Into Multiple Files}, for an example
+of each way of removing elements from @code{ARGV}.
+
+The following fragment processes @code{ARGV} in order to examine, and
+then remove, command line options.
+
+@example
+@group
+BEGIN @{
+    for (i = 1; i < ARGC; i++) @{
+        if (ARGV[i] == "-v")
+            verbose = 1
+        else if (ARGV[i] == "-d")
+            debug = 1
+@end group
+@group
+        else if (ARGV[i] ~ /^-./) @{
+            e = sprintf("%s: unrecognized option -- %c",
+                    ARGV[0], substr(ARGV[i], 2, 1))
+            print e > "/dev/stderr"
+        @} else
+            break
+        delete ARGV[i]
+    @}
+@}
+@end group
+@end example
+
+@node Arrays, Built-in, Built-in Variables, Top
+@chapter Arrays in @code{awk}
+
+An @dfn{array} is a table of values, called @dfn{elements}.  The
+elements of an array are distinguished by their indices.  @dfn{Indices}
+may be either numbers or strings.  @code{awk} maintains a single set
+of names that may be used for naming variables, arrays and functions
+(@pxref{User-defined, ,User-defined Functions}).
+Thus, you cannot have a variable and an array with the same name in the
+same @code{awk} program.
+
+@menu
+* Array Intro::                 Introduction to Arrays
+* Reference to Elements::       How to examine one element of an array.
+* Assigning Elements::          How to change an element of an array.
+* Array Example::               Basic Example of an Array
+* Scanning an Array::           A variation of the @code{for} statement. It
+                                loops through the indices of an array's
+                                existing elements.
+* Delete::                      The @code{delete} statement removes an element
+                                from an array.
+* Numeric Array Subscripts::    How to use numbers as subscripts in
+                                @code{awk}.
+* Uninitialized Subscripts::    Using uninitialized variables as subscripts.
+* Multi-dimensional::           Emulating multi-dimensional arrays in
+                                @code{awk}.
+* Multi-scanning::              Scanning multi-dimensional arrays.
+@end menu
+
+@node Array Intro, Reference to Elements, Arrays, Arrays
+@section Introduction to Arrays
+
+@cindex arrays
+The @code{awk} language provides one-dimensional @dfn{arrays} for storing groups
+of related strings or numbers.
+
+Every @code{awk} array must have a name.  Array names have the same
+syntax as variable names; any valid variable name would also be a valid
+array name.  But you cannot use one name in both ways (as an array and
+as a variable) in one @code{awk} program.
+
+Arrays in @code{awk} superficially resemble arrays in other programming
+languages; but there are fundamental differences.  In @code{awk}, you
+don't need to specify the size of an array before you start to use it.
+Additionally, any number or string in @code{awk} may be used as an
+array index, not just consecutive integers.
+
+In most other languages, you have to @dfn{declare} an array and specify
+how many elements or components it contains.  In such languages, the
+declaration causes a contiguous block of memory to be allocated for that
+many elements.  An index in the array usually must be a non-negative integer; for
+example, the index zero specifies the first element in the array, which is
+actually stored at the beginning of the block of memory.  Index one
+specifies the second element, which is stored in memory right after the
+first element, and so on.  It is impossible to add more elements to the
+array, because it has room for only as many elements as you declared.
+(Some languages allow arbitrary starting and ending indices,
+e.g., @samp{15 .. 27}, but the size of the array is still fixed when
+the array is declared.)
+
+A contiguous array of four elements might look like this,
+conceptually, if the element values are eight, @code{"foo"},
+@code{""} and 30:
+
+@iftex
+@c from Karl Berry, much thanks for the help.
+@tex
+\bigskip % space above the table (about 1 linespace)
+\offinterlineskip
+\newdimen\width \width = 1.5cm
+\newdimen\hwidth \hwidth = 4\width \advance\hwidth by 2pt % 5 * 0.4pt
+\centerline{\vbox{
+\halign{\strut\hfil\ignorespaces#&&\vrule#&\hbox to\width{\hfil#\unskip\hfil}\cr
+\noalign{\hrule width\hwidth}
+	&&{\tt 8} &&{\tt "foo"} &&{\tt ""} &&{\tt 30} &&\quad value\cr
+\noalign{\hrule width\hwidth}
+\noalign{\smallskip}
+	&\omit&0&\omit &1   &\omit&2 &\omit&3 &\omit&\quad index\cr
+}
+}}
+@end tex
+@end iftex
+@ifinfo
+@example
++---------+---------+--------+---------+
+|    8    |  "foo"  |   ""   |    30   |    @r{value}
++---------+---------+--------+---------+
+     0         1         2         3        @r{index}
+@end example
+@end ifinfo
+
+@noindent
+Only the values are stored; the indices are implicit from the order of
+the values.  Eight is the value at index zero, because eight appears in the
+position with zero elements before it.
+
+@cindex arrays, definition of
+@cindex associative arrays
+@cindex arrays, associative
+Arrays in @code{awk} are different: they are @dfn{associative}.  This means
+that each array is a collection of pairs: an index, and its corresponding
+array element value:
+
+@example
+@r{Element} 3     @r{Value} 30
+@r{Element} 1     @r{Value} "foo"
+@r{Element} 0     @r{Value} 8
+@r{Element} 2     @r{Value} ""
+@end example
+
+@noindent
+We have shown the pairs in jumbled order because their order is irrelevant.
+
+One advantage of associative arrays is that new pairs can be added
+at any time.  For example, suppose we add to the above array an element
+at index 10 whose value is @w{@code{"number ten"}}.  The result is this:
+
+@example
+@r{Element} 10    @r{Value} "number ten"
+@r{Element} 3     @r{Value} 30
+@r{Element} 1     @r{Value} "foo"
+@r{Element} 0     @r{Value} 8
+@r{Element} 2     @r{Value} ""
+@end example
+
+@noindent
+@cindex sparse arrays
+@cindex arrays, sparse
+Now the array is @dfn{sparse}, which just means some indices are missing:
+it has elements 0--3 and 10, but doesn't have elements 4, 5, 6, 7, 8, or 9.
+@c ok, I should spell out the above, but ...
+
+Another consequence of associative arrays is that the indices don't
+have to be positive integers.  Any number, or even a string, can be
+an index.  For example, here is an array which translates words from
+English into French:
+
+@example
+@r{Element} "dog" @r{Value} "chien"
+@r{Element} "cat" @r{Value} "chat"
+@r{Element} "one" @r{Value} "un"
+@r{Element} 1     @r{Value} "un"
+@end example
+
+@noindent
+Here we decided to translate the number one in both spelled-out and
+numeric form---thus illustrating that a single array can have both
+numbers and strings as indices.
+(In fact, array subscripts are always strings; this is discussed
+in more detail in
+@ref{Numeric Array Subscripts, ,Using Numbers to Subscript Arrays}.)
+
+When @code{awk} creates an array for you, e.g., with the @code{split}
+built-in function,
+that array's indices are consecutive integers starting at one.
+(@xref{String Functions, ,Built-in Functions for String Manipulation}.)
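+
+A small sketch of this behavior follows.  The array name @code{color}
+and the sample string are made up for the example:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        n = split("red green blue", color)
+>        for (i = 1; i <= n; i++)
+>            print i, color[i]
+>      @}'
+@print{} 1 red
+@print{} 2 green
+@print{} 3 blue
+@end group
+@end example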
+
+@node Reference to Elements, Assigning Elements, Array Intro, Arrays
+@section Referring to an Array Element
+@cindex array reference
+@cindex element of array
+@cindex reference to array
+
+The principal way of using an array is to refer to one of its elements.
+An array reference is an expression which looks like this:
+
+@example
+@var{array}[@var{index}]
+@end example
+
+@noindent
+Here, @var{array} is the name of an array.  The expression @var{index} is
+the index of the element of the array that you want.
+
+The value of the array reference is the current value of that array
+element.  For example, @code{foo[4.3]} is an expression for the element
+of array @code{foo} at index @samp{4.3}.
+
+If you refer to an array element that has no recorded value, the value
+of the reference is @code{""}, the null string.  This includes elements
+to which you have not assigned any value, and elements that have been
+deleted (@pxref{Delete, ,The @code{delete} Statement}).  Such a reference
+automatically creates that array element, with the null string as its value.
+(In some cases, this is unfortunate, because it might waste memory inside
+@code{awk}.)
+
+@cindex arrays, presence of elements
+@cindex arrays, the @code{in} operator
+You can find out if an element exists in an array at a certain index with
+the expression:
+
+@example
+@var{index} in @var{array}
+@end example
+
+@noindent
+This expression tests whether or not the particular index exists,
+without the side effect of creating that element if it is not present.
+The expression has the value one (true) if @code{@var{array}[@var{index}]}
+exists, and zero (false) if it does not exist.
+
+For example, to test whether the array @code{frequencies} contains the
+index @samp{2}, you could write this statement:
+
+@example
+if (2 in frequencies)
+    print "Subscript 2 is present."
+@end example
+
+Note that this is @emph{not} a test of whether or not the array
+@code{frequencies} contains an element whose @emph{value} is two.
+(There is no way to do that except to scan all the elements.)  Also, this
+@emph{does not} create @code{frequencies[2]}, while the following
+(incorrect) alternative would do so:
+
+@example
+if (frequencies[2] != "")
+    print "Subscript 2 is present."
+@end example
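+
+The difference between the two tests can be observed directly.  In this
+sketch (which reuses the array name @code{frequencies} from above, with
+no elements assigned), the @code{in} test is false until the plain
+reference creates the element as a side effect:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        if (2 in frequencies) print "present before the reference"
+>        if (frequencies[2] != "") print "has a non-null value"
+>        if (2 in frequencies) print "present after the reference"
+>      @}'
+@print{} present after the reference
+@end group
+@end example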
+
+@node Assigning Elements, Array Example, Reference to Elements, Arrays
+@section Assigning Array Elements
+@cindex array assignment
+@cindex element assignment
+
+Array elements are lvalues: they can be assigned values just like
+@code{awk} variables:
+
+@example
+@var{array}[@var{subscript}] = @var{value}
+@end example
+
+@noindent
+Here @var{array} is the name of your array.  The expression
+@var{subscript} is the index of the element of the array that you want
+to assign a value.  The expression @var{value} is the value you are
+assigning to that element of the array.
+
+@node Array Example, Scanning an Array, Assigning Elements, Arrays
+@section Basic Array Example
+
+The following program takes a list of lines, each beginning with a line
+number, and prints them out in order of line number.  The line numbers are
+not in order, however, when they are first read:  they are scrambled.  This
+program sorts the lines by making an array using the line numbers as
+subscripts.  It then prints out the lines in sorted order of their numbers.
+It is a very simple program, and gets confused if it encounters repeated
+numbers, gaps, or lines that don't begin with a number.
+
+@example
+@c file eg/misc/arraymax.awk
+@{
+  if ($1 > max)
+    max = $1
+  arr[$1] = $0
+@}
+
+END @{
+  for (x = 1; x <= max; x++)
+    print arr[x]
+@}
+@c endfile
+@end example
+
+The first rule keeps track of the largest line number seen so far;
+it also stores each line into the array @code{arr}, at an index that
+is the line's number.
+
+The second rule runs after all the input has been read, to print out
+all the lines.
+
+When this program is run with the following input:
+
+@example
+@group
+@c file eg/misc/arraymax.data
+5  I am the Five man
+2  Who are you?  The new number two!
+4  . . . And four on the floor
+1  Who is number one?
+3  I three you.
+@c endfile
+@end group
+@end example
+
+@noindent
+its output is this:
+
+@example
+1  Who is number one?
+2  Who are you?  The new number two!
+3  I three you.
+4  . . . And four on the floor
+5  I am the Five man
+@end example
+
+If a line number is repeated, the last line with a given number overrides
+the others.
+
+Gaps in the line numbers can be handled with an easy improvement to the
+program's @code{END} rule:
+
+@example
+END @{
+  for (x = 1; x <= max; x++)
+    if (x in arr)
+      print arr[x]
+@}
+@end example
+
+@node Scanning an Array, Delete, Array Example, Arrays
+@section Scanning All Elements of an Array
+@cindex @code{for (x in @dots{})}
+@cindex arrays, special @code{for} statement
+@cindex scanning an array
+
+In programs that use arrays, you often need a loop that executes
+once for each element of an array.  In other languages, where arrays are
+contiguous and indices are limited to positive integers, this is
+easy: you can
+find all the valid indices by counting from the lowest index
+up to the highest.  This
+technique won't do the job in @code{awk}, since any number or string
+can be an array index.  So @code{awk} has a special kind of @code{for}
+statement for scanning an array:
+
+@example
+for (@var{var} in @var{array})
+  @var{body}
+@end example
+
+@noindent
+This loop executes @var{body} once for each index in @var{array} that your
+program has previously used, with the
+variable @var{var} set to that index.
+
+Here is a program that uses this form of the @code{for} statement.  The
+first rule scans the input records and notes which words appear (at
+least once) in the input, by storing a one into the array @code{used} with
+the word as index.  The second rule scans the elements of @code{used} to
+find all the distinct words that appear in the input.  It prints each
+word that is more than 10 characters long, and also prints the number of
+such words.  @xref{String Functions, ,Built-in Functions for String Manipulation}, for more information
+on the built-in function @code{length}.
+
+@example
+# Record a 1 for each word that is used at least once.
+@{
+    for (i = 1; i <= NF; i++)
+        used[$i] = 1
+@}
+
+# Find number of distinct words more than 10 characters long.
+END @{
+    for (x in used)
+        if (length(x) > 10) @{
+            ++num_long_words
+            print x
+        @}
+    print num_long_words, "words longer than 10 characters"
+@}
+@end example
+
+@noindent
+@xref{Word Sorting, ,Generating Word Usage Counts},
+for a more detailed example of this type.
+
+The order in which elements of the array are accessed by this statement
+is determined by the internal arrangement of the array elements within
+@code{awk} and cannot be controlled or changed.  This can lead to
+problems if new elements are added to @var{array} by statements in
+the loop body; you cannot predict whether or not the @code{for} loop will
+reach them.  Similarly, changing @var{var} inside the loop may produce
+strange results.  It is best to avoid such things.
+
+@node Delete, Numeric Array Subscripts, Scanning an Array, Arrays
+@section The @code{delete} Statement
+@cindex @code{delete} statement
+@cindex deleting elements of arrays
+@cindex removing elements of arrays
+@cindex arrays, deleting an element
+
+You can remove an individual element of an array using the @code{delete}
+statement:
+
+@example
+delete @var{array}[@var{index}]
+@end example
+
+Once you have deleted an array element, you can no longer obtain any
+value the element once had.  It is as if you had never referred
+to it and had never given it any value.
+
+Here is an example of deleting elements in an array:
+
+@example
+for (i in frequencies)
+  delete frequencies[i]
+@end example
+
+@noindent
+This example removes all the elements from the array @code{frequencies}.
+
+If you delete an element, a subsequent @code{for} statement to scan the array
+will not report that element, and the @code{in} operator to check for
+the presence of that element will return zero (i.e.@: false):
+
+@example
+delete foo[4]
+if (4 in foo)
+    print "This will never be printed"
+@end example
+
+It is important to note that deleting an element is @emph{not} the
+same as assigning it a null value (the empty string, @code{""}).
+
+@example
+foo[4] = ""
+if (4 in foo)
+  print "This is printed, even though foo[4] is empty"
+@end example
+
+It is not an error to delete an element that does not exist.
+
+@cindex arrays, deleting entire contents
+@cindex deleting entire arrays
+@cindex differences between @code{gawk} and @code{awk}
+You can delete all the elements of an array with a single statement,
+by leaving off the subscript in the @code{delete} statement.
+
+@example
+delete @var{array}
+@end example
+
+This ability is a @code{gawk} extension; it is not available in
+compatibility mode (@pxref{Options, ,Command Line Options}).
+
+Using this version of the @code{delete} statement is about three times
+more efficient than the equivalent loop that deletes each element one
+at a time.
+
+@cindex portability issues
+The following statement provides a portable, but non-obvious way to clear
+out an array.
+
+@cindex Brennan, Michael
+@example
+@group
+# thanks to Michael Brennan for pointing this out
+split("", array)
+@end group
+@end example
+
+The @code{split} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation})
+clears out the target array first. This call asks it to split
+apart the null string. Since there is no data to split out, the
+function simply clears the array and then returns.
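+
+As a quick check of this idiom, the following sketch fills an array,
+clears it with @code{split}, and then counts whatever is left.  The
+names @code{a}, @code{k} and @code{n} are arbitrary:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        a[1] = "x"; a["y"] = 2
+>        split("", a)             # clear the array
+>        n = 0
+>        for (k in a)
+>            n++
+>        print n, "elements remain"
+>      @}'
+@print{} 0 elements remain
+@end group
+@end example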
+
+@node Numeric Array Subscripts, Uninitialized Subscripts, Delete, Arrays
+@section Using Numbers to Subscript Arrays
+
+An important aspect of arrays to remember is that @emph{array subscripts
+are always strings}.  If you use a numeric value as a subscript,
+it will be converted to a string value before it is used for subscripting
+(@pxref{Conversion, ,Conversion of Strings and Numbers}).
+
+@cindex conversions, during subscripting
+@cindex numbers, used as subscripts
+@vindex CONVFMT
+This means that the value of the built-in variable @code{CONVFMT} can potentially
+affect how your program accesses elements of an array.  For example:
+
+@example
+xyz = 12.153
+data[xyz] = 1
+CONVFMT = "%2.2f"
+@group
+if (xyz in data)
+    printf "%s is in data\n", xyz
+else
+    printf "%s is not in data\n", xyz
+@end group
+@end example
+
+@noindent
+This prints @samp{12.15 is not in data}.  The first statement gives
+@code{xyz} a numeric value.  Assigning to
+@code{data[xyz]} subscripts @code{data} with the string value @code{"12.153"}
+(using the default conversion value of @code{CONVFMT}, @code{"%.6g"}),
+and assigns one to @code{data["12.153"]}.  The program then changes
+the value of @code{CONVFMT}.  The test @samp{(xyz in data)} generates a new
+string value from @code{xyz}, this time @code{"12.15"}, since the value of
+@code{CONVFMT} only allows two digits after the decimal point.  This test fails,
+since @code{"12.15"} is a different string from @code{"12.153"}.
+
+According to the rules for conversions
+(@pxref{Conversion, ,Conversion of Strings and Numbers}), integer
+values are always converted to strings as integers, no matter what the
+value of @code{CONVFMT} may happen to be.  So the usual case of:
+
+@example
+for (i = 1; i <= maxsub; i++)
+    @i{do something with} array[i]
+@end example
+
+@noindent
+will work, no matter what the value of @code{CONVFMT}.
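+
+The two conversion rules can be seen side by side in the following
+sketch, which assumes an @code{awk} that follows the conversion rules
+just described.  The variable names @code{i} and @code{x} are arbitrary;
+@code{i} holds an integer value and @code{x} does not:
+
+@example
+@group
+$ awk 'BEGIN @{
+>        CONVFMT = "%2.2f"
+>        i = 12; x = 12.153
+>        data[i] = 1        # integer value: the subscript is "12"
+>        data[x] = 1        # non-integer: the subscript is "12.15"
+>        if ("12" in data)     print "12 is a subscript"
+>        if ("12.15" in data)  print "12.15 is a subscript"
+>        if ("12.153" in data) print "12.153 is a subscript"
+>      @}'
+@print{} 12 is a subscript
+@print{} 12.15 is a subscript
+@end group
+@end example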
+
+As with many things in @code{awk}, the majority of the time things work
+as you would expect them to.  But it is useful to have a precise
+knowledge of the actual rules, since sometimes they can have a subtle
+effect on your programs.
+
+@node Uninitialized Subscripts, Multi-dimensional, Numeric Array Subscripts, Arrays
+@section Using Uninitialized Variables as Subscripts
+
+@cindex uninitialized variables, as array subscripts
+@cindex array subscripts, uninitialized variables
+Suppose you want to print your input data in reverse order.
+A reasonable attempt at a program to do so (with some test
+data) might look like this:
+
+@example
+$ echo 'line 1
+> line 2
+> line 3' | awk '@{ l[lines] = $0; ++lines @}
+> END @{
+>     for (i = lines-1; i >= 0; --i)
+>        print l[i]
+> @}'
+@print{} line 3
+@print{} line 2
+@end example
+
+Unfortunately, the very first line of input data did not come out in the
+output!
+
+At first glance, this program should have worked.  The variable @code{lines}
+is uninitialized, and uninitialized variables have the numeric value zero.
+So, the value of @code{l[0]} should have been printed.
+
+The issue here is that subscripts for @code{awk} arrays are @strong{always}
+strings. And uninitialized variables, when used as strings, have the
+value @code{""}, not zero.  Thus, @samp{line 1} ended up stored in
+@code{l[""]}.
+
+The following version of the program works correctly:
+
+@example
+@{ l[lines++] = $0 @}
+END @{
+    for (i = lines - 1; i >= 0; --i)
+       print l[i]
+@}
+@end example
+
+Here, the @samp{++} forces @code{lines} to be numeric, thus making
+the ``old value'' numeric zero, which is then converted to @code{"0"}
+as the array subscript.
+
+@cindex null string, as array subscript
+@cindex dark corner
+As we have just seen, even though it is somewhat unusual, the null string
+(@code{""}) is a valid array subscript (d.c.). If @samp{--lint} is provided
+on the command line (@pxref{Options, ,Command Line Options}),
+@code{gawk} will warn about the use of the null string as a subscript.
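+
+To confirm where the first line went in the faulty program above, the
+following sketch feeds it a single record and then tests both candidate
+subscripts with the @code{in} operator:
+
+@example
+@group
+$ echo 'line 1' | awk '@{ l[lines] = $0; ++lines @}
+> END @{
+>     if ("" in l)
+>         print "stored under the null subscript:", l[""]
+>     if (! (0 in l))
+>         print "nothing stored under subscript 0"
+> @}'
+@print{} stored under the null subscript: line 1
+@print{} nothing stored under subscript 0
+@end group
+@end example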
+
+@node Multi-dimensional, Multi-scanning, Uninitialized Subscripts, Arrays
+@section Multi-dimensional Arrays
+
+@cindex subscripts in arrays
+@cindex arrays, multi-dimensional subscripts
+@cindex multi-dimensional subscripts
+A multi-dimensional array is an array in which an element is identified
+by a sequence of indices, instead of a single index.  For example, a
+two-dimensional array requires two indices.  The usual way (in most
+languages, including @code{awk}) to refer to an element of a
+two-dimensional array named @code{grid} is with
+@code{grid[@var{x},@var{y}]}.
+
+@vindex SUBSEP
+Multi-dimensional arrays are supported in @code{awk} through
+concatenation of indices into one string.  What happens is that
+@code{awk} converts the indices into strings
+(@pxref{Conversion, ,Conversion of Strings and Numbers}) and
+concatenates them together, with a separator between them.  This creates
+a single string that describes the values of the separate indices.  The
+combined string is used as a single index into an ordinary,
+one-dimensional array.  The separator used is the value of the built-in
+variable @code{SUBSEP}.
+
+For example, suppose we evaluate the expression @samp{foo[5,12] = "value"}
+when the value of @code{SUBSEP} is @code{"@@"}.  The numbers five and 12 are
+converted to strings and
+concatenated with an @samp{@@} between them, yielding @code{"5@@12"}; thus,
+the array element @code{foo["5@@12"]} is set to @code{"value"}.
+
+Once the element's value is stored, @code{awk} has no record of whether
+it was stored with a single index or a sequence of indices.  The two
+expressions @samp{foo[5,12]} and @w{@samp{foo[5 SUBSEP 12]}} are always
+equivalent.
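+
+The equivalence can be checked directly.  The following sketch (an
+illustration, with @code{SUBSEP} set to @code{"@@"} only so that the
+combined index is printable) stores one element and then finds it
+both ways:
+
+@example
+$ awk 'BEGIN @{
+>     SUBSEP = "@@"
+>     foo[5,12] = "value"
+>     for (k in foo)
+>         print k, foo[k]
+>     if ((5 SUBSEP 12) in foo)
+>         print "foo[5 SUBSEP 12] is the same element"
+> @}'
+@print{} 5@@12 value
+@print{} foo[5 SUBSEP 12] is the same element
+@end example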
+
+The default value of @code{SUBSEP} is the string @code{"\034"},
+which contains a non-printing character that is unlikely to appear in an
+@code{awk} program or in most input data.
+
+The usefulness of choosing an unlikely character comes from the fact
+that index values that contain a string matching @code{SUBSEP} lead to
+combined strings that are ambiguous.  Suppose that @code{SUBSEP} were
+@code{"@@"}; then @w{@samp{foo["a@@b", "c"]}} and @w{@samp{foo["a",
+"b@@c"]}} would be indistinguishable because both would actually be
+stored as @samp{foo["a@@b@@c"]}.
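+
+A short sketch makes the collision visible (again assuming that
+@code{SUBSEP} has been set to @code{"@@"} purely for illustration).
+The second assignment overwrites the first, so only one element exists:
+
+@example
+$ awk 'BEGIN @{
+>     SUBSEP = "@@"
+>     foo["a@@b", "c"] = "first"
+>     foo["a", "b@@c"] = "second"
+>     n = 0
+>     for (k in foo)
+>         n++
+>     print n, foo["a@@b@@c"]
+> @}'
+@print{} 1 second
+@end example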
+
+You can test whether a particular index-sequence exists in a
+``multi-dimensional'' array with the same operator @samp{in} used for
+one-dimensional arrays.  Instead of a single index as the left-hand operand,
+write the whole sequence of indices, separated by commas, in
+parentheses:
+
+@example
+(@var{subscript1}, @var{subscript2}, @dots{}) in @var{array}
+@end example
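+
+For instance, in this small sketch (the array @code{grid} and its
+contents are invented for illustration), only the pair that was
+actually stored is reported as present:
+
+@example
+$ awk 'BEGIN @{
+>     grid[1, 2] = "x"
+>     if ((1, 2) in grid)
+>         print "(1, 2) is present"
+>     if (! ((2, 1) in grid))
+>         print "(2, 1) is absent"
+> @}'
+@print{} (1, 2) is present
+@print{} (2, 1) is absent
+@end example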
+
+The following example treats its input as a two-dimensional array of
+fields; it rotates this array 90 degrees clockwise and prints the
+result.  It assumes that all lines have the same number of
+elements.
+
+@example
+@group
+awk '@{
+     if (max_nf < NF)
+          max_nf = NF
+     max_nr = NR
+     for (x = 1; x <= NF; x++)
+          vector[x, NR] = $x
+@}
+@end group
+
+@group
+END @{
+     for (x = 1; x <= max_nf; x++) @{
+          for (y = max_nr; y >= 1; --y)
+               printf("%s ", vector[x, y])
+          printf("\n")
+     @}
+@}'
+@end group
+@end example
+
+@noindent
+When given the input:
+
+@example
+@group
+1 2 3 4 5 6
+2 3 4 5 6 1
+3 4 5 6 1 2
+4 5 6 1 2 3
+@end group
+@end example
+
+@noindent
+it produces:
+
+@example
+@group
+4 3 2 1
+5 4 3 2
+6 5 4 3
+1 6 5 4
+2 1 6 5
+3 2 1 6
+@end group
+@end example
+
+@node Multi-scanning,  , Multi-dimensional, Arrays
+@section Scanning Multi-dimensional Arrays
+
+There is no special @code{for} statement for scanning a
+``multi-dimensional'' array; there cannot be one, because in truth there
+are no multi-dimensional arrays or elements; there is only a
+multi-dimensional @emph{way of accessing} an array.
+
+However, if your program has an array that is always accessed as
+multi-dimensional, you can get the effect of scanning it by combining
+the scanning @code{for} statement
+(@pxref{Scanning an Array, ,Scanning All Elements of an Array}) with the
+@code{split} built-in function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+It works like this:
+
+@example
+for (combined in array) @{
+  split(combined, separate, SUBSEP)
+  @dots{}
+@}
+@end example
+
+@noindent
+This sets @code{combined} to
+each concatenated, combined index in the array, and splits it
+into the individual indices by breaking it apart where the value of
+@code{SUBSEP} appears.  The split-out indices become the elements of
+the array @code{separate}.
+
+Thus, suppose you have previously stored a value in @code{array[1, "foo"]};
+then an element with index @code{"1\034foo"} exists in
+@code{array}.  (Recall that the default value of @code{SUBSEP} is
+the character with octal code 034.)  Sooner or later the @code{for} statement
+will find that index and do an iteration with @code{combined} set to
+@code{"1\034foo"}.  Then the @code{split} function is called as
+follows:
+
+@example
+split("1\034foo", separate, "\034")
+@end example
+
+@noindent
+The result of this is to set @code{separate[1]} to @code{"1"} and
+@code{separate[2]} to @code{"foo"}.  Presto, the original sequence of
+separate indices has been recovered.
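+
+Putting the pieces together, a complete run might look like the
+following sketch.  The stored value @code{"bar"} is an arbitrary
+choice for illustration; @code{split} returns the number of indices it
+recovered:
+
+@example
+$ awk 'BEGIN @{
+>     array[1, "foo"] = "bar"
+>     for (combined in array) @{
+>         n = split(combined, separate, SUBSEP)
+>         print n, separate[1], separate[2], array[combined]
+>     @}
+> @}'
+@print{} 2 1 foo bar
+@end example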
+
+@node Built-in, User-defined, Arrays, Top
+@chapter Built-in Functions
+
+@c 2e: USE TEXINFO-2 FUNCTION DEFINITION STUFF!!!!!!!!!!!!!
+@cindex built-in functions
+@dfn{Built-in} functions are functions that are always available for
+your @code{awk} program to call.  This chapter defines all the built-in
+functions in @code{awk}; some of them are mentioned in other sections,
+but they are summarized here for your convenience.  (You can also define
+new functions yourself.  @xref{User-defined, ,User-defined Functions}.)
+
+@menu
+* Calling Built-in::            How to call built-in functions.
+* Numeric Functions::           Functions that work with numbers, including
+                                @code{int}, @code{sin} and @code{rand}.
+* String Functions::            Functions for string manipulation, such as
+                                @code{split}, @code{match}, and
+                                @code{sprintf}.
+* I/O Functions::               Functions for files and shell commands.
+* Time Functions::              Functions for dealing with time stamps.
+@end menu
+
+@node Calling Built-in, Numeric Functions, Built-in, Built-in
+@section Calling Built-in Functions
+
+To call a built-in function, write the name of the function followed
+by arguments in parentheses.  For example, @samp{atan2(y + z, 1)}
+is a call to the function @code{atan2}, with two arguments.
+
+Whitespace is ignored between the built-in function name and the
+open-parenthesis, but we recommend that you avoid using whitespace
+there.  User-defined functions do not permit whitespace in this way, and
+you will find it easier to avoid mistakes by following a simple
+convention which always works: no whitespace after a function name.
+
+@cindex differences between @code{gawk} and @code{awk}
+Each built-in function accepts a certain number of arguments.
+In some cases, arguments can be omitted. The defaults for omitted
+arguments vary from function to function and are described under the
+individual functions.  In some @code{awk} implementations, extra
+arguments given to built-in functions are ignored.  However, in @code{gawk},
+it is a fatal error to give extra arguments to a built-in function.
+
+When a function is called, expressions that create the function's actual
+parameters are evaluated completely before the function call is performed.
+For example, in the code fragment:
+
+@example
+i = 4
+j = sqrt(i++)
+@end example
+
+@noindent
+the variable @code{i} is set to five before @code{sqrt} is called
+with a value of four for its actual parameter.
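+
+A quick way to confirm this is to print both variables afterwards
+(a minimal sketch):
+
+@example
+$ awk 'BEGIN @{ i = 4; j = sqrt(i++); print i, j @}'
+@print{} 5 2
+@end example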
+
+@cindex evaluation, order of
+@cindex order of evaluation
+The order of evaluation of the expressions used for the function's
+parameters is undefined.  Thus, you should not write programs that
+assume that parameters are evaluated from left to right or from
+right to left.  For example,
+
+@example
+i = 5
+j = atan2(i++, i *= 2)
+@end example
+
+If the order of evaluation is left to right, then @samp{i++} is evaluated
+first: it yields five and leaves @code{i} set to six.  Then @samp{i *= 2}
+sets @code{i} to 12, and @code{atan2} is called with the two arguments five
+and 12.  But if the order of evaluation is right to left, @samp{i *= 2}
+first sets @code{i} to 10; then @samp{i++} yields 10 and leaves @code{i}
+set to 11, and @code{atan2} is called with the two arguments 10 and 10.
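+
+One way to avoid depending on the order of evaluation is to perform
+each side effect in its own statement before the call.  The sketch
+below assumes the intent was to pass the original value of @code{i}
+as the first argument and the incremented, doubled value as the second;
+the temporary variable @code{y} is introduced only for illustration:
+
+@example
+i = 5
+y = i++             # y is 5, i is now 6
+i *= 2              # i is now 12
+j = atan2(y, i)     # arguments are unambiguously 5 and 12
+@end example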
+
+@node Numeric Functions, String Functions, Calling Built-in, Built-in
+@section Numeric Built-in Functions
+
+Here is a full list of built-in functions that work with numbers.
+Optional parameters are enclosed in square brackets (``['' and ``]'').
+
+@table @code
+@item int(@var{x})
+@findex int
+This returns the integer part of @var{x}, truncating toward zero: the
+result is the nearest integer to @var{x} that lies between @var{x} and zero.
+
+For example, @code{int(3)} is three, @code{int(3.9)} is three, @code{int(-3.9)}
+is @minus{}3, and @code{int(-3)} is @minus{}3 as well.
+
+@item sqrt(@var{x})
+@findex sqrt
+This gives you the positive square root of @var{x}.  It reports an error
+if @var{x} is negative.  Thus, @code{sqrt(4)} is two.
+
+@item exp(@var{x})
+@findex exp
+This gives you the exponential of @var{x} (@code{e ^ @var{x}}), or reports
+an error if @var{x} is out of range.  The range of values @var{x} can have
+depends on your machine's floating point representation.
+
+@item log(@var{x})
+@findex log
+This gives you the natural logarithm of @var{x}, if @var{x} is positive;
+otherwise, it reports an error.
+
+@item sin(@var{x})
+@findex sin
+This gives you the sine of @var{x}, with @var{x} in radians.
+
+@item cos(@var{x})
+@findex cos
+This gives you the cosine of @var{x}, with @var{x} in radians.
+
+@item atan2(@var{y}, @var{x})
+@findex atan2
+This gives you the arctangent of @code{@var{y} / @var{x}} in radians.
+
+@item rand()
+@findex rand
+This gives you a random number.  The values of @code{rand} are
+uniformly-distributed between zero and one.
+The value is never zero and never one.
+
+Often you want random integers instead.  Here is a user-defined function
+you can use to obtain a random non-negative integer less than @var{n}:
+
+@example
+function randint(n) @{
+     return int(n * rand())
+@}
+@end example
+
+@noindent
+The multiplication produces a random real number greater than zero and less
+than @code{n}.  We then make it an integer (using @code{int}) between zero
+and @code{n} @minus{} 1, inclusive.
+
+Here is an example where a similar function is used to produce
+random integers between one and @var{n}.  This program
+prints a new random number for each input record.
+
+@example
+@group
+awk '
+# Function to roll a simulated die.
+function roll(n) @{ return 1 + int(rand() * n) @}
+@end group
+
+@group
+# Roll 3 six-sided dice and
+# print total number of points.
+@{
+      printf("%d points\n",
+             roll(6)+roll(6)+roll(6))
+@}'
+@end group
+@end example
+
+@cindex seed for random numbers
+@cindex random numbers, seed of
+@comment MAWK uses a different seed each time.
+@strong{Caution:} In most @code{awk} implementations, including @code{gawk},
+@code{rand} starts generating numbers from the same
+starting number, or @dfn{seed}, each time you run @code{awk}.  Thus,
+a program will generate the same results each time you run it.
+The numbers are random within one @code{awk} run, but predictable
+from run to run.  This is convenient for debugging, but if you want
+a program to do different things each time it is used, you must change
+the seed to a value that will be different in each run.  To do this,
+use @code{srand}.
+
+@item srand(@r{[}@var{x}@r{]})
+@findex srand
+The function @code{srand} sets the starting point, or seed,
+for generating random numbers to the value @var{x}.
+
+Each seed value leads to a particular sequence of random
+numbers.@footnote{Computer-generated random numbers really are not truly
+random.  They are technically known as ``pseudo-random.''  This means
+that while the numbers in a sequence appear to be random, you can in
+fact generate the same sequence of random numbers over and over again.}
+Thus, if you set the seed to the same value a second time, you will get
+the same sequence of random numbers again.
+
+If you omit the argument @var{x}, as in @code{srand()}, then the current
+date and time of day are used for a seed.  This is the way to get random
+numbers that differ from one run to the next.
+
+The return value of @code{srand} is the previous seed.  This makes it
+easy to keep track of the seeds for use in consistently reproducing
+sequences of random numbers.
+@end table
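+
+The following sketch exercises a few of these functions together.
+The seed value 42 is arbitrary, and the variable names are
+illustrative only.  It relies on two points described above: seeding
+twice with the same value restarts the same sequence, and @code{srand}
+returns the previous seed:
+
+@example
+$ awk 'BEGIN @{
+>     print int(-3.9), sqrt(4)
+>     print atan2(0, -1)          # pi, shown using the default OFMT
+>     srand(42); first = rand()
+>     old = srand(42); second = rand()
+>     print old, (first == second ? "same sequence" : "different")
+> @}'
+@print{} -3 2
+@print{} 3.14159
+@print{} 42 same sequence
+@end example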
+
+@node String Functions, I/O Functions, Numeric Functions, Built-in
+@section Built-in Functions for String Manipulation
+
+The functions in this section look at or change the text of one or more
+strings.
+Optional parameters are enclosed in square brackets (``['' and ``]'').
+
+@table @code
+@item index(@var{in}, @var{find})
+@findex index
+This searches the string @var{in} for the first occurrence of the string
+@var{find}, and returns the position in characters where that occurrence
+begins in the string @var{in}.  For example:
+
+@example
+$ awk 'BEGIN @{ print index("peanut", "an") @}'
+@print{} 3
+@end example
+
+@noindent
+If @var{find} is not found, @code{index} returns zero.
+(Remember that string indices in @code{awk} start at one.)
+
+@item length(@r{[}@var{string}@r{]})
+@findex length
+This gives you the number of characters in @var{string}.  If
+@var{string} is a number, the length of the digit string representing
+that number is returned.  For example, @code{length("abcde")} is five.  By
+contrast, @code{length(15 * 35)} works out to three.  How?  Well, 15 * 35 =
+525, and 525 is then converted to the string @code{"525"}, which has
+three characters.
+
+If no argument is supplied, @code{length} returns the length of @code{$0}.
+
+@cindex historical features
+@cindex portability issues
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+In older versions of @code{awk}, you could call the @code{length} function
+without any parentheses.  Doing so is marked as ``deprecated'' in the
+POSIX standard.  This means that while you can do this in your
+programs, it is a feature that can eventually be removed from a future
+version of the standard.  Therefore, for maximal portability of your
+@code{awk} programs, you should always supply the parentheses.
+
+@item match(@var{string}, @var{regexp})
+@findex match
+The @code{match} function searches the string, @var{string}, for the
+longest, leftmost substring matched by the regular expression,
+@var{regexp}.  It returns the character position, or @dfn{index}, of
+where that substring begins (one, if it starts at the beginning of
+@var{string}).  If no match is found, it returns zero.
+
+@vindex RSTART
+@vindex RLENGTH
+The @code{match} function sets the built-in variable @code{RSTART} to
+the index.  It also sets the built-in variable @code{RLENGTH} to the
+length in characters of the matched substring.  If no match is found,
+@code{RSTART} is set to zero, and @code{RLENGTH} to @minus{}1.
+
+For example:
+
+@example
+@group
+@c file eg/misc/findpat.sh
+awk '@{
+       if ($1 == "FIND")
+         regex = $2
+       else @{
+         where = match($0, regex)
+         if (where != 0)
+           print "Match of", regex, "found at", \
+                     where, "in", $0
+       @}
+@}'
+@c endfile
+@end group
+@end example
+
+@noindent
+This program looks for lines that match the regular expression stored in
+the variable @code{regex}.  This regular expression can be changed.  If the
+first word on a line is @samp{FIND}, @code{regex} is changed to be the
+second word on that line.  Therefore, given:
+
+@example
+@c file eg/misc/findpat.data
+FIND ru+n
+My program runs
+but not very quickly
+FIND Melvin
+JF+KM
+This line is property of Reality Engineering Co.
+Melvin was here.
+@c endfile
+@end example
+
+@noindent
+@code{awk} prints:
+
+@example
+Match of ru+n found at 12 in My program runs
+Match of Melvin found at 1 in Melvin was here.
+@end example
+
+@item split(@var{string}, @var{array} @r{[}, @var{fieldsep}@r{]})
+@findex split
+This divides @var{string} into pieces separated by @var{fieldsep},
+and stores the pieces in @var{array}.  The first piece is stored in
+@code{@var{array}[1]}, the second piece in @code{@var{array}[2]}, and so
+forth.  The string value of the third argument, @var{fieldsep}, is
+a regexp describing where to split @var{string} (much as @code{FS} can
+be a regexp describing where to split input records).  If
+the @var{fieldsep} is omitted, the value of @code{FS} is used.
+@code{split} returns the number of elements created.
+
+The @code{split} function splits strings into pieces in a
+manner similar to the way input lines are split into fields.  For example:
+
+@example
+split("cul-de-sac", a, "-")
+@end example
+
+@noindent
+splits the string @samp{cul-de-sac} into three fields using @samp{-} as the
+separator.  It sets the contents of the array @code{a} as follows:
+
+@example
+a[1] = "cul"
+a[2] = "de"
+a[3] = "sac"
+@end example
+
+@noindent
+The value returned by this call to @code{split} is three.
+
+As with input field-splitting, when the value of @var{fieldsep} is
+@w{@code{" "}}, leading and trailing whitespace is ignored, and the elements
+are separated by runs of whitespace.
+
+@cindex differences between @code{gawk} and @code{awk}
+Also as with input field-splitting, if @var{fieldsep} is the null string, each
+individual character in the string is split into its own array element.
+(This is a @code{gawk}-specific extension.)
+
+@cindex dark corner
+Recent implementations of @code{awk}, including @code{gawk}, allow
+the third argument to be a regexp constant (@code{/abc/}), as well as a
+string (d.c.).  The POSIX standard allows this as well.
+
+Before splitting the string, @code{split} deletes any previously existing
+elements in the array @var{array} (d.c.).
+
+@item sprintf(@var{format}, @var{expression1},@dots{})
+@findex sprintf
+This returns (without printing) the string that @code{printf} would
+have printed out with the same arguments
+(@pxref{Printf, ,Using @code{printf} Statements for Fancier Printing}).
+For example:
+
+@example
+sprintf("pi = %.2f (approx.)", 22/7)
+@end example
+
+@noindent
+returns the string @w{@code{"pi = 3.14 (approx.)"}}.
+
+@ignore
+2e: For sub, gsub, and gensub, either here or in the "how much matches"
+    section, we need some explanation that it is possible to match the
+    null string when using closures like *.  E.g.,
+
+         $ echo abc | awk '{ gsub(/m*/, "X"); print }'
+         @print{} XaXbXc
+
+    Although this makes a certain amount of sense, it can be very
+    surprising.
+@end ignore
+
+@item sub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]})
+@findex sub
+The @code{sub} function alters the value of @var{target}.
+It searches this value, which is treated as a string, for the
+leftmost longest substring matched by the regular expression, @var{regexp},
+extending this match as far as possible.  Then the entire string is
+changed by replacing the matched text with @var{replacement}.
+The modified string becomes the new value of @var{target}.
+
+This function is peculiar because @var{target} is not simply
+used to compute a value, and not just any expression will do: it
+must be a variable, field or array element, so that @code{sub} can
+store a modified value there.  If this argument is omitted, then the
+default is to use and alter @code{$0}.
+
+For example:
+
+@example
+str = "water, water, everywhere"
+sub(/at/, "ith", str)
+@end example
+
+@noindent
+sets @code{str} to @w{@code{"wither, water, everywhere"}}, by replacing the
+leftmost, longest occurrence of @samp{at} with @samp{ith}.
+
+The @code{sub} function returns the number of substitutions made (either
+one or zero).
+
+If the special character @samp{&} appears in @var{replacement}, it
+stands for the precise substring that was matched by @var{regexp}.  (If
+the regexp can match more than one string, then this precise substring
+may vary.)  For example:
+
+@example
+awk '@{ sub(/candidate/, "& and his wife"); print @}'
+@end example
+
+@noindent
+changes the first occurrence of @samp{candidate} to @samp{candidate
+and his wife} on each input line.
+
+Here is another example:
+
+@example
+awk 'BEGIN @{
+        str = "daabaaa"
+        sub(/a+/, "c&c", str)
+        print str
+@}'
+@print{} dcaacbaaa
+@end example
+
+@noindent
+This shows how @samp{&} can represent a non-constant string, and also
+illustrates the ``leftmost, longest'' rule in regexp matching
+(@pxref{Leftmost Longest, ,How Much Text Matches?}).
+
+The effect of this special character (@samp{&}) can be turned off by putting a
+backslash before it in the string.  As usual, to insert one backslash in
+the string, you must write two backslashes.  Therefore, write @samp{\\&}
+in a string constant to include a literal @samp{&} in the replacement.
+For example, here is how to replace the first @samp{|} on each line with
+an @samp{&}:
+
+@example
+awk '@{ sub(/\|/, "\\&"); print @}'
+@end example
+
+@strong{Note:} As mentioned above, the third argument to @code{sub} must
+be a variable, field or array reference.
+Some versions of @code{awk} allow the third argument to
+be an expression which is not an lvalue.  In such a case, @code{sub}
+would still search for the pattern and return zero or one, but the result of
+the substitution (if any) would be thrown away because there is no place
+to put it.  Such versions of @code{awk} accept expressions like
+this:
+
+@example
+sub(/USA/, "United States", "the USA and Canada")
+@end example
+
+@noindent
+This is considered erroneous in @code{gawk}.
+
+@item gsub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]})
+@findex gsub
+This is similar to the @code{sub} function, except @code{gsub} replaces
+@emph{all} of the longest, leftmost, @emph{non-overlapping} matching
+substrings it can find.  The @samp{g} in @code{gsub} stands for
+``global,'' which means replace everywhere.  For example:
+
+@example
+awk '@{ gsub(/Britain/, "United Kingdom"); print @}'
+@end example
+
+@noindent
+replaces all occurrences of the string @samp{Britain} with @samp{United
+Kingdom} for all input records.
+
+The @code{gsub} function returns the number of substitutions made.  If
+the variable to be searched and altered, @var{target}, is
+omitted, then the entire input record, @code{$0}, is used.
+
+As in @code{sub}, the characters @samp{&} and @samp{\} are special,
+and the third argument must be an lvalue.
+@end table
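+
+As a short illustration of several points made above, the sketch
+below (with made-up strings) shows @code{split} with @w{@code{" "}} as
+the separator, @code{split} with a regexp constant, and the count
+returned by @code{gsub}:
+
+@example
+$ awk 'BEGIN @{
+>     n = split("  the quick   fox ", w, " ")
+>     print n, w[1], w[n]
+>     n = split("a1b22c", w, /[0-9]+/)
+>     print n, w[1], w[2], w[3]
+>     str = "water, water, everywhere"
+>     n = gsub(/at/, "ith", str)
+>     print n, str
+> @}'
+@print{} 3 the fox
+@print{} 3 a b c
+@print{} 2 wither, wither, everywhere
+@end example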
+
+@table @code
+@item gensub(@var{regexp}, @var{replacement}, @var{how} @r{[}, @var{target}@r{]})
+@findex gensub
+@code{gensub} is a general substitution function.  Like @code{sub} and
+@code{gsub}, it searches the target string @var{target} for matches of
+the regular expression @var{regexp}.  Unlike @code{sub} and
+@code{gsub}, the modified string is returned as the result of the
+function, and the original target string is @emph{not} changed.  If
+@var{how} is a string beginning with @samp{g} or @samp{G}, then it
+replaces all matches of @var{regexp} with @var{replacement}.
+Otherwise, @var{how} is a number indicating which match of @var{regexp}
+to replace. If no @var{target} is supplied, @code{$0} is used instead.
+
+@code{gensub} provides an additional feature that is not available
+in @code{sub} or @code{gsub}: the ability to specify components of
+a regexp in the replacement text.  This is done by using parentheses
+in the regexp to mark the components, and then specifying @samp{\@var{n}}
+in the replacement text, where @var{n} is a digit from one to nine.
+For example:
+
+@example
+@group
+$ gawk '
+> BEGIN @{
+>      a = "abc def"
+>      b = gensub(/(.+) (.+)/, "\\2 \\1", "g", a)
+>      print b
+> @}'
+@print{} def abc
+@end group
+@end example
+
+@noindent
+As described above for @code{sub}, you must type two backslashes in order
+to get one into the string.
+
+In the replacement text, the sequence @samp{\0} represents the entire
+matched text, as does the character @samp{&}.
+
+This example shows how you can use the third argument to control
+which match of the regexp should be changed.
+
+@example
+$ echo a b c a b c |
+> gawk '@{ print gensub(/a/, "AA", 2) @}'
+@print{} a b c AA b c
+@end example
+
+In this case, @code{$0} is used as the default target string.
+@code{gensub} returns the new string as its result, which is
+passed directly to @code{print} for printing.
+
+If the @var{how} argument is a string that does not begin with @samp{g} or
+@samp{G}, or if it is a number that is less than zero, only one
+substitution is performed.
+
+@cindex differences between @code{gawk} and @code{awk}
+@code{gensub} is a @code{gawk} extension; it is not available
+in compatibility mode (@pxref{Options, ,Command Line Options}).
+
+@item substr(@var{string}, @var{start} @r{[}, @var{length}@r{]})
+@findex substr
+This returns a @var{length}-character-long substring of @var{string},
+starting at character number @var{start}.  The first character of a
+string is character number one.  For example,
+@code{substr("washington", 5, 3)} returns @code{"ing"}.
+
+If @var{length} is not present, this function returns the whole suffix of
+@var{string} that begins at character number @var{start}.  For example,
+@code{substr("washington", 5)} returns @code{"ington"}.  The whole
+suffix is also returned
+if @var{length} is greater than the number of characters remaining
+in the string, counting from character number @var{start}.
+
+@cindex case conversion
+@cindex conversion of case
+@item tolower(@var{string})
+@findex tolower
+This returns a copy of @var{string}, with each upper-case character
+in the string replaced with its corresponding lower-case character.
+Non-alphabetic characters are left unchanged.  For example,
+@code{tolower("MiXeD cAsE 123")} returns @code{"mixed case 123"}.
+
+@item toupper(@var{string})
+@findex toupper
+This returns a copy of @var{string}, with each lower-case character
+in the string replaced with its corresponding upper-case character.
+Non-alphabetic characters are left unchanged.  For example,
+@code{toupper("MiXeD cAsE 123")} returns @code{"MIXED CASE 123"}.
+@end table
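+
+These functions are often combined.  For example, the following
+sketch (the input string and regexp are invented for illustration)
+uses @code{RSTART} and @code{RLENGTH}, as set by @code{match}, together
+with @code{substr} to extract the matched text, and then converts it
+to upper case:
+
+@example
+$ awk 'BEGIN @{
+>     s = "My program runs"
+>     if (match(s, /ru+n/)) @{
+>         found = substr(s, RSTART, RLENGTH)
+>         print RSTART, RLENGTH, found, toupper(found)
+>     @}
+> @}'
+@print{} 12 3 run RUN
+@end example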
+
+@c fakenode --- for prepinfo
+@subheading More About @samp{\} and @samp{&} with @code{sub}, @code{gsub} and @code{gensub}
+
+@cindex escape processing, @code{sub} et al.
+When using @code{sub}, @code{gsub} or @code{gensub}, and trying to get literal
+backslashes and ampersands into the replacement text, you need to remember
+that there are several levels of @dfn{escape processing} going on.
+
+First, there is the @dfn{lexical} level, which is when @code{awk} reads
+your program, and builds an internal copy of your program that can
+be executed.
+
+Then there is the run-time level, when @code{awk} actually scans the
+replacement string to determine what to generate.
+
+At both levels, @code{awk} looks for a defined set of characters that
+can come after a backslash.  At the lexical level, it looks for the
+escape sequences listed in @ref{Escape Sequences}.
+Thus, for every @samp{\} that @code{awk} will process at the run-time
+level, you type two @samp{\}s at the lexical level.
+When a character that is not valid for an escape sequence follows the
+@samp{\}, Unix @code{awk} and @code{gawk} both simply remove the initial
+@samp{\}, and put the following character into the string. Thus, for
+example, @code{"a\qb"} is treated as @code{"aqb"}.
+
+At the run-time level, the various functions handle sequences of
+@samp{\} and @samp{&} differently.  The situation is (sadly) somewhat complex.
+
+Historically, the @code{sub} and @code{gsub} functions treated the two
+character sequence @samp{\&} specially; this sequence was replaced in
+the generated text with a single @samp{&}.  Any other @samp{\} within
+the @var{replacement} string that did not precede an @samp{&} was passed
+through unchanged.  To illustrate with a table:
+
+@c Thanks to Karl Berry for help with the TeX stuff.
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{sub} sees!@code{sub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+   @code{\&}!       @code{&}!the matched text@cr
+  @code{\\&}!      @code{\&}!a literal @samp{&}@cr
+ @code{\\\&}!      @code{\&}!a literal @samp{&}@cr
+@code{\\\\&}!     @code{\\&}!a literal @samp{\&}@cr
+@code{\\\\\&}!     @code{\\&}!a literal @samp{\&}@cr
+@code{\\\\\\&}!     @code{\\\&}!a literal @samp{\\&}@cr
+  @code{\\q}!      @code{\q}!a literal @samp{\q}@cr
+}
+@bigskip}
+@end tex
+@ifinfo
+@display
+ You type         @code{sub} sees          @code{sub} generates
+ --------         ----------          ---------------
+     @code{\&}              @code{&}            the matched text
+    @code{\\&}             @code{\&}            a literal @samp{&}
+   @code{\\\&}             @code{\&}            a literal @samp{&}
+  @code{\\\\&}            @code{\\&}            a literal @samp{\&}
+ @code{\\\\\&}            @code{\\&}            a literal @samp{\&}
+@code{\\\\\\&}           @code{\\\&}            a literal @samp{\\&}
+    @code{\\q}             @code{\q}            a literal @samp{\q}
+@end display
+@end ifinfo
+
+@noindent
+This table shows both the lexical level processing, where
+an odd number of backslashes is reduced to the same run-time sequence as
+the next lower even number, and the run-time processing done by @code{sub}.
+(For the sake of simplicity, the rest of the tables below only show the
+case of even numbers of @samp{\}s entered at the lexical level.)
+
+The problem with the historical approach is that there is no way to get
+a literal @samp{\} followed by the matched text.
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+The 1992 POSIX standard attempted to fix this problem. The standard
+says that @code{sub} and @code{gsub} look for either a @samp{\} or an @samp{&}
+after the @samp{\}. If either one follows a @samp{\}, that character is
+output literally.  The interpretation of @samp{\} and @samp{&} then becomes
+like this:
+
+@c thanks to Karl Berry for formatting this table
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{sub} sees!@code{sub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+    @code{&}!       @code{&}!the matched text@cr
+  @code{\\&}!      @code{\&}!a literal @samp{&}@cr
+@code{\\\\&}!     @code{\\&}!a literal @samp{\}, then the matched text@cr
+@code{\\\\\\&}!  @code{\\\&}!a literal @samp{\&}@cr
+}
+@bigskip}
+@end tex
+@ifinfo
+@display
+ You type         @code{sub} sees          @code{sub} generates
+ --------         ----------          ---------------
+      @code{&}              @code{&}            the matched text
+    @code{\\&}             @code{\&}            a literal @samp{&}
+  @code{\\\\&}            @code{\\&}            a literal @samp{\}, then the matched text
+@code{\\\\\\&}           @code{\\\&}            a literal @samp{\&}
+@end display
+@end ifinfo
+
+@noindent
+This would appear to solve the problem.
+Unfortunately, the phrasing of the standard is unusual. It
+says, in effect, that @samp{\} turns off the special meaning of any
+following character, but that for anything other than @samp{\} and @samp{&},
+such special meaning is undefined.  This wording leads to two problems.
+
+@enumerate
+@item
+Backslashes must now be doubled in the @var{replacement} string, breaking
+historical @code{awk} programs.
+
+@item
+To make sure that an @code{awk} program is portable, @emph{every} character
+in the @var{replacement} string must be preceded with a
+backslash.@footnote{This consequence was certainly unintended.}
+@c I can say that, 'cause I was involved in making this change
+@end enumerate
+
+The POSIX standard is under revision.@footnote{As of December 1995,
+with final approval and publication hopefully sometime in 1996.}
+Because of the above problems, proposed text for the revised standard
+reverts to rules that correspond more closely to the original existing
+practice. The proposed rules have special cases that make it possible
+to produce a @samp{\} preceding the matched text.
+
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{sub} sees!@code{sub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+@code{\\\\\\&}!     @code{\\\&}!a literal @samp{\&}@cr
+@code{\\\\&}!     @code{\\&}!a literal @samp{\}, followed by the matched text@cr
+  @code{\\&}!      @code{\&}!a literal @samp{&}@cr
+  @code{\\q}!      @code{\q}!a literal @samp{\q}@cr
+}
+@bigskip}
+@end tex
+@ifinfo
+@display
+ You type         @code{sub} sees         @code{sub} generates
+ --------         ----------         ---------------
+@code{\\\\\\&}           @code{\\\&}            a literal @samp{\&}
+  @code{\\\\&}            @code{\\&}            a literal @samp{\}, followed by the matched text
+    @code{\\&}             @code{\&}            a literal @samp{&}
+    @code{\\q}             @code{\q}            a literal @samp{\q}
+@end display
+@end ifinfo
+
+In a nutshell, at the run-time level, there are now three special sequences
+of characters, @samp{\\\&}, @samp{\\&} and @samp{\&}, whereas historically,
+there was only one.  However, as in the historical case, any @samp{\} that
+is not part of one of these three sequences is not special, and appears
+in the output literally.
+
+@code{gawk} 3.0 follows these proposed POSIX rules for @code{sub} and
+@code{gsub}.
+@c As much as we think it's a lousy idea. You win some, you lose some. Sigh.
+Whether these proposed rules will actually become codified into the
+standard is unknown at this point. Subsequent @code{gawk} releases will
+track the standard and implement whatever the final version specifies;
+this @value{DOCUMENT} will be updated as well.
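+
+As a small sketch of the first rows of the tables above (the variable
+name @code{str} and the sample text are only for illustration), the
+following program contrasts a plain @samp{&} with an escaped one:
+
+@example
+BEGIN @{
+    str = "cat"
+    sub(/a/, "[&]", str)       # sub sees [&]
+    print str                  # prints c[a]t
+
+    str = "cat"
+    sub(/a/, "[\\&]", str)     # sub sees [\&]
+    print str                  # prints c[&]t
+@}
+@end example
+
+@noindent
+Only these two cases are shown because they behave the same way under
+the historical rules and the proposed ones.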
+
+The rules for @code{gensub} are considerably simpler. At the run-time
+level, whenever @code{gawk} sees a @samp{\}, if the following character
+is a digit, then the text that matched the corresponding parenthesized
+subexpression is placed in the generated output.  Otherwise,
+no matter what the character after the @samp{\} is, that character will
+appear in the generated text, and the @samp{\} will not.
+
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{gensub} sees!@code{gensub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+      @code{&}!           @code{&}!the matched text@cr
+    @code{\\&}!          @code{\&}!a literal @samp{&}@cr
+   @code{\\\\}!          @code{\\}!a literal @samp{\}@cr
+  @code{\\\\&}!         @code{\\&}!a literal @samp{\}, then the matched text@cr
+@code{\\\\\\&}!        @code{\\\&}!a literal @samp{\&}@cr
+    @code{\\q}!          @code{\q}!a literal @samp{q}@cr
+}
+@bigskip}
+@end tex
+@ifinfo
+@display
+  You type          @code{gensub} sees         @code{gensub} generates
+  --------          -------------         ------------------
+      @code{&}                    @code{&}            the matched text
+    @code{\\&}                   @code{\&}            a literal @samp{&}
+   @code{\\\\}                   @code{\\}            a literal @samp{\}
+  @code{\\\\&}                  @code{\\&}            a literal @samp{\}, then the matched text
+@code{\\\\\\&}                 @code{\\\&}            a literal @samp{\&}
+    @code{\\q}                   @code{\q}            a literal @samp{q}
+@end display
+@end ifinfo
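+
+The rules in this table can be tried out with a short program.  This is
+only a sketch; the sample string and the variable name @code{s} are
+assumptions made for the illustration:
+
+@example
+BEGIN @{
+    s = "aabbb"
+    # \\2 and \\1 reach gensub as \2 and \1: swap the two groups
+    print gensub(/(a+)(b+)/, "<\\2\\1>", "g", s)    # prints <bbbaa>
+    # \\& reaches gensub as \&: a literal ampersand
+    print gensub(/b+/, "\\&", "g", s)               # prints aa&
+    # gensub returns its result; the original string is unchanged
+    print s                                         # prints aabbb
+@}
+@end example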
+
+Because of the complexity of the lexical and run-time level processing,
+and the special cases for @code{sub} and @code{gsub},
+we recommend using @code{gawk} and @code{gensub} when you have
+to do substitutions.
+
+@node I/O Functions, Time Functions, String Functions, Built-in
+@section Built-in Functions for Input/Output
+
+The following functions are related to Input/Output (I/O).
+Optional parameters are enclosed in square brackets (``['' and ``]'').
+
+@table @code
+@item close(@var{filename})
+@findex close
+Close the file @var{filename}, for input or output.  The argument may
+alternatively be a shell command that was used for redirecting to or
+from a pipe; then the pipe is closed.
+@xref{Close Files And Pipes, ,Closing Input and Output Files and Pipes},
+for more information.
+
+@item fflush(@r{[}@var{filename}@r{]})
+@findex fflush
+@cindex portability issues
+@cindex flushing buffers
+@cindex buffers, flushing
+@cindex buffering output
+@cindex output, buffering
+Flush any buffered output associated with @var{filename}, which is either a
+file opened for writing, or a shell command for redirecting output to
+a pipe.
+
+Many utility programs will @dfn{buffer} their output; they save information
+to be written to a disk file or terminal in memory, until there is enough
+for it to be worthwhile to send the data to the output device.
+This is often more efficient than writing
+every little bit of information as soon as it is ready.  However, sometimes
+it is necessary to force a program to @dfn{flush} its buffers; that is,
+write the information to its destination, even if a buffer is not full.
+This is the purpose of the @code{fflush} function; @code{gawk} too
+buffers its output, and the @code{fflush} function can be used to force
+@code{gawk} to flush its buffers.
+
+@code{fflush} is a recent (1994) addition to the Bell Labs research
+version of @code{awk}; it is not part of the POSIX standard, and will
+not be available if @samp{--posix} has been specified on the command
+line (@pxref{Options, ,Command Line Options}).
+
+@code{gawk} extends the @code{fflush} function in two ways.  The first
+is to allow no argument at all. In this case, the buffer for the
+standard output is flushed.  The second way is to allow the null string
+(@w{@code{""}}) as the argument. In this case, the buffers for
+@emph{all} open output files and pipes are flushed.
+
+@code{fflush} returns zero if the buffer was successfully flushed,
+and nonzero otherwise.
+
+@item system(@var{command})
+@findex system
+@cindex interaction, @code{awk} and other programs
+The @code{system} function allows the user to execute operating system commands
+and then return to the @code{awk} program.  The @code{system} function
+executes the command given by the string @var{command}.  It returns, as
+its value, the status returned by the command that was executed.
+
+For example, if the following fragment of code is put in your @code{awk}
+program:
+
+@example
+END @{
+     system("date | mail -s 'awk run done' root")
+@}
+@end example
+
+@noindent
+the system administrator will be sent mail when the @code{awk} program
+finishes processing input and begins its end-of-input processing.
+
+Note that redirecting @code{print} or @code{printf} into a pipe is often
+enough to accomplish your task.  However, if your @code{awk}
+program is interactive, @code{system} is useful for cranking up large
+self-contained programs, such as a shell or an editor.
+
+Some operating systems cannot implement the @code{system} function.
+@code{system} causes a fatal error if it is not supported.
+@end table
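+
+As a sketch of how @code{fflush} and the return value of @code{system}
+might be used together, consider the program below.  The file names
+@file{progress.log} and @file{progress.sorted} are hypothetical, and the
+program assumes @code{gawk}, since it relies on @code{fflush} and on the
+@file{/dev/stderr} special file:
+
+@example
+# Copy each record to a log file, flushing as we go so that
+# another program watching the file sees each line promptly.
+@{
+    print NR, $0 > "progress.log"
+    if (fflush("progress.log") != 0)
+        print "could not flush progress.log" > "/dev/stderr"
+@}
+
+END @{
+    close("progress.log")
+    status = system("sort -n progress.log > progress.sorted")
+    if (status != 0)
+        print "sort exited with status " status > "/dev/stderr"
+@}
+@end example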
+
+@c fakenode --- for prepinfo
+@subheading Controlling Output Buffering with @code{system}
+@cindex flushing buffers
+@cindex buffers, flushing
+@cindex buffering output
+@cindex output, buffering
+
+The @code{fflush} function provides explicit control over output buffering for
+individual files and pipes.  However, its use is not portable to many other
+@code{awk} implementations.  An alternative method to flush output
+buffers is to call @code{system} with a null string as its argument:
+
+@example
+system("")   # flush output
+@end example
+
+@noindent
+@code{gawk} treats this use of the @code{system} function as a special
+case, and is smart enough not to run a shell (or other command
+interpreter) with the empty command.  Therefore, with @code{gawk}, this
+idiom is not only useful, it is efficient.  While this method should work
+with other @code{awk} implementations, it will not necessarily avoid
+starting an unnecessary shell.  (Other implementations may only
+flush the buffer associated with the standard output, and not necessarily
+all buffered output.)
+
+If you think about what a programmer expects, it makes sense that
+@code{system} should flush any pending output.  The following program:
+
+@example
+BEGIN @{
+     print "first print"
+     system("echo system echo")
+     print "second print"
+@}
+@end example
+
+@noindent
+must print
+
+@example
+first print
+system echo
+second print
+@end example
+
+@noindent
+and not
+
+@example
+system echo
+first print
+second print
+@end example
+
+If @code{awk} did not flush its buffers before calling @code{system}, the
+latter (undesirable) output is what you would see.
+
+@node Time Functions,  , I/O Functions, Built-in
+@section Functions for Dealing with Time Stamps
+
+@cindex timestamps
+@cindex time of day
+A common use for @code{awk} programs is the processing of log files
+containing time stamp information, indicating when a
+particular log record was written.  Many programs log their time stamp
+in the form returned by the @code{time} system call, which is the
+number of seconds since a particular epoch.  On POSIX systems,
+it is the number of seconds since Midnight, January 1, 1970, UTC.
+
+In order to make it easier to process such log files, and to produce
+useful reports, @code{gawk} provides two functions for working with time
+stamps.  Both of these are @code{gawk} extensions; they are not specified
+in the POSIX standard, nor are they in any other known version
+of @code{awk}.
+
+Optional parameters are enclosed in square brackets (``['' and ``]'').
+
+@table @code
+@item systime()
+@findex systime
+This function returns the current time as the number of seconds since
+the system epoch.  On POSIX systems, this is the number of seconds
+since Midnight, January 1, 1970, UTC.  It may be a different number on
+other systems.
+
+@item strftime(@r{[}@var{format} @r{[}, @var{timestamp}@r{]]})
+@findex strftime
+This function returns a string.  It is similar to the function of the
+same name in ANSI C.  The time specified by @var{timestamp} is used to
+produce a string, based on the contents of the @var{format} string.
+The @var{timestamp} is in the same format as the value returned by the
+@code{systime} function.  If no @var{timestamp} argument is supplied,
+@code{gawk} will use the current time of day as the time stamp.
+If no @var{format} argument is supplied, @code{strftime} uses
+@code{@w{"%a %b %d %H:%M:%S %Z %Y"}}.  This format string produces
+output (almost) equivalent to that of the @code{date} utility.
+(Versions of @code{gawk} prior to 3.0 require the @var{format} argument.)
+@end table
+
+The @code{systime} function allows you to compare a time stamp from a
+log file with the current time of day.  In particular, it is easy to
+determine how long ago a particular record was logged.  It also allows
+you to produce log records using the ``seconds since the epoch'' format.
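+
+For instance, the age of each record could be computed along the
+following lines.  This is a sketch that assumes the first field of every
+input record is a time stamp in ``seconds since the epoch'' form; the
+variable name @code{age} is arbitrary:
+
+@example
+@{
+    age = systime() - $1
+    printf "%d seconds ago: %s\n", age, $0
+@}
+@end example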
+
+The @code{strftime} function allows you to easily turn a time stamp
+into human-readable information.  It is similar in nature to the @code{sprintf}
+function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}),
+in that it copies non-format specification characters verbatim to the
+returned string, while substituting date and time values for format
+specifications in the @var{format} string.
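+
+A minimal illustration of this behavior follows.  The format strings are
+made up for the example, and the output depends on the current time and
+time zone, so none is shown:
+
+@example
+BEGIN @{
+    now = systime()
+    # ordinary characters are copied; %-specifications are replaced
+    print strftime("Logged at %H:%M:%S on %A, %B %d, %Y", now)
+    # with no time stamp argument, the current time is used
+    print strftime("100%% done at %H:%M")
+@}
+@end example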
+
+@code{strftime} is guaranteed by the ANSI C standard to support
+the following date format specifications:
+
+@table @code
+@item %a
+The locale's abbreviated weekday name.
+
+@item %A
+The locale's full weekday name.
+
+@item %b
+The locale's abbreviated month name.
+
+@item %B
+The locale's full month name.
+
+@item %c
+The locale's ``appropriate'' date and time representation.
+
+@item %d
+The day of the month as a decimal number (01--31).
+
+@item %H
+The hour (24-hour clock) as a decimal number (00--23).
+
+@item %I
+The hour (12-hour clock) as a decimal number (01--12).
+
+@item %j
+The day of the year as a decimal number (001--366).
+
+@item %m
+The month as a decimal number (01--12).
+
+@item %M
+The minute as a decimal number (00--59).
+
+@item %p
+The locale's equivalent of the AM/PM designations associated
+with a 12-hour clock.
+
+@item %S
+The second as a decimal number (00--61).@footnote{Occasionally there are
+minutes in a year with one or two leap seconds, which is why the
+seconds can go up to 61.}
+
+@item %U
+The week number of the year (the first Sunday as the first day of week one)
+as a decimal number (00--53).
+
+@item %w
+The weekday as a decimal number (0--6).  Sunday is day zero.
+
+@item %W
+The week number of the year (the first Monday as the first day of week one)
+as a decimal number (00--53).
+
+@item %x
+The locale's ``appropriate'' date representation.
+
+@item %X
+The locale's ``appropriate'' time representation.
+
+@item %y
+The year without century as a decimal number (00--99).
+
+@item %Y
+The year with century as a decimal number (e.g., 1995).
+
+@item %Z
+The time zone name or abbreviation, or no characters if
+no time zone is determinable.
+
+@item %%
+A literal @samp{%}.
+@end table
+
+If a conversion specifier is not one of the above, the behavior is
+undefined.@footnote{This is because ANSI C leaves the
+behavior of the C version of @code{strftime} undefined, and @code{gawk}
+will use the system's version of @code{strftime} if it's there.
+Typically, the conversion specifier will either not appear in the
+returned string, or it will appear literally.}
+
+@cindex locale, definition of
+Informally, a @dfn{locale} is the geographic place in which a program
+is meant to run.  For example, a common way to abbreviate the date
+September 4, 1991 in the United States would be ``9/4/91''.
+In many countries in Europe, however, it would be abbreviated ``4.9.91''.
+Thus, the @samp{%x} specification in a @code{"US"} locale might produce
+@samp{9/4/91}, while in a @code{"EUROPE"} locale, it might produce
+@samp{4.9.91}.  The ANSI C standard defines a default @code{"C"}
+locale, which is an environment that is typical of what most C programmers
+are used to.
+
+A public-domain C version of @code{strftime} is supplied with @code{gawk}
+for systems that are not yet fully ANSI-compliant.  If that version is
+used to compile @code{gawk} (@pxref{Installation, ,Installing @code{gawk}}),
+then the following additional format specifications are available:
+
+@table @code
+@item %D
+Equivalent to specifying @samp{%m/%d/%y}.
+
+@item %e
+The day of the month, padded with a space if it is only one digit.
+
+@item %h
+Equivalent to @samp{%b}, above.
+
+@item %n
+A newline character (ASCII LF).
+
+@item %r
+Equivalent to specifying @samp{%I:%M:%S %p}.
+
+@item %R
+Equivalent to specifying @samp{%H:%M}.
+
+@item %T
+Equivalent to specifying @samp{%H:%M:%S}.
+
+@item %t
+A tab character.
+
+@item %k
+The hour (24-hour clock) as a decimal number (0--23).
+Single digit numbers are padded with a space.
+
+@item %l
+The hour (12-hour clock) as a decimal number (1--12).
+Single digit numbers are padded with a space.
+
+@item %C
+The century, as a number between 00 and 99.
+
+@item %u
+The weekday as a decimal number
+[1 (Monday)--7].
+
+@cindex ISO 8601
+@item %V
+The week number of the year (the first Monday as the first
+day of week one) as a decimal number (01--53).
+The method for determining the week number is as specified by ISO 8601
+(to wit: if the week containing January 1 has four or more days in the
+new year, then it is week one, otherwise it is week 53 of the previous year
+and the next week is week one).
+
+@item %G
+The year with century of the ISO week number, as a decimal number.
+
+For example, January 1, 1993, is in week 53 of 1992. Thus, the year
+of its ISO week number is 1992, even though its year is 1993.
+Similarly, December 31, 1973, is in week 1 of 1974. Thus, the year
+of its ISO week number is 1974, even though its year is 1973.
+
+@item %g
+The year without century of the ISO week number, as a decimal number (00--99).
+
+@item %Ec %EC %Ex %Ey %EY %Od %Oe %OH %OI
+@itemx %Om %OM %OS %Ou %OU %OV %Ow %OW %Oy
+These are ``alternate representations'' for the specifications
+that use only the second letter (@samp{%c}, @samp{%C}, and so on).
+They are recognized, but their normal representations are
+used.@footnote{If you don't understand any of this, don't worry about
+it; these facilities are meant to make it easier to ``internationalize''
+programs.}
+(These facilitate compliance with the POSIX @code{date} utility.)
+
+@item %v
+The date in VMS format (e.g., 20-JUN-1991).
+
+@cindex RFC-822
+@cindex RFC-1036
+@item %z
+The time zone offset in a +HHMM format (e.g., the format necessary to
+produce RFC-822/RFC-1036 date headers).
+@end table
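+
+The ISO 8601 specifications can be checked against the January 1, 1993,
+case described under @samp{%G}.  The sketch below assumes that the
+@code{strftime} in use supports @samp{%V} and @samp{%G}; the time stamp
+is noon UTC on that day, written as a constant:
+
+@example
+BEGIN @{
+    ts = 725889600     # 12:00:00 UTC on January 1, 1993
+    print strftime("%Y-%m-%d is in week %V of %G", ts)
+@}
+@end example
+
+@noindent
+Where those specifications are available, the week and year printed
+should be 53 and 1992, in agreement with the description above.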
+
+The following example is an @code{awk} implementation of the POSIX
+@code{date} utility.  Normally, the @code{date} utility prints the
+current date and time of day in a well-known format.  However, if you
+provide an argument to it that begins with a @samp{+}, @code{date}
+will copy non-format specifier characters to the standard output, and
+will interpret the current time according to the format specifiers in
+the string.  For example:
+
+@example
+$ date '+Today is %A, %B %d, %Y.'
+@print{} Today is Thursday, July 11, 1991.
+@end example
+
+Here is the @code{gawk} version of the @code{date} utility.
+It has a shell ``wrapper'', to handle the @samp{-u} option,
+which requires that @code{date} run as if the time zone
+was set to UTC.
+
+@example
+@group
+#! /bin/sh
+#
+# date --- approximate the P1003.2 'date' command
+
+case $1 in
+-u)  TZ=GMT0     # use UTC
+     export TZ
+     shift ;;
+esac
+@end group
+
+@group
+gawk 'BEGIN  @{
+    format = "%a %b %d %H:%M:%S %Z %Y"
+    exitval = 0
+@end group
+
+@group
+    if (ARGC > 2)
+        exitval = 1
+    else if (ARGC == 2) @{
+        format = ARGV[1]
+        if (format ~ /^\+/)
+            format = substr(format, 2)   # remove leading +
+    @}
+    print strftime(format)
+    exit exitval
+@}' "$@@"
+@end group
+@end example
+
+@node User-defined, Invoking Gawk, Built-in, Top
+@chapter User-defined Functions
+
+@cindex user-defined functions
+@cindex functions, user-defined
+Complicated @code{awk} programs can often be simplified by defining
+your own functions.  User-defined functions can be called just like
+built-in ones (@pxref{Function Calls}), but it is up to you to define
+them---to tell @code{awk} what they should do.
+
+@menu
+* Definition Syntax::           How to write definitions and what they mean.
+* Function Example::            An example function definition and what it
+                                does.
+* Function Caveats::            Things to watch out for.
+* Return Statement::            Specifying the value a function returns.
+@end menu
+
+@node Definition Syntax, Function Example, User-defined, User-defined
+@section Function Definition Syntax
+@cindex defining functions
+@cindex function definition
+
+Definitions of functions can appear anywhere between the rules of an
+@code{awk} program.  Thus, the general form of an @code{awk} program is
+extended to include sequences of rules @emph{and} user-defined function
+definitions.
+There is no need in @code{awk} to put the definition of a function
+before all uses of the function.  This is because @code{awk} reads the
+entire program before starting to execute any of it.
+
+The definition of a function named @var{name} looks like this:
+
+@example
+function @var{name}(@var{parameter-list})
+@{
+     @var{body-of-function}
+@}
+@end example
+
+@cindex names, use of
+@cindex namespaces
+@noindent
+@var{name} is the name of the function to be defined.  A valid function
+name is like a valid variable name: a sequence of letters, digits and
+underscores, not starting with a digit.
+Within a single @code{awk} program, any particular name can only be
+used as a variable, array or function.
+
+@var{parameter-list} is a list of the function's arguments and local
+variable names, separated by commas.  When the function is called,
+the argument names are used to hold the argument values given in
+the call.  The local variables are initialized to the empty string.
+A function cannot have two parameters with the same name.
+
+The @var{body-of-function} consists of @code{awk} statements.  It is the
+most important part of the definition, because it says what the function
+should actually @emph{do}.  The argument names exist to give the body a
+way to talk about the arguments; local variables, to give the body
+places to keep temporary values.
+
+Argument names are not distinguished syntactically from local variable
+names; instead, the number of arguments supplied when the function is
+called determines how many argument variables there are.  Thus, if three
+argument values are given, the first three names in @var{parameter-list}
+are arguments, and the rest are local variables.
+
+It follows that if the number of arguments is not the same in all calls
+to the function, some of the names in @var{parameter-list} may be
+arguments on some occasions and local variables on others.  Another
+way to think of this is that omitted arguments default to the
+null string.
+
+Usually when you write a function you know how many names you intend to
+use for arguments and how many you intend to use as local variables.  It is
+conventional to place some extra space between the arguments and
+the local variables, to document how your function is supposed to be used.
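+
+The convention might look like this in practice.  The function
+@code{sum_to} is a made-up example, not part of @code{gawk}:
+
+@example
+# n is the only argument; i and total are local variables
+function sum_to(n,    i, total)
+@{
+    for (i = 1; i <= n; i++)
+        total += i
+    return total
+@}
+
+BEGIN @{ print sum_to(4) @}     # prints 10
+@end example
+
+@noindent
+Because @code{total} starts out as the empty string, it acts as zero
+the first time it is used in the addition.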
+
+@cindex variable shadowing
+During execution of the function body, the arguments and local variable
+values hide or @dfn{shadow} any variables of the same names used in the
+rest of the program.  The shadowed variables are not accessible in the
+function definition, because there is no way to name them while their
+names have been taken away for the local variables.  All other variables
+used in the @code{awk} program can be referenced or set normally in the
+function's body.
+
+The arguments and local variables last only as long as the function body
+is executing.  Once the body finishes, you can once again access the
+variables that were shadowed while the function was running.
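+
+The following sketch shows shadowing at work.  The names @code{f},
+@code{x} and @code{y} are chosen only for the illustration:
+
+@example
+function f(x)
+@{
+    x = "local"             # changes only the parameter x
+    y = "set inside f"      # y is not a parameter: this is the global y
+@}
+
+BEGIN @{
+    x = "global x"
+    y = "global y"
+    f("argument")
+    print x                 # prints global x
+    print y                 # prints set inside f
+@}
+@end example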
+
+@cindex recursive function
+@cindex function, recursive
+The function body can contain expressions which call functions.  They
+can even call this function, either directly or by way of another
+function.  When this happens, we say the function is @dfn{recursive}.
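+
+A classic illustration of direct recursion is the factorial function,
+sketched here under the hypothetical name @code{fact}:
+
+@example
+function fact(n)
+@{
+    if (n <= 1)
+        return 1
+    return n * fact(n - 1)
+@}
+
+BEGIN @{ print fact(5) @}     # prints 120
+@end example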
+
+@cindex @code{awk} language, POSIX version
+@cindex POSIX @code{awk}
+In many @code{awk} implementations, including @code{gawk},
+the keyword @code{function} may be
+abbreviated @code{func}.  However, POSIX only specifies the use of
+the keyword @code{function}.  This actually has some practical implications.
+If @code{gawk} is in POSIX-compatibility mode
+(@pxref{Options, ,Command Line Options}), then the following
+statement will @emph{not} define a function:
+
+@example
+func foo() @{ a = sqrt($1) ; print a @}
+@end example
+
+@noindent
+Instead it defines a rule that, for each record, concatenates the value
+of the variable @samp{func} with the return value of the function @samp{foo}.
+If the resulting string is non-null, the action is executed.
+This is probably not what was desired.  (@code{awk} accepts this input as
+syntactically valid, since functions may be used before they are defined
+in @code{awk} programs.)
+
+@cindex portability issues
+To ensure that your @code{awk} programs are portable, always use the
+keyword @code{function} when defining a function.
+
+@node Function Example, Function Caveats, Definition Syntax, User-defined
+@section Function Definition Examples
+
+Here is an example of a user-defined function, called @code{myprint}, that
+takes a number and prints it in a specific format.
+
+@example
+function myprint(num)
+@{
+     printf "%6.3g\n", num
+@}
+@end example
+
+@noindent
+To illustrate, here is an @code{awk} rule which uses our @code{myprint}
+function:
+
+@example
+$3 > 0     @{ myprint($3) @}
+@end example
+
+@noindent
+This program prints, in our special format, all the third fields that
+contain a positive number in our input.  Therefore, when given:
+
+@example
+ 1.2   3.4    5.6   7.8
+ 9.10 11.12 -13.14 15.16
+17.18 19.20  21.22 23.24
+@end example
+
+@noindent
+this program, using our function to format the results, prints:
+
+@example
+   5.6
+  21.2
+@end example
+
+This function deletes all the elements in an array.
+
+@example
+function delarray(a,    i)
+@{
+    for (i in a)
+       delete a[i]
+@}
+@end example
+
+When working with arrays, it is often necessary to delete all the elements
+in an array and start over with a new list of elements
+(@pxref{Delete, ,The @code{delete} Statement}).
+Instead of having
+to repeat this loop everywhere in your program that you need to clear out
+an array, your program can just call @code{delarray}.
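+
+A short test of @code{delarray} might look like the following sketch.
+The array name @code{seen} and the counter @code{n} are assumptions made
+for the example:
+
+@example
+function delarray(a,    i)
+@{
+    for (i in a)
+        delete a[i]
+@}
+
+BEGIN @{
+    seen["x"] = 1
+    seen["y"] = 2
+    delarray(seen)
+    n = 0
+    for (k in seen)         # count whatever elements remain
+        n++
+    print n                 # prints 0
+@}
+@end example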
+
+Here is an example of a recursive function.  It takes a string
+as an input parameter, and returns the string in backwards order.
+
+@example
+function rev(str, start)
+@{
+    if (start == 0)
+        return ""
+
+    return (substr(str, start, 1) rev(str, start - 1))
+@}
+@end example
+
+If this function is in a file named @file{rev.awk}, we can test it
+this way:
+
+@example
+$ echo "Don't Panic!" |
+> gawk --source '@{ print rev($0, length($0)) @}' -f rev.awk
+@print{} !cinaP t'noD
+@end example
+
+Here is an example that uses the built-in function @code{strftime}.
+(@xref{Time Functions, ,Functions for Dealing with Time Stamps},
+for more information on @code{strftime}.)
+The C @code{ctime} function takes a timestamp and returns it in a string,
+formatted in a well-known fashion.  Here is an @code{awk} version:
+
+@example
+@c file eg/lib/ctime.awk
+@group
+# ctime.awk
+#
+# awk version of C ctime(3) function
+
+function ctime(ts,    format)
+@{
+    format = "%a %b %d %H:%M:%S %Z %Y"
+    if (ts == 0)
+        ts = systime()       # use current time as default
+    return strftime(format, ts)
+@}
+@c endfile
+@end group
+@end example
+
+@node Function Caveats, Return Statement, Function Example, User-defined
+@section Calling User-defined Functions
+
+@cindex call by value
+@cindex call by reference
+@cindex calling a function
+@cindex function call
+@dfn{Calling a function} means causing the function to run and do its job.
+A function call is an expression, and its value is the value returned by
+the function.
+
+A function call consists of the function name followed by the arguments
+in parentheses.  What you write in the call for the arguments are
+@code{awk} expressions; each time the call is executed, these
+expressions are evaluated, and the values are the actual arguments.  For
+example, here is a call to @code{foo} with three arguments (the first
+being a string concatenation):
+
+@example
+foo(x y, "lose", 4 * z)
+@end example
+
+@strong{Caution:} whitespace characters (spaces and tabs) are not allowed
+between the function name and the open-parenthesis of the argument list.
+If you write whitespace by mistake, @code{awk} might think that you mean
+to concatenate a variable with an expression in parentheses.  However, it
+notices that you used a function name and not a variable name, and reports
+an error.
+
+@cindex call by value
+When a function is called, it is given a @emph{copy} of the values of
+its arguments.  This is known as @dfn{call by value}.  The caller may use
+a variable as the expression for the argument, but the called function
+does not know this: it only knows what value the argument had.  For
+example, if you write this code:
+
+@example
+foo = "bar"
+z = myfunc(foo)
+@end example
+
+@noindent
+then you should not think of the argument to @code{myfunc} as being
+``the variable @code{foo}.''  Instead, think of the argument as the
+string value, @code{"bar"}.
+
+If the function @code{myfunc} alters the values of its local variables,
+this has no effect on any other variables.  Thus, if @code{myfunc}
+does this:
+
+@example
+@group
+function myfunc(str)
+@{
+  print str
+  str = "zzz"
+  print str
+@}
+@end group
+@end example
+
+@noindent
+to change its first argument variable @code{str}, this @emph{does not}
+change the value of @code{foo} in the caller.  The role of @code{foo} in
+calling @code{myfunc} ended when its value, @code{"bar"}, was computed.
+If @code{str} also exists outside of @code{myfunc}, the function body
+cannot alter this outer value, because it is shadowed during the
+execution of @code{myfunc} and cannot be seen or changed from there.
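+
+Putting the two fragments together gives a complete program that
+demonstrates this.  It is only a sketch assembled from the pieces shown
+above:
+
+@example
+function myfunc(str)
+@{
+    print str
+    str = "zzz"
+    print str
+@}
+
+BEGIN @{
+    foo = "bar"
+    myfunc(foo)             # prints bar, then zzz
+    print foo               # prints bar
+@}
+@end example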
+
+@cindex call by reference
+However, when arrays are the parameters to functions, they are @emph{not}
+copied.  Instead, the array itself is made available for direct manipulation
+by the function.  This is usually called @dfn{call by reference}.
+Changes made to an array parameter inside the body of a function @emph{are}
+visible outside that function.
+@ifinfo
+This can be @strong{very} dangerous if you do not watch what you are
+doing.  For example:
+@end ifinfo
+@iftex
+@emph{This can be very dangerous if you do not watch what you are
+doing.}  For example:
+@end iftex
+
+@example
+function changeit(array, ind, nvalue)
+@{
+     array[ind] = nvalue
+@}
+
+BEGIN @{
+    a[1] = 1; a[2] = 2; a[3] = 3
+    changeit(a, 2, "two")
+    printf "a[1] = %s, a[2] = %s, a[3] = %s\n",
+            a[1], a[2], a[3]
+@}
+@end example
+
+@noindent
+This program prints @samp{a[1] = 1, a[2] = two, a[3] = 3}, because
+@code{changeit} stores @code{"two"} in the second element of @code{a}.
+
+@cindex undefined functions
+@cindex functions, undefined
+Some @code{awk} implementations allow you to call a function that
+has not been defined, and only report a problem at run-time when the
+program actually tries to call the function. For example:
+
+@example
+@group
+BEGIN @{
+    if (0)
+        foo()
+    else
+        bar()
+@}
+function bar() @{ @dots{} @}
+# note that `foo' is not defined
+@end group
+@end example
+
+@noindent
+Since the condition of the @samp{if} statement is never true, @code{foo}
+is never called, so it is not really a problem that it has not been
+defined.  Usually though, it is a problem if a program calls an
+undefined function.
+
+@ignore
+At one point, I had gawk dying on this, but later decided that this might
+break old programs and/or test suites.
+@end ignore
+
+If @samp{--lint} has been specified
+(@pxref{Options, ,Command Line Options}),
+@code{gawk} will report calls to undefined functions.
+
+@node Return Statement,  , Function Caveats, User-defined
+@section The @code{return} Statement
+@cindex @code{return} statement
+
+The body of a user-defined function can contain a @code{return} statement.
+This statement returns control to the rest of the @code{awk} program.  It
+can also be used to return a value for use in the rest of the @code{awk}
+program.  It looks like this:
+
+@example
+return @r{[}@var{expression}@r{]}
+@end example
+
+The @var{expression} part is optional.  If it is omitted, then the returned
+value is undefined and, therefore, unpredictable.
+
+A @code{return} statement with no value expression is assumed at the end of
+every function definition.  So if control reaches the end of the function
+body, then the function returns an unpredictable value.  @code{awk}
+will @emph{not} warn you if you use the return value of such a function.
+
+Sometimes, you want to write a function for what it does, not for
+what it returns.  Such a function corresponds to a @code{void} function
+in C or to a @code{procedure} in Pascal.  Thus, it may be appropriate to not
+return any value; you should simply bear in mind that if you use the return
+value of such a function, you do so at your own risk.
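+
+As a minimal sketch of such a function (the name @code{print_header} and
+its parameter are hypothetical, used only for illustration), the following
+is called purely for its side effect.  The bare @code{return} simply ends
+the function early; no value is handed back:
+
+@example
+@group
+function print_header(title)
+@{
+     if (title == "")
+          return          # nothing to print; no value is returned
+     print "==== " title " ===="
+@}
+
+BEGIN @{ print_header("Report") @}
+@end group
+@end example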
+
+Here is an example of a user-defined function that returns a value
+for the largest number among the elements of an array:
+
+@example
+@group
+function maxelt(vec,   i, ret)
+@{
+     for (i in vec) @{
+          if (ret == "" || vec[i] > ret)
+               ret = vec[i]
+     @}
+     return ret
+@}
+@end group
+@end example
+
+@noindent
+You call @code{maxelt} with one argument, which is an array name.  The local
+variables @code{i} and @code{ret} are not intended to be arguments;
+while there is nothing to stop you from passing two or three arguments
+to @code{maxelt}, the results would be strange.  The extra space before
+@code{i} in the function parameter list indicates that @code{i} and
+@code{ret} are not supposed to be arguments.  This is a convention that
+you should follow when you define functions.
+
+Here is a program that uses our @code{maxelt} function.  It loads an
+array, calls @code{maxelt}, and then reports the maximum number in that
+array:
+
+@example
+@group
+awk '
+function maxelt(vec,   i, ret)
+@{
+     for (i in vec) @{
+          if (ret == "" || vec[i] > ret)
+               ret = vec[i]
+     @}
+     return ret
+@}
+@end group
+
+@group
+# Load all fields of each record into nums.
+@{
+     for(i = 1; i <= NF; i++)
+          nums[NR, i] = $i
+@}
+
+END @{
+     print maxelt(nums)
+@}'
+@end group
+@end example
+
+Given the following input:
+
+@example
+@group
+ 1 5 23 8 16
+44 3 5 2 8 26
+256 291 1396 2962 100
+-6 467 998 1101
+99385 11 0 225
+@end group
+@end example
+
+@noindent
+our program tells us (predictably) that @code{99385} is the largest number
+in our array.
+
+@node Invoking Gawk, Library Functions, User-defined, Top
+@chapter Running @code{awk}
+@cindex command line
+@cindex invocation of @code{gawk}
+@cindex arguments, command line
+@cindex options, command line
+@cindex long options
+@cindex options, long
+
+There are two ways to run @code{awk}: with an explicit program, or with
+one or more program files.  Here are templates for both of them; items
+enclosed in @samp{@r{[}@dots{}@r{]}} in these templates are optional.
+
+Besides traditional one-letter POSIX-style options, @code{gawk} also
+supports GNU long options.
+
+@example
+awk @r{[@var{options}]} -f progfile @r{[@code{--}]} @var{file} @dots{}
+awk @r{[@var{options}]} @r{[@code{--}]} '@var{program}' @var{file} @dots{}
+@end example
+
+@cindex empty program
+@cindex dark corner
+It is possible to invoke @code{awk} with an empty program:
+
+@example
+$ awk '' datafile1 datafile2
+@end example
+
+@noindent
+Doing so makes little sense though; @code{awk} will simply exit
+silently when given an empty program (d.c.).  If @samp{--lint} has
+been specified on the command line, @code{gawk} will issue a
+warning that the program is empty.
+
+@menu
+* Options::                     Command line options and their meanings.
+* Other Arguments::             Input file names and variable assignments.
+* AWKPATH Variable::            Searching directories for @code{awk} programs.
+* Obsolete::                    Obsolete Options and/or features.
+* Undocumented::                Undocumented Options and Features.
+* Known Bugs::                  Known Bugs in @code{gawk}.
+@end menu
+
+@node Options, Other Arguments, Invoking Gawk, Invoking Gawk
+@section Command Line Options
+
+Options begin with a dash, and consist of a single character.
+GNU style long options consist of two dashes and a keyword.
+The keyword can be abbreviated, as long as the abbreviation allows the option
+to be uniquely identified.  If the option takes an argument, then the
+keyword is either immediately followed by an equals sign (@samp{=}) and the
+argument's value, or the keyword and the argument's value are separated
+by whitespace.  For brevity, the discussion below only refers to the
+traditional short options; however the long and short options are
+interchangeable in all contexts.
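+
+As an illustration of these rules, the following invocations should all be
+equivalent ways of setting the field separator to a colon.  The file
+@file{/etc/passwd} is used only as a convenient example, and the
+abbreviation @samp{--field-sep} assumes that no other long option begins
+with the same letters:
+
+@example
+gawk -F: '@{ print $1 @}' /etc/passwd
+gawk --field-separator=: '@{ print $1 @}' /etc/passwd
+gawk --field-separator : '@{ print $1 @}' /etc/passwd
+gawk --field-sep=: '@{ print $1 @}' /etc/passwd
+@end example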
+
+Each long option for @code{gawk} has a corresponding
+POSIX-style option.  The options and their meanings are as follows:
+
+@table @code
+@item -F @var{fs}
+@itemx --field-separator @var{fs}
+@cindex @code{-F} option
+@cindex @code{--field-separator} option
+Sets the @code{FS} variable to @var{fs}
+(@pxref{Field Separators, ,Specifying How Fields are Separated}).
+
+@item -f @var{source-file}
+@itemx --file @var{source-file}
+@cindex @code{-f} option
+@cindex @code{--file} option
+Indicates that the @code{awk} program is to be found in @var{source-file}
+instead of in the first non-option argument.
+
+@item -v @var{var}=@var{val}
+@itemx --assign @var{var}=@var{val}
+@cindex @code{-v} option
+@cindex @code{--assign} option
+Sets the variable @var{var} to the value @var{val} @strong{before}
+execution of the program begins.  Such variable values are available
+inside the @code{BEGIN} rule
+(@pxref{Other Arguments, ,Other Command Line Arguments}).
+
+The @samp{-v} option can only set one variable, but you can use
+it more than once, setting another variable each time, like this:
+@samp{awk @w{-v foo=1} @w{-v bar=2} @dots{}}.
+
+@item -mf=@var{NNN}
+@itemx -mr=@var{NNN}
+Set various memory limits to the value @var{NNN}.  The @samp{f} flag sets
+the maximum number of fields, and the @samp{r} flag sets the maximum
+record size.  These two flags and the @samp{-m} option are from the
+Bell Labs research version of Unix @code{awk}.  They are provided
+for compatibility, but otherwise ignored by
+@code{gawk}, since @code{gawk} has no predefined limits.
+
+@item -W @var{gawk-opt}
+@cindex @code{-W} option
+Following the POSIX standard, options that are implementation
+specific are supplied as arguments to the @samp{-W} option.  With @code{gawk},
+these arguments may be separated by commas, or quoted and separated by
+whitespace.  Case is ignored when processing these options.  These options
+also have corresponding GNU style long options.
+See below.
+
+@item --
+Signals the end of the command line options.  The following arguments
+are not treated as options even if they begin with @samp{-}.  This
+interpretation of @samp{--} follows the POSIX argument parsing
+conventions.
+
+This is useful if you have file names that start with @samp{-},
+or in shell scripts, if you have file names that will be specified
+by the user which could start with @samp{-}.
+@end table
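+
+As a sketch of the @samp{--} convention described above, suppose a data
+file is named @file{-data.txt} and the program lives in @file{prog.awk}
+(both names are hypothetical).  Without @samp{--}, the file name would be
+taken for options; with it, the name is treated as an ordinary input file:
+
+@example
+awk -f prog.awk -- -data.txt
+@end example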
+
+The following @code{gawk}-specific options are available:
+
+@table @code
+@item -W traditional
+@itemx -W compat
+@itemx --traditional
+@itemx --compat
+@cindex @code{--compat} option
+@cindex @code{--traditional} option
+@cindex compatibility mode
+Specifies @dfn{compatibility mode}, in which the GNU extensions to
+the @code{awk} language are disabled, so that @code{gawk} behaves just
+like the Bell Labs research version of Unix @code{awk}.
+@samp{--traditional} is the preferred form of this option.
+@xref{POSIX/GNU, ,Extensions in @code{gawk} Not in POSIX @code{awk}},
+which summarizes the extensions.  Also see
+@ref{Compatibility Mode, ,Downward Compatibility and Debugging}.
+
+@item -W copyleft
+@itemx -W copyright
+@itemx --copyleft
+@itemx --copyright
+@cindex @code{--copyleft} option
+@cindex @code{--copyright} option
+Print the short version of the General Public License.
+This option may disappear in a future version of @code{gawk}.
+
+@item -W help
+@itemx -W usage
+@itemx --help
+@itemx --usage
+@cindex @code{--help} option
+@cindex @code{--usage} option
+Print a ``usage'' message summarizing the short and long style options
+that @code{gawk} accepts, and then exit.
+
+@item -W lint
+@itemx --lint
+@cindex @code{--lint} option
+Warn about constructs that are dubious or non-portable to
+other @code{awk} implementations.
+Some warnings are issued when @code{gawk} first reads your program.  Others
+are issued at run-time, as your program executes.
+
+@item -W lint-old
+@itemx --lint-old
+@cindex @code{--lint-old} option
+Warn about constructs that are not available in
+the original Version 7 Unix version of @code{awk}
+(@pxref{V7/SVR3.1, , Major Changes between V7 and SVR3.1}).
+
+@item -W posix
+@itemx --posix
+@cindex @code{--posix} option
+@cindex POSIX mode
+Operate in strict POSIX mode.  This disables all @code{gawk}
+extensions (just like @samp{--traditional}), and adds the following additional
+restrictions:
+
+@c IMPORTANT! Keep this list in sync with the one in node POSIX
+
+@itemize @bullet
+@item
+@code{\x} escape sequences are not recognized
+(@pxref{Escape Sequences}).
+
+@item
+The synonym @code{func} for the keyword @code{function} is not
+recognized (@pxref{Definition Syntax, ,Function Definition Syntax}).
+
+@item
+The operators @samp{**} and @samp{**=} cannot be used in
+place of @samp{^} and @samp{^=} (@pxref{Arithmetic Ops, ,Arithmetic Operators},
+and also @pxref{Assignment Ops, ,Assignment Expressions}).
+
+@item
+Specifying @samp{-Ft} on the command line does not set the value
+of @code{FS} to be a single tab character
+(@pxref{Field Separators, ,Specifying How Fields are Separated}).
+
+@item
+The @code{fflush} built-in function is not supported
+(@pxref{I/O Functions, , Built-in Functions for Input/Output}).
+@end itemize
+
+If you supply both @samp{--traditional} and @samp{--posix} on the
+command line, @samp{--posix} will take precedence. @code{gawk}
+will also issue a warning if both options are supplied.
+
+@item -W re-interval
+@itemx --re-interval
+Allow interval expressions
+(@pxref{Regexp Operators, , Regular Expression Operators}),
+in regexps.
+Because interval expressions were traditionally not available in @code{awk},
+@code{gawk} does not provide them by default. This prevents old @code{awk}
+programs from breaking.
+
+@item -W source @var{program-text}
+@itemx --source @var{program-text}
+@cindex @code{--source} option
+Program source code is taken from the @var{program-text}.  This option
+allows you to mix source code in files with source
+code that you enter on the command line. This is particularly useful
+when you have library functions that you wish to use from your command line
+programs (@pxref{AWKPATH Variable, ,The @code{AWKPATH} Environment Variable}).
+
+@item -W version
+@itemx --version
+@cindex @code{--version} option
+Prints version information for this particular copy of @code{gawk}.
+This allows you to determine if your copy of @code{gawk} is up to date
+with respect to whatever the Free Software Foundation is currently
+distributing.
+It is also useful for bug reports
+(@pxref{Bugs,  , Reporting Problems and Bugs}).
+@end table
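+
+As a sketch of the @samp{--re-interval} option listed above, the following
+command should print only those lines that consist of exactly three digits
+(the file name @file{data} is a placeholder):
+
+@example
+gawk --re-interval '/^[0-9]@{3@}$/' data
+@end example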
+
+Any other options are flagged as invalid with a warning message, but
+are otherwise ignored.
+
+In compatibility mode, as a special case, if the value of @var{fs} supplied
+to the @samp{-F} option is @samp{t}, then @code{FS} is set to the tab
+character (@code{"\t"}).  This is only true for @samp{--traditional}, and not
+for @samp{--posix}
+(@pxref{Field Separators, ,Specifying How Fields are Separated}).
+
+The @samp{-f} option may be used more than once on the command line.
+If it is, @code{awk} reads its program source from all of the named files, as
+if they had been concatenated together into one big file.  This is
+useful for creating libraries of @code{awk} functions.  Useful functions
+can be written once, and then retrieved from a standard place, instead
+of having to be included into each individual program.
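+
+For instance, assuming a function library in @file{mylib.awk} and a main
+program in @file{report.awk} (both file names are hypothetical), the two
+files could be combined like this:
+
+@example
+awk -f mylib.awk -f report.awk datafile
+@end example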
+
+You can type in a program at the terminal and still use library functions,
+by specifying @samp{-f /dev/tty}.  @code{awk} will read a file from the terminal
+to use as part of the @code{awk} program.  After typing your program,
+type @kbd{Control-d} (the end-of-file character) to terminate it.
+(You may also use @samp{-f -} to read program source from the standard
+input, but then you will not be able to also use the standard input as a
+source of data.)
+
+Because it is clumsy using the standard @code{awk} mechanisms to mix source
+file and command line @code{awk} programs, @code{gawk} provides the
+@samp{--source} option.  This does not require you to pre-empt the standard
+input for your source code, and allows you to easily mix command line
+and library source code
+(@pxref{AWKPATH Variable, ,The @code{AWKPATH} Environment Variable}).
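+
+A minimal sketch, assuming a library file @file{mylib.awk} that defines a
+function named @code{double} (both names are hypothetical):
+
+@example
+gawk -f mylib.awk --source 'BEGIN @{ print double(21) @}'
+@end example
+
+@noindent
+Here the library comes from a file, while the main program is typed on the
+command line; the standard input remains free for data.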
+
+If no @samp{-f} or @samp{--source} option is specified, then @code{gawk}
+will use the first non-option command line argument as the text of the
+program source code.
+
+@cindex @code{POSIXLY_CORRECT} environment variable
+@cindex environment variable, @code{POSIXLY_CORRECT}
+If the environment variable @code{POSIXLY_CORRECT} exists,
+then @code{gawk} will behave in strict POSIX mode, exactly as if
+you had supplied the @samp{--posix} command line option.
+Many GNU programs look for this environment variable to turn on
+strict POSIX mode. If you supply @samp{--lint} on the command line,
+and @code{gawk} turns on POSIX mode because of @code{POSIXLY_CORRECT},
+then it will print a warning message indicating that POSIX
+mode is in effect.
+
+You would typically set this variable in your shell's startup file.
+For a Bourne compatible shell (such as Bash), you would add these
+lines to the @file{.profile} file in your home directory.
+
+@example
+@group
+POSIXLY_CORRECT=true
+export POSIXLY_CORRECT
+@end group
+@end example
+
+For a @code{csh} compatible shell,@footnote{Not recommended.}
+you would add this line to the @file{.login} file in your home directory.
+
+@example
+setenv POSIXLY_CORRECT true
+@end example
+
+@node Other Arguments, AWKPATH Variable, Options, Invoking Gawk
+@section Other Command Line Arguments
+
+Any additional arguments on the command line are normally treated as
+input files to be processed in the order specified.   However, an
+argument that has the form @code{@var{var}=@var{value}} assigns
+the value @var{value} to the variable @var{var}---it does not specify a
+file at all.
+
+@vindex ARGIND
+@vindex ARGV
+All these arguments are made available to your @code{awk} program in the
+@code{ARGV} array (@pxref{Built-in Variables}).  Command line options
+and the program text (if present) are omitted from @code{ARGV}.
+All other arguments, including variable assignments, are
+included.   As each element of @code{ARGV} is processed, @code{gawk}
+sets the variable @code{ARGIND} to the index in @code{ARGV} of the
+current element.
+
+The distinction between file name arguments and variable-assignment
+arguments is made when @code{awk} is about to open the next input file.
+At that point in execution, it checks the ``file name'' to see whether
+it is really a variable assignment; if so, @code{awk} sets the variable
+instead of reading a file.
+
+Therefore, the variables actually receive the given values after all
+previously specified files have been read.  In particular, the values of
+variables assigned in this fashion are @emph{not} available inside a
+@code{BEGIN} rule
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}),
+since such rules are run before @code{awk} begins scanning the argument list.
+
+@cindex dark corner
+The variable values given on the command line are processed for escape
+sequences (d.c.) (@pxref{Escape Sequences}).
+
+In some earlier implementations of @code{awk}, when a variable assignment
+occurred before any file names, the assignment would happen @emph{before}
+the @code{BEGIN} rule was executed.  @code{awk}'s behavior was thus
+inconsistent; some command line assignments were available inside the
+@code{BEGIN} rule, while others were not.  However,
+some applications came to depend
+upon this ``feature.''  When @code{awk} was changed to be more consistent,
+the @samp{-v} option was added to accommodate applications that depended
+upon the old behavior.
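+
+The difference can be seen in a short sketch (the variable name @code{n}
+and the file name @file{datafile} are placeholders):
+
+@example
+awk 'BEGIN @{ print "in BEGIN: n = [" n "]" @}
+     @{ print "in main rule: n = [" n "]" @}' n=5 datafile
+
+awk -v n=5 'BEGIN @{ print "in BEGIN: n = [" n "]" @}'
+@end example
+
+@noindent
+In the first command, @code{n} is still empty inside the @code{BEGIN} rule
+and receives the value 5 only when @code{awk} reaches the assignment in the
+argument list.  In the second, @samp{-v} sets @code{n} before the
+@code{BEGIN} rule runs.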
+
+The variable assignment feature is most useful for assigning to variables
+such as @code{RS}, @code{OFS}, and @code{ORS}, which control input and
+output formats, before scanning the data files.  It is also useful for
+controlling state if multiple passes are needed over a data file.  For
+example:
+
+@cindex multiple passes over data
+@cindex passes, multiple
+@example
+awk 'pass == 1  @{ @var{pass 1 stuff} @}
+     pass == 2  @{ @var{pass 2 stuff} @}' pass=1 mydata pass=2 mydata
+@end example
+
+Given the variable assignment feature, the @samp{-F} option for setting
+the value of @code{FS} is not
+strictly necessary.  It remains for historical compatibility.
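+
+For example, these two commands should split the records of the data file
+in the same way (@file{/etc/passwd} is chosen only for illustration):
+
+@example
+awk -F: '@{ print $1 @}' /etc/passwd
+awk '@{ print $1 @}' FS=: /etc/passwd
+@end example
+
+@noindent
+They differ only inside a @code{BEGIN} rule, where the command line
+assignment has not yet taken effect.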
+
+@node AWKPATH Variable, Obsolete, Other Arguments, Invoking Gawk
+@section The @code{AWKPATH} Environment Variable
+@cindex @code{AWKPATH} environment variable
+@cindex environment variable, @code{AWKPATH}
+@cindex search path
+@cindex directory search
+@cindex path, search
+@cindex differences between @code{gawk} and @code{awk}
+
+The previous section described how @code{awk} program files can be named
+on the command line with the @samp{-f} option.  In most @code{awk}
+implementations, you must supply a precise path name for each program
+file, unless the file is in the current directory.
+
+@cindex search path, for source files
+But in @code{gawk}, if the file name supplied to the @samp{-f} option
+does not contain a @samp{/}, then @code{gawk} searches a list of
+directories (called the @dfn{search path}), one by one, looking for a
+file with the specified name.
+
+The search path is a string consisting of directory names
+separated by colons.  @code{gawk} gets its search path from the
+@code{AWKPATH} environment variable.  If that variable does not exist,
+@code{gawk} uses a default path, which is
+@samp{.:/usr/local/share/awk}.@footnote{Your version of @code{gawk}
+may use a directory that is different than @file{/usr/local/share/awk}; it
+will depend upon how @code{gawk} was built and installed. The actual
+directory will be the value of @samp{$(datadir)} generated when
+@code{gawk} was configured.  You probably don't need to worry about this
+though.} (Programs written for use by
+system administrators should use an @code{AWKPATH} variable that
+does not include the current directory, @file{.}.)
+
+The search path feature is particularly useful for building up libraries
+of useful @code{awk} functions.  The library files can be placed in a
+standard directory that is in the default path, and then specified on
+the command line with a short file name.  Otherwise, the full file name
+would have to be typed for each file.
+
+By using both the @samp{--source} and @samp{-f} options, your command line
+@code{awk} programs can use facilities in @code{awk} library files.
+@xref{Library Functions, , A Library of @code{awk} Functions}.
+
+Path searching is not done if @code{gawk} is in compatibility mode.
+This is true for both @samp{--traditional} and @samp{--posix}.
+@xref{Options, ,Command Line Options}.
+
+@strong{Note:} if you want files in the current directory to be found,
+you must include the current directory in the path, either by including
+@file{.} explicitly in the path, or by writing a null entry in the
+path.  (A null entry is indicated by starting or ending the path with a
+colon, or by placing two colons next to each other (@samp{::}).)  If the
+current directory is not included in the path, then files cannot be
+found in the current directory.  This path search mechanism is identical
+to the shell's.
+@c someday, @cite{The Bourne Again Shell}....
+
+Starting with version 3.0, if @code{AWKPATH} is not defined in the
+environment, @code{gawk} will place its default search path into
+@code{ENVIRON["AWKPATH"]}. This makes it easy to determine
+the actual search path @code{gawk} will use.
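+
+As a sketch, the first command below prints the path that @code{gawk}
+will search.  The remaining lines set a path for a Bourne compatible shell
+and then name two program files without any directory part; the directory
+@file{$HOME/awklib} and the file names are hypothetical:
+
+@example
+gawk 'BEGIN @{ print ENVIRON["AWKPATH"] @}'
+
+AWKPATH=$HOME/awklib:/usr/local/share/awk
+export AWKPATH
+gawk -f mylib.awk -f prog.awk datafile
+@end example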
+
+@node Obsolete, Undocumented, AWKPATH Variable, Invoking Gawk
+@section Obsolete Options and/or Features
+
+@cindex deprecated options
+@cindex obsolete options
+@cindex deprecated features
+@cindex obsolete features
+This section describes features and/or command line options from
+previous releases of @code{gawk} that are either not available in the
+current version, or that are still supported but deprecated (meaning that
+they will @emph{not} be in the next release).
+
+@c update this section for each release!
+
+For version @value{VERSION} of @code{gawk}, there are no command line options
+or other deprecated features from the previous version of @code{gawk}.
+@iftex
+This section
+@end iftex
+@ifinfo
+This node
+@end ifinfo
+is thus essentially a placeholder,
+in case some option becomes obsolete in a future version of @code{gawk}.
+
+@ignore
+@c This is pretty old news...
+The public-domain version of @code{strftime} that is distributed with
+@code{gawk} changed for the 2.14 release.  The @samp{%V} conversion specifier
+that used to generate the date in VMS format was changed to @samp{%v}.
+This is because the POSIX standard for the @code{date} utility now
+specifies a @samp{%V} conversion specifier.
+@xref{Time Functions, ,Functions for Dealing with Time Stamps}, for details.
+@end ignore
+
+@node Undocumented, Known Bugs, Obsolete, Invoking Gawk
+@section Undocumented Options and Features
+@cindex undocumented features
+
+This section intentionally left blank.
+
+@c Read The Source, Luke!
+
+@ignore
+@c If these came out in the Info file or TeX document, then they wouldn't
+@c be undocumented, would they?
+
+@code{gawk} has one undocumented option:
+
+@table @code
+@item -W nostalgia
+@itemx --nostalgia
+Print the message @code{"awk: bailing out near line 1"} and dump core.
+This option was inspired by the common behavior of very early versions of
+Unix @code{awk}, and by a T-shirt.
+@end table
+
+Early versions of @code{awk} used to not require any separator (either
+a newline or @samp{;}) between the rules in @code{awk} programs.  Thus,
+it was common to see one-line programs like:
+
+@example
+awk '@{ sum += $1 @} END @{ print sum @}'
+@end example
+
+@code{gawk} actually supports this, but it is purposely undocumented
+since it is considered bad style.  The correct way to write such a program
+is either
+
+@example
+awk '@{ sum += $1 @} ; END @{ print sum @}'
+@end example
+
+@noindent
+or
+
+@example
+awk '@{ sum += $1 @}
+     END @{ print sum @}' data
+@end example
+
+@noindent
+@xref{Statements/Lines, ,@code{awk} Statements Versus Lines}, for a fuller
+explanation.
+
+@end ignore
+
+@node Known Bugs, , Undocumented, Invoking Gawk
+@section Known Bugs in @code{gawk}
+@cindex bugs, known in @code{gawk}
+@cindex known bugs
+
+@itemize @bullet
+@item
+The @samp{-F} option for changing the value of @code{FS}
+(@pxref{Options, ,Command Line Options})
+is not necessary given the command line variable
+assignment feature; it remains only for backwards compatibility.
+
+@item
+If your system actually has support for @file{/dev/fd} and the
+associated @file{/dev/stdin}, @file{/dev/stdout}, and
+@file{/dev/stderr} files, you may get different output from @code{gawk}
+than you would get on a system without those files.  When @code{gawk}
+interprets these files internally, it synchronizes output to the
+standard output with output to @file{/dev/stdout}, while on a system
+with those files, the output is actually to different open files
+(@pxref{Special Files, ,Special File Names in @code{gawk}}).
+
+@item
+Syntactically invalid single-character programs tend to overflow
+the parse stack, generating a rather unhelpful message.  Such programs
+are surprisingly difficult to diagnose in the completely general case,
+and the effort to do so really is not worth it.
+
+@item
+The word ``GNU'' is incorrectly capitalized in at least one
+file in the source code.
+@end itemize
+
+@node Library Functions, Sample Programs, Invoking Gawk, Top
+@chapter A Library of @code{awk} Functions
+
+@c 2e: USE TEXINFO-2 FUNCTION DEFINITION STUFF!!!!!!!!!!!!!
+This chapter presents a library of useful @code{awk} functions.  The
+sample programs presented later
+(@pxref{Sample Programs, ,Practical @code{awk} Programs})
+use these functions.
+The functions are presented here in a progression from simple to complex.
+
+@ref{Extract Program, ,Extracting Programs from Texinfo Source Files},
+presents a program that you can use to extract the source code for
+these example library functions and programs from the Texinfo source
+for this @value{DOCUMENT}.
+(This has already been done as part of the @code{gawk} distribution.)
+
+If you have written one or more useful, general purpose @code{awk} functions,
+and would like to contribute them for a subsequent edition of this @value{DOCUMENT},
+please contact the author.  @xref{Bugs, ,Reporting Problems and Bugs},
+for information on doing this.  Don't just send code, as you will be
+required to either place your code in the public domain,
+publish it under the GPL (@pxref{Copying, ,GNU GENERAL PUBLIC LICENSE}),
+or assign the copyright in it to the Free Software Foundation.
+
+@menu
+* Portability Notes::           What to do if you don't have @code{gawk}.
+* Nextfile Function::           Two implementations of a @code{nextfile}
+                                function.
+* Assert Function::             A function for assertions in @code{awk}
+                                programs.
+* Ordinal Functions::           Functions for using characters as numbers and
+                                vice versa.
+* Join Function::               A function to join an array into a string.
+* Mktime Function::             A function to turn a date into a timestamp.
+* Gettimeofday Function::       A function to get formatted times.
+* Filetrans Function::          A function for handling data file transitions.
+* Getopt Function::             A function for processing command line
+                                arguments.
+* Passwd Functions::            Functions for getting user information.
+* Group Functions::             Functions for getting group information.
+* Library Names::               How to best name private global variables in
+                                library functions.
+@end menu
+
+@node Portability Notes, Nextfile Function, Library Functions, Library Functions
+@section Simulating @code{gawk}-specific Features
+@cindex portability issues
+
+The programs in this chapter and in
+@ref{Sample Programs, ,Practical @code{awk} Programs},
+freely use features that are specific to @code{gawk}.
+This section briefly discusses how you can rewrite these programs for
+different implementations of @code{awk}.
+
+Diagnostic error messages are sent to @file{/dev/stderr}.
+Use @samp{| "cat 1>&2"} instead of @samp{> "/dev/stderr"}, if your system
+does not have a @file{/dev/stderr}, or if you cannot use @code{gawk}.
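+
+A portable sketch of such a diagnostic follows (the message text is only
+an example).  Closing the pipe afterwards lets the message appear promptly:
+
+@example
+@group
+BEGIN @{
+     print "prog: something went wrong" | "cat 1>&2"
+     close("cat 1>&2")
+@}
+@end group
+@end example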
+
+A number of programs use @code{nextfile}
+(@pxref{Nextfile Statement, ,The @code{nextfile} Statement}),
+to skip any remaining input in the input file.
+@ref{Nextfile Function, ,Implementing @code{nextfile} as a Function},
+shows you how to write a function that will do the same thing.
+
+Finally, some of the programs choose to ignore upper-case and lower-case
+distinctions in their input. They do this by assigning one to @code{IGNORECASE}.
+You can achieve the same effect by adding the following rule to the
+beginning of the program:
+
+@example
+# ignore case
+@{ $0 = tolower($0) @}
+@end example
+
+@noindent
+Also, verify that all regexp and string constants used in
+comparisons only use lower-case letters.
+
+@node Nextfile Function, Assert Function, Portability Notes, Library Functions
+@section Implementing @code{nextfile} as a Function
+
+@cindex skipping input files
+@cindex input files, skipping
+The @code{nextfile} statement presented in
+@ref{Nextfile Statement, ,The @code{nextfile} Statement},
+is a @code{gawk}-specific extension.  It is not available in other
+implementations of @code{awk}.  This section shows two versions of a
+@code{nextfile} function that you can use to simulate @code{gawk}'s
+@code{nextfile} statement if you cannot use @code{gawk}.
+
+Here is a first attempt at writing a @code{nextfile} function.
+
+@example
+@group
+# nextfile --- skip remaining records in current file
+
+# this should be read in before the "main" awk program
+
+function nextfile()    @{ _abandon_ = FILENAME; next @}
+
+_abandon_ == FILENAME  @{ next @}
+@end group
+@end example
+
+This file should be included before the main program, because it supplies
+a rule that must be executed first.  This rule compares the current data
+file's name (which is always in the @code{FILENAME} variable) to a private
+variable named @code{_abandon_}.  If the file name matches, then the action
+part of the rule executes a @code{next} statement, to go on to the next
+record.  (The use of @samp{_} in the variable name is a convention.
+It is discussed more fully in
+@ref{Library Names,  , Naming Library Function Global Variables}.)
+
+The use of the @code{next} statement effectively creates a loop that reads
+all the records from the current data file.
+Eventually, the end of the file is reached, and
+a new data file is opened, changing the value of @code{FILENAME}.
+Once this happens, the comparison of @code{_abandon_} to @code{FILENAME}
+fails, and execution continues with the first rule of the ``real'' program.
+
+The @code{nextfile} function itself simply sets the value of @code{_abandon_}
+and then executes a @code{next} statement to start the loop
+going.@footnote{Some implementations of @code{awk} do not allow you to
+execute @code{next} from within a function body. Some other work-around
+will be necessary if you use such a version.}
+@c mawk is what we're talking about.
+
+This initial version has a subtle problem.  What happens if the same data
+file is listed @emph{twice} on the command line, one right after the other,
+or even with just a variable assignment between the two occurrences of
+the file name?
+
+@c @findex nextfile
+@c do it this way, since all the indices are merged
+@cindex @code{nextfile} function
+In such a case,
+this code will skip right through the file a second time, even though
+it should stop when it gets to the end of the first occurrence.
+Here is a second version of @code{nextfile} that remedies this problem.
+
+@example
+@group
+@c file eg/lib/nextfile.awk
+# nextfile --- skip remaining records in current file
+# correctly handle successive occurrences of the same file
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May, 1993
+
+# this should be read in before the "main" awk program
+
+function nextfile()   @{ _abandon_ = FILENAME; next @}
+
+_abandon_ == FILENAME @{
+      if (FNR == 1)
+          _abandon_ = ""
+      else
+          next
+@}
+@c endfile
+@end group
+@end example
+
+The @code{nextfile} function has not changed.  It sets @code{_abandon_}
+equal to the current file name and then executes a @code{next} statement.
+The @code{next} statement reads the next record and increments @code{FNR},
+so @code{FNR} is guaranteed to have a value of at least two.
+However, if @code{nextfile} is called for the last record in the file,
+then @code{awk} will close the current data file and move on to the next
+one.  Upon doing so, @code{FILENAME} will be set to the name of the new file,
+and @code{FNR} will be reset to one.  If this next file is the same as
+the previous one, @code{_abandon_} will still be equal to @code{FILENAME}.
+However, @code{FNR} will be equal to one, telling us that this is a new
+occurrence of the file, and not the one we were reading when the
+@code{nextfile} function was executed.  In that case, @code{_abandon_}
+is reset to the empty string, so that further executions of this rule
+will fail (until the next time that @code{nextfile} is called).
+
+If @code{FNR} is not one, then we are still in the original data file,
+and the program executes a @code{next} statement to skip through it.
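+
+As a usage sketch, assume that the second version is stored in
+@file{nextfile.awk}, that the @code{awk} being used has no built-in
+@code{nextfile}, and that the following one-rule program is saved in
+@file{first.awk} (the file names here are hypothetical).  The program
+prints the first record of each data file and skips the rest:
+
+@example
+@group
+# first.awk --- print the first record of each data file
+FNR == 1 @{ print FILENAME ": " $0; nextfile() @}
+@end group
+@end example
+
+@noindent
+It would be run with the library file named first, so that the library's
+rule is executed before the rule in the main program:
+
+@example
+awk -f nextfile.awk -f first.awk file1 file2
+@end example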
+
+An important question to ask at this point is: ``Given that the
+functionality of @code{nextfile} can be provided with a library file,
+why is it built into @code{gawk}?''  The question matters, because adding
+features for little reason leads to larger, slower programs that are
+harder to maintain.
+
+The answer is that building @code{nextfile} into @code{gawk} provides
+significant gains in efficiency.  If the @code{nextfile} function is executed
+at the beginning of a large data file, @code{awk} still has to scan the entire
+file, splitting it up into records, just to skip over it.  The built-in
+@code{nextfile} can simply close the file immediately and proceed to the
+next one, saving a lot of time.  This is particularly important in
+@code{awk}, since @code{awk} programs are generally I/O bound (i.e.@:
+they spend most of their time doing input and output, instead of performing
+computations).
+
+@node Assert Function, Ordinal Functions, Nextfile Function, Library Functions
+@section Assertions
+
+@cindex assertions
+@cindex @code{assert}, C version
+When writing large programs, it is often useful to be able to know
+that a condition or set of conditions is true.  Before proceeding with a
+particular computation, you make a statement about what you believe to be
+the case.  Such a statement is known as an
+``assertion.''  The C language provides an @code{<assert.h>} header file
+and corresponding @code{assert} macro that the programmer can use to make
+assertions.  If an assertion fails, the @code{assert} macro arranges to
+print a diagnostic message describing the condition that should have
+been true but was not, and then it kills the program.  In C, using
+@code{assert} looks like this:
+
+@example
+#include <assert.h>
+
+int myfunc(int a, double b)
+@{
+     assert(a <= 5 && b >= 17);
+     @dots{}
+@}
+@end example
+
+If the assertion failed, the program would print a message similar to
+this:
+
+@example
+prog.c:5: assertion failed: a <= 5 && b >= 17
+@end example
+
+@findex assert
+The ANSI C language makes it possible to turn the condition into a string for use
+in printing the diagnostic message.  This is not possible in @code{awk}, so
+this @code{assert} function also requires a string version of the condition
+that is being tested.
+
+@example
+@c @group
+@c file eg/lib/assert.awk
+# assert --- assert that a condition is true. Otherwise exit.
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May, 1993
+
+function assert(condition, string)
+@{
+    if (! condition) @{
+        printf("%s:%d: assertion failed: %s\n",
+            FILENAME, FNR, string) > "/dev/stderr"
+        _assert_exit = 1
+        exit 1
+    @}
+@}
+
+END @{
+    if (_assert_exit)
+        exit 1
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{assert} function tests the @code{condition} parameter. If it
+is false, it prints a message to standard error, using the @code{string}
+parameter to describe the failed condition.  It then sets the variable
+@code{_assert_exit} to one, and executes the @code{exit} statement.
+The @code{exit} statement jumps to the @code{END} rule. If the @code{END}
+rule finds @code{_assert_exit} to be true, then it exits immediately.
+
+The purpose of the @code{END} rule with its test is to
+keep any other @code{END} rules from running.  When an assertion fails, the
+program should exit immediately.
+If no assertions fail, then @code{_assert_exit} will still be
+false when the @code{END} rule is run normally, and the rest of the
+program's @code{END} rules will execute.
+For all of this to work correctly, @file{assert.awk} must be the
+first source file read by @code{awk}.
+
+You would use this function in your programs this way:
+
+@example
+function myfunc(a, b)
+@{
+     assert(a <= 5 && b >= 17, "a <= 5 && b >= 17")
+     @dots{}
+@}
+@end example
+
+@noindent
+If the assertion failed, you would see a message like this:
+
+@example
+mydata:1357: assertion failed: a <= 5 && b >= 17
+@end example
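+
+As a further illustration, here is a minimal sketch of a record-checking
+rule.  The three-field requirement and the file name @file{check.awk}
+are assumptions made only for this example; they are not part of
+@file{assert.awk}:
+
+@example
+# check.awk --- hypothetical check that each record has three fields
+# run as: awk -f assert.awk -f check.awk mydata
+@{ assert(NF == 3, "NF == 3") @}
+@end example
+
+@noindent
+If the first record of @file{mydata} had three fields and the second had
+four, the program would print
+@samp{mydata:2: assertion failed: NF == 3} and exit.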
+
+There is a problem with this version of @code{assert} that may not
+be possible to work around.  An @code{END} rule is automatically added
+to the program calling @code{assert}.  Normally, if a program consists
+of just a @code{BEGIN} rule, the input files and/or standard input are
+not read. However, now that the program has an @code{END} rule, @code{awk}
+will attempt to read the input data files, or standard input
+(@pxref{Using BEGIN/END, , Startup and Cleanup Actions}),
+most likely causing the program to hang, waiting for input.
+
+@cindex backslash continuation
+Just a note on programming style.  Some rules in these library files,
+such as the @code{BEGIN} rule in @file{mktime.awk}
+(@pxref{Mktime Function, ,Turning Dates Into Timestamps}),
+use backslash continuation, with the open brace on a line by
+itself.  This is so that the rule more closely resembles the way functions
+are written.  Many of the examples
+@iftex
+in this chapter and the next one
+@end iftex
+use this style. You can decide for yourself if you like writing
+your @code{BEGIN} and @code{END} rules this way,
+or not.
+
+@node Ordinal Functions, Join Function, Assert Function, Library Functions
+@section Translating Between Characters and Numbers
+
+@cindex numeric character values
+@cindex values of characters as numbers
+One commercial implementation of @code{awk} supplies a built-in function,
+@code{ord}, which takes a character and returns the numeric value for that
+character in the machine's character set.  If the string passed to
+@code{ord} has more than one character, only the first one is used.
+
+The inverse of this function is @code{chr} (from the function of the same
+name in Pascal), which takes a number and returns the corresponding character.
+
+Both functions can be written very nicely in @code{awk}; there is no real
+reason to build them into the @code{awk} interpreter.
+
+@findex ord
+@findex chr
+@example
+@c @group
+@c file eg/lib/ord.awk
+# ord.awk --- do ord and chr
+#
+# Global identifiers:
+#    _ord_:        numerical values indexed by characters
+#    _ord_init:    function to initialize _ord_
+#
+# Arnold Robbins
+# arnold@@gnu.ai.mit.edu
+# Public Domain
+# 16 January, 1992
+# 20 July, 1992, revised
+
+BEGIN    @{ _ord_init() @}
+@c endfile
+@c @end group
+
+@c @group
+@c file eg/lib/ord.awk
+function _ord_init(    low, high, i, t)
+@{
+    low = sprintf("%c", 7) # BEL is ascii 7
+    if (low == "\a") @{    # regular ascii
+        low = 0
+        high = 127
+    @} else if (sprintf("%c", 128 + 7) == "\a") @{
+        # ascii, mark parity
+        low = 128
+        high = 255
+    @} else @{        # ebcdic(!)
+        low = 0
+        high = 255
+    @}
+
+    for (i = low; i <= high; i++) @{
+        t = sprintf("%c", i)
+        _ord_[t] = i
+    @}
+@}
+@c endfile
+@c @end group
+@end example
+
+@cindex character sets
+@cindex character encodings
+@cindex ASCII
+@cindex EBCDIC
+@cindex mark parity
+Some explanation of the numbers used by @code{chr} is worthwhile.
+The most prominent character set in use today is ASCII. Although an
+eight-bit byte can hold 256 distinct values (from zero to 255), ASCII only
+defines characters that use the values from zero to 127.@footnote{ASCII
+has been extended in many countries to use the values from 128 to 255
+for country-specific characters.  If your system uses these extensions,
+you can simplify @code{_ord_init} to loop from zero to 255.}
+At least one computer manufacturer that we know of
+@c Pr1me, blech
+uses ASCII, but with mark parity, meaning that the leftmost bit in the byte
+is always one.  What this means is that on those systems, characters
+have numeric values from 128 to 255.
+Finally, large mainframe systems use the EBCDIC character set, which
+uses all 256 values.
+While there are other character sets in use on some older systems,
+they are not really worth worrying about.
+
+@example
+@group
+@c file eg/lib/ord.awk
+function ord(str,    c)
+@{
+    # only first character is of interest
+    c = substr(str, 1, 1)
+    return _ord_[c]
+@}
+@c endfile
+@end group
+
+@group
+@c file eg/lib/ord.awk
+function chr(c)
+@{
+    # force c to be numeric by adding 0
+    return sprintf("%c", c + 0)
+@}
+@c endfile
+@end group
+
+@c @group
+@c file eg/lib/ord.awk
+#### test code ####
+# BEGIN    \
+# @{
+#    for (;;) @{
+#        printf("enter a character: ")
+#        if (getline var <= 0)
+#            break
+#        printf("ord(%s) = %d\n", var, ord(var))
+#    @}
+# @}
+@c endfile
+@c @end group
+@end example
+
+An obvious improvement to these functions would be to move the code for the
+@code{@w{_ord_init}} function into the body of the @code{BEGIN} rule.  It was
+written this way initially for ease of development.
+
+There is a ``test program'' in a @code{BEGIN} rule, for testing the
+function.  It is commented out for production use.
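+
+As a quick check, here is a minimal sketch of a separate test program.
+It assumes that @file{ord.awk} is loaded first and that the system uses
+plain ASCII; the file name @file{ordtest.awk} is hypothetical:
+
+@example
+# ordtest.awk --- hypothetical test of ord and chr
+# run as: awk -f ord.awk -f ordtest.awk
+BEGIN @{
+    printf("ord(\"A\") = %d\n", ord("A"))   # 65 in ASCII
+    printf("chr(66) = %s\n", chr(66))      # B in ASCII
+    # chr should undo ord for any character in the table
+    if (chr(ord("z")) != "z")
+        print "round trip failed" > "/dev/stderr"
+@}
+@end example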
+
+@node Join Function, Mktime Function, Ordinal Functions, Library Functions
+@section Merging an Array Into a String
+
+@cindex merging strings
+When doing string processing, it is often useful to be able to join
+all the strings in an array into one long string.  The following function,
+@code{join}, accomplishes this task.  It is used later in several of
+the application programs
+(@pxref{Sample Programs, ,Practical @code{awk} Programs}).
+
+Good function design is important; this function needs to be general, but it
+should also have a reasonable default behavior.  It is called with an array
+and the beginning and ending indices of the elements in the array to be
+merged.  This assumes that the array indices are numeric---a reasonable
+assumption since the array was likely created with @code{split}
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+
+@findex join
+@example
+@group
+@c file eg/lib/join.awk
+# join.awk --- join an array into a string
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+function join(array, start, end, sep,    result, i)
+@{
+    if (sep == "")
+       sep = " "
+    else if (sep == SUBSEP) # magic value
+       sep = ""
+    result = array[start]
+    for (i = start + 1; i <= end; i++)
+        result = result sep array[i]
+    return result
+@}
+@c endfile
+@end group
+@end example
+
+An optional additional argument is the separator to use when joining the
+strings back together.  If the caller supplies a non-empty value,
+@code{join} uses it.  If it is not supplied, it will have a null
+value.  In this case, @code{join} uses a single blank as a default
+separator for the strings.  If the value is equal to @code{SUBSEP},
+then @code{join} joins the strings with no separator between them.
+@code{SUBSEP} serves as a ``magic'' value to indicate that there should
+be no separation between the component strings.
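+
+The following sketch exercises each of these cases.  It assumes that
+@file{join.awk} has been loaded first; the array name @code{parts} is
+chosen only for illustration:
+
+@example
+BEGIN @{
+    n = split("a b c d", parts, " ")
+    print join(parts, 1, n)            # prints "a b c d"
+    print join(parts, 2, 3, "-")       # prints "b-c"
+    print join(parts, 1, n, SUBSEP)    # prints "abcd"
+@}
+@end example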
+
+It would be nice if @code{awk} had an assignment operator for concatenation.
+The lack of an explicit operator for concatenation makes string operations
+more difficult than they really need to be.
+
+@node Mktime Function, Gettimeofday Function, Join Function, Library Functions
+@section Turning Dates Into Timestamps
+
+The @code{systime} function built in to @code{gawk}
+returns the current time of day as
+a timestamp in ``seconds since the Epoch.''  This timestamp
+can be converted into a printable date of almost infinitely variable
+format using the built-in @code{strftime} function.
+(For more information on @code{systime} and @code{strftime},
+@pxref{Time Functions, ,Functions for Dealing with Time Stamps}.)
+
+@cindex converting dates to timestamps
+@cindex dates, converting to timestamps
+@cindex timestamps, converting from dates
+An interesting but difficult problem is to convert a readable representation
+of a date back into a timestamp.  The ANSI C library provides a @code{mktime}
+function that does the basic job, converting a canonical representation of a
+date into a timestamp.
+
+It would appear at first glance that @code{gawk} would have to supply a
+@code{mktime} built-in function that was simply a ``hook'' to the C language
+version.  In fact though, @code{mktime} can be implemented entirely in
+@code{awk}.
+
+Here is a version of @code{mktime} for @code{awk}.  It takes a simple
+representation of the date and time, and converts it into a timestamp.
+
+The code is presented here intermixed with explanatory prose.  In
+@ref{Extract Program, ,Extracting Programs from Texinfo Source Files},
+you will see how the Texinfo source file for this @value{DOCUMENT}
+can be processed to extract the code into a single source file.
+
+The program begins with a descriptive comment and a @code{BEGIN} rule
+that initializes a table @code{_tm_months}.  This table is a two-dimensional
+array that has the lengths of the months.  The first index is zero for
+regular years, and one for leap years.  The values are the same for all the
+months in both kinds of years, except for February; thus the use of multiple
+assignment.
+
+@example
+@c @group
+@c file eg/lib/mktime.awk
+# mktime.awk --- convert a canonical date representation
+#                into a timestamp
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+BEGIN    \
+@{
+    # Initialize table of month lengths
+    _tm_months[0,1] = _tm_months[1,1] = 31
+    _tm_months[0,2] = 28; _tm_months[1,2] = 29
+    _tm_months[0,3] = _tm_months[1,3] = 31
+    _tm_months[0,4] = _tm_months[1,4] = 30
+    _tm_months[0,5] = _tm_months[1,5] = 31
+    _tm_months[0,6] = _tm_months[1,6] = 30
+    _tm_months[0,7] = _tm_months[1,7] = 31
+    _tm_months[0,8] = _tm_months[1,8] = 31
+    _tm_months[0,9] = _tm_months[1,9] = 30
+    _tm_months[0,10] = _tm_months[1,10] = 31
+    _tm_months[0,11] = _tm_months[1,11] = 30
+    _tm_months[0,12] = _tm_months[1,12] = 31
+@}
+@c endfile
+@c @end group
+@end example
+
+The benefit of merging multiple @code{BEGIN} rules
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns})
+is particularly clear when writing library files.  Functions in library
+files can cleanly initialize their own private data and also provide clean-up
+actions in private @code{END} rules.
+
+The next function is a simple one that computes whether a given year is or
+is not a leap year.  If a year is evenly divisible by four, but not evenly
+divisible by 100, or if it is evenly divisible by 400, then it is a leap
+year.  Thus, 1904 was a leap year, 1900 was not, but 2000 will be.
+@c Change this after the year 2000 to ``2000 was'' (:-)
+
+@findex _tm_isleap
+@example
+@group
+@c file eg/lib/mktime.awk
+# decide if a year is a leap year
+function _tm_isleap(year,    ret)
+@{
+    ret = (year % 4 == 0 && year % 100 != 0) ||
+            (year % 400 == 0)
+
+    return ret
+@}
+@c endfile
+@end group
+@end example
+
+This function is only used a few times in this file, and its computation
+could have been written @dfn{in-line} (at the point where it's used).
+Making it a separate function made the original development easier, and also
+avoids the possibility of typing errors when duplicating the code in
+multiple places.
+
+The next function is more interesting.  It does most of the work of
+generating a timestamp, which is converting a date and time into some number
+of seconds since the Epoch.  The caller passes an array (rather
+imaginatively named @code{a}) containing six
+values: the year including century, the month as a number between one and 12,
+the day of the month, the hour as a number between zero and 23, the minute in
+the hour, and the seconds within the minute.
+
+The function uses several local variables to precompute the number of
+seconds in an hour, seconds in a day, and seconds in a year.  Often,
+similar C code simply writes out the expression in-line, expecting the
+compiler to do @dfn{constant folding}.  E.g., most C compilers would
+turn @samp{60 * 60} into @samp{3600} at compile time, instead of recomputing
+it every time at run time.  Precomputing these values makes the
+function more efficient.
+
+@findex _tm_addup
+@example
+@c @group
+@c file eg/lib/mktime.awk
+# convert a date into seconds
+function _tm_addup(a,    total, yearsecs, daysecs,
+                         hoursecs, i, j)
+@{
+    hoursecs = 60 * 60
+    daysecs = 24 * hoursecs
+    yearsecs = 365 * daysecs
+
+    total = (a[1] - 1970) * yearsecs
+
+@group
+    # extra day for leap years
+    for (i = 1970; i < a[1]; i++)
+        if (_tm_isleap(i))
+            total += daysecs
+@end group
+
+@group
+    j = _tm_isleap(a[1])
+    for (i = 1; i < a[2]; i++)
+        total += _tm_months[j, i] * daysecs
+@end group
+
+    total += (a[3] - 1) * daysecs
+    total += a[4] * hoursecs
+    total += a[5] * 60
+    total += a[6]
+
+    return total
+@}
+@c endfile
+@c @end group
+@end example
+
+The function starts with a first approximation of all the seconds between
+Midnight, January 1, 1970,@footnote{This is the Epoch on POSIX systems.
+It may be different on other systems.} and the beginning of the current
+year.  It then goes through all those years, and for every leap year,
+adds an additional day's worth of seconds.
+
+The variable @code{j} holds one if the current year is a leap year,
+and zero otherwise.
+For every month in the current year prior to the current month, it adds
+the number of seconds in the month, using the appropriate entry in the
+@code{_tm_months} array.
+
+Finally, it adds in the seconds for the number of days prior to the current
+day, and the number of hours, minutes, and seconds in the current day.
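+
+To make the arithmetic concrete, here is a worked sketch for the date
+@code{@w{"1993 5 23 15 35 10"}}.  It is a stand-alone illustration, not
+part of @file{mktime.awk}, and the variable names are chosen only for
+this example:
+
+@example
+BEGIN @{
+    daysecs = 24 * 60 * 60
+    days  = 23 * 365             # whole years 1970 through 1992
+    days += 6                    # leap days: 1972, 76, 80, 84, 88, 92
+    days += 31 + 28 + 31 + 30    # January through April, 1993
+    days += 23 - 1               # whole days before May 23
+    total = days * daysecs + 15 * 60 * 60 + 35 * 60 + 10
+    printf("%d\n", total)        # prints 738171310
+@}
+@end example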
+
+The result is a count of seconds since January 1, 1970.  This value is not
+yet what is needed though.  The reason why is described shortly.
+
+The main @code{mktime} function takes a single character string argument.
+This string is a representation of a date and time in a ``canonical''
+(fixed) form.  This string should be
+@code{"@var{year} @var{month} @var{day} @var{hour} @var{minute} @var{second}"}.
+
+@findex mktime
+@example
+@c @group
+@c file eg/lib/mktime.awk
+# mktime --- convert a date into seconds,
+#            compensate for time zone
+
+function mktime(str,    res1, res2, a, b, i, j, t, diff)
+@{
+    i = split(str, a, " ")    # don't rely on FS
+
+    if (i != 6)
+        return -1
+
+    # force numeric
+    for (j in a)
+        a[j] += 0
+
+@group
+    # validate
+    if (a[1] < 1970 ||
+        a[2] < 1 || a[2] > 12 ||
+        a[3] < 1 || a[3] > 31 ||
+        a[4] < 0 || a[4] > 23 ||
+        a[5] < 0 || a[5] > 59 ||
+        a[6] < 0 || a[6] > 61 )
+            return -1
+@end group
+
+    res1 = _tm_addup(a)
+    t = strftime("%Y %m %d %H %M %S", res1)
+
+    if (_tm_debug)
+        printf("(%s) -> (%s)\n", str, t) > "/dev/stderr"
+
+    split(t, b, " ")
+    res2 = _tm_addup(b)
+
+    diff = res1 - res2
+
+    if (_tm_debug)
+        printf("diff = %d seconds\n", diff) > "/dev/stderr"
+
+    res1 += diff
+
+    return res1
+@}
+@c endfile
+@c @end group
+@end example
+
+The function first splits the string into an array, using spaces and tabs as
+separators.  If there are not six elements in the array, it returns an
+error, signaled as the value @minus{}1.
+Next, it forces each element of the array to be numeric, by adding zero to it.
+The following @samp{if} statement then makes sure that each element is
+within an allowable range.  (This checking could be extended further, e.g.,
+to make sure that the day of the month is within the correct range for the
+particular month supplied.)  All of this is essentially preliminary set-up
+and error checking.
+
+Recall that @code{_tm_addup} generated a value in seconds since Midnight,
+January 1, 1970.  This value is not directly usable as the result we want,
+@emph{since the calculation does not account for the local timezone}.  In other
+words, the value represents the count in seconds since the Epoch, but only
+for UTC (Coordinated Universal Time).  If the local timezone is east or west
+of UTC, then some number of hours should be either added to, or subtracted from
+the resulting timestamp.
+
+For example, Atlanta, Georgia (USA), is normally five hours west
+of (behind) UTC.  It is only four hours behind UTC if daylight savings
+time is in effect.
+If you are calling @code{mktime} in Atlanta, with the argument
+@code{@w{"1993 5 23 18 23 12"}}, the result from @code{_tm_addup} will be
+for 6:23 p.m. UTC, which is only 2:23 p.m. in Atlanta.  It is necessary to
+add another four hours' worth of seconds to the result.
+
+How can @code{mktime} determine how far away it is from UTC?  This is
+surprisingly easy.  The returned timestamp represents the time passed to
+@code{mktime} @emph{as UTC}.  This timestamp can be fed back to
+@code{strftime}, which will format it as a @emph{local} time; i.e.@: as
+if it already had the UTC difference added in to it.  This is done by
+giving @code{@w{"%Y %m %d %H %M %S"}} to @code{strftime} as the format
+argument.  It returns the computed timestamp in the original string
+format.  The result represents a time that accounts for the UTC
+difference.  When the new time is converted back to a timestamp, the
+difference between the two timestamps is the difference (in seconds)
+between the local timezone and UTC.  This difference is then added back
+to the original result.  An example demonstrating this is presented below.
+
+Finally, there is a ``main'' program for testing the function.
+
+@example
+@c @group
+@c file eg/lib/mktime.awk
+BEGIN  @{
+    if (_tm_test) @{
+        printf "Enter date as yyyy mm dd hh mm ss: "
+        getline _tm_test_date
+    
+        t = mktime(_tm_test_date)
+        r = strftime("%Y %m %d %H %M %S", t)
+        printf "Got back (%s)\n", r
+    @}
+@}
+@c endfile
+@c @end group
+@end example
+
+The entire program uses two variables that can be set on the command
+line to control debugging output and to enable the test in the final
+@code{BEGIN} rule.  Here is the result of a test run. (Note that debugging
+output is to standard error, and test output is to standard output.)
+
+@example
+@c @group
+$ gawk -f mktime.awk -v _tm_test=1 -v _tm_debug=1
+@print{} Enter date as yyyy mm dd hh mm ss: 1993 5 23 15 35 10
+@error{} (1993 5 23 15 35 10) -> (1993 05 23 11 35 10)
+@error{} diff = 14400 seconds
+@print{} Got back (1993 05 23 15 35 10)
+@c @end group
+@end example
+
+The time entered was 3:35 p.m. (15:35 on a 24-hour clock), on May 23, 1993.
+The first line
+of debugging output shows the entered time, treated as UTC and converted to
+local time; the result is four hours earlier, since UTC is four hours ahead of
+the local time zone.  The second line shows that the difference is 14400
+seconds, which is four hours.  (The difference is only four hours, since
+daylight savings time is in effect during May.)
+The final line of test output shows that the timezone compensation
+algorithm works; the returned time is the same as the entered time.
+
+This program does not solve the general problem of turning an arbitrary date
+representation into a timestamp.  That problem is very involved.  However,
+the @code{mktime} function provides a foundation upon which to build. Other
+software can convert month names into numeric months, and AM/PM times into
+24-hour clocks, to generate the ``canonical'' format that @code{mktime}
+requires.
+
+@node Gettimeofday Function, Filetrans Function, Mktime Function, Library Functions
+@section Managing the Time of Day
+
+@cindex formatted timestamps
+@cindex timestamps, formatted
+The @code{systime} and @code{strftime} functions described in
+@ref{Time Functions, ,Functions for Dealing with Time Stamps},
+provide the minimum functionality necessary for dealing with the time of day
+in human readable form.  While @code{strftime} is extensive, the control
+formats are not necessarily easy to remember or intuitively obvious when
+reading a program.
+
+The following function, @code{gettimeofday}, populates a user-supplied array
+with pre-formatted time information.  It returns a string with the current
+time formatted in the same way as the @code{date} utility.
+
+@findex gettimeofday
+@example
+@c @group
+@c file eg/lib/gettime.awk
+# gettimeofday --- get the time of day in a usable format
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain, May 1993
+#
+# Returns a string in the format of output of date(1)
+# Populates the array argument time with individual values:
+#    time["second"]       -- seconds (0 - 59)
+#    time["minute"]       -- minutes (0 - 59)
+#    time["hour"]         -- hours (0 - 23)
+#    time["althour"]      -- hours (1 - 12)
+#    time["monthday"]     -- day of month (1 - 31)
+#    time["month"]        -- month of year (1 - 12)
+#    time["monthname"]    -- name of the month
+#    time["shortmonth"]   -- short name of the month
+#    time["year"]         -- year within century (0 - 99)
+#    time["fullyear"]     -- year with century (19xx or 20xx)
+#    time["weekday"]      -- day of week (Sunday = 0)
+#    time["altweekday"]   -- day of week (Monday = 1)
+#    time["weeknum"]      -- week number, Sunday first day
+#    time["altweeknum"]   -- week number, Monday first day
+#    time["dayname"]      -- name of weekday
+#    time["shortdayname"] -- short name of weekday
+#    time["yearday"]      -- day of year (1 - 366)
+#    time["timezone"]     -- abbreviation of timezone name
+#    time["ampm"]         -- AM or PM designation
+
+@group
+function gettimeofday(time,    ret, now, i)
+@{
+    # get time once, avoids unnecessary system calls
+    now = systime()
+
+    # return date(1)-style output
+    ret = strftime("%a %b %d %H:%M:%S %Z %Y", now)
+
+    # clear out target array
+    for (i in time)
+        delete time[i]
+@end group
+
+@group
+    # fill in values, force numeric values to be
+    # numeric by adding 0
+    time["second"]       = strftime("%S", now) + 0
+    time["minute"]       = strftime("%M", now) + 0
+    time["hour"]         = strftime("%H", now) + 0
+    time["althour"]      = strftime("%I", now) + 0
+    time["monthday"]     = strftime("%d", now) + 0
+    time["month"]        = strftime("%m", now) + 0
+    time["monthname"]    = strftime("%B", now)
+    time["shortmonth"]   = strftime("%b", now)
+    time["year"]         = strftime("%y", now) + 0
+    time["fullyear"]     = strftime("%Y", now) + 0
+    time["weekday"]      = strftime("%w", now) + 0
+    time["altweekday"]   = strftime("%u", now) + 0
+    time["dayname"]      = strftime("%A", now)
+    time["shortdayname"] = strftime("%a", now)
+    time["yearday"]      = strftime("%j", now) + 0
+    time["timezone"]     = strftime("%Z", now)
+    time["ampm"]         = strftime("%p", now)
+    time["weeknum"]      = strftime("%U", now) + 0
+    time["altweeknum"]   = strftime("%W", now) + 0
+
+    return ret
+@}
+@end group
+@c endfile
+@end example
+
+The string indices are easier to use and read than the various formats
+required by @code{strftime}.  The @code{alarm} program presented in
+@ref{Alarm Program, ,An Alarm Clock Program},
+uses this function.
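+
+For instance, a caller might use the function as in the following
+sketch.  It assumes that @file{gettime.awk} has been loaded first; the
+variable names @code{now} and @code{t} are arbitrary:
+
+@example
+BEGIN @{
+    now = gettimeofday(t)
+    print now                  # same layout as date(1) output
+    printf("%02d:%02d on %s, %s %d\n",
+        t["hour"], t["minute"],
+        t["dayname"], t["monthname"], t["monthday"])
+@}
+@end example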
+
+@c exercise!!!
+The @code{gettimeofday} function is presented above as it was written. A
+more general design for this function would have allowed the user to supply
+an optional timestamp value that would have been used instead of the current
+time.
+
+@node Filetrans Function, Getopt Function, Gettimeofday Function, Library Functions
+@section Noting Data File Boundaries
+
+@cindex per file initialization and clean-up
+The @code{BEGIN} and @code{END} rules are each executed exactly once, at
+the beginning and end respectively of your @code{awk} program
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}).
+We (the @code{gawk} authors) once had a user who mistakenly thought that the
+@code{BEGIN} rule was executed at the beginning of each data file and the
+@code{END} rule was executed at the end of each data file.  When informed
+that this was not the case, the user requested that we add new special
+patterns to @code{gawk}, named @code{BEGIN_FILE} and @code{END_FILE}, that
+would have the desired behavior.  He even supplied us with the code to do so.
+
+However, after a little thought, I came up with the following library program.
+It arranges to call two user-supplied functions, @code{beginfile} and
+@code{endfile}, at the beginning and end of each data file.
+Besides solving the problem in only nine(!) lines of code, it does so
+@emph{portably}; this will work with any implementation of @code{awk}.
+
+@example
+@c @group
+# transfile.awk
+#
+# Give the user a hook for filename transitions
+#
+# The user must supply functions beginfile() and endfile()
+# that each take the name of the file being started or
+# finished, respectively.
+#
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, January 1992
+# Public Domain
+
+FILENAME != _oldfilename \
+@{
+    if (_oldfilename != "")
+        endfile(_oldfilename)
+    _oldfilename = FILENAME
+    beginfile(FILENAME)
+@}
+
+END   @{ endfile(FILENAME) @}
+@c @end group
+@end example
+
+This file must be loaded before the user's ``main'' program, so that the
+rule it supplies will be executed first.
+
+This rule relies on @code{awk}'s @code{FILENAME} variable that
+automatically changes for each new data file.  The current file name is
+saved in a private variable, @code{_oldfilename}.  If @code{FILENAME} does
+not equal @code{_oldfilename}, then a new data file is being processed, and
+it is necessary to call @code{endfile} for the old file.  Since
+@code{endfile} should only be called if a file has been processed, the
+program first checks to make sure that @code{_oldfilename} is not the null
+string.  The program then assigns the current file name to
+@code{_oldfilename}, and calls @code{beginfile} for the file.
+Since, like all @code{awk} variables, @code{_oldfilename} will be
+initialized to the null string, this rule executes correctly even for the
+first data file.
+
+The program also supplies an @code{END} rule, to do the final processing for
+the last file.  Since this @code{END} rule comes before any @code{END} rules
+supplied in the ``main'' program, @code{endfile} will be called first.  Once
+again the value of multiple @code{BEGIN} and @code{END} rules should be clear.
+
+@findex beginfile
+@findex endfile
+This version has the same problem as the first version of @code{nextfile}
+(@pxref{Nextfile Function, ,Implementing @code{nextfile} as a Function}).
+If the same data file occurs twice in a row on the command line, then
+@code{endfile} and @code{beginfile} will not be executed at the end of the
+first pass and at the beginning of the second pass.
+The following version solves the problem.
+
+@example
+@c @group
+@c file eg/lib/ftrans.awk
+# ftrans.awk --- handle data file transitions
+#
+# user supplies beginfile() and endfile() functions
+#
+# Arnold Robbins, arnold@@gnu.ai.mit.edu. November 1992
+# Public Domain
+
+FNR == 1 @{
+    if (_filename_ != "")
+        endfile(_filename_)
+    _filename_ = FILENAME
+    beginfile(FILENAME)
+@}
+
+END  @{ endfile(_filename_) @}
+@c endfile
+@c @end group
+@end example
+
+In @ref{Wc Program, ,Counting Things},
+you will see how this library function can be used, and
+how it simplifies writing the main program.
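+
+As a smaller illustration, here is a minimal sketch that reports the
+number of records in each data file.  It assumes that @file{ftrans.awk}
+is loaded first; the variable @code{_nrecs} is hypothetical:
+
+@example
+# count the records in each data file
+function beginfile(name) @{ _nrecs = 0 @}
+function endfile(name)   @{ printf("%s: %d records\n", name, _nrecs) @}
+
+@{ _nrecs++ @}
+@end example
+
+@noindent
+Because the rule in @file{ftrans.awk} runs before the counting rule,
+the count is reset by @code{beginfile} before the first record of each
+file is counted.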
+
+@node Getopt Function, Passwd Functions, Filetrans Function, Library Functions
+@section Processing Command Line Options
+
+@cindex @code{getopt}, C version
+@cindex processing arguments
+@cindex argument processing
+Most utilities on POSIX compatible systems take options or ``switches'' on
+the command line that can be used to change the way a program behaves.
+@code{awk} is an example of such a program
+(@pxref{Options, ,Command Line Options}).
+Often, options take @dfn{arguments}, data that the program needs to
+correctly obey the command line option.  For example, @code{awk}'s
+@samp{-F} option requires a string to use as the field separator.
+The first occurrence on the command line of either @samp{--} or a
+string that does not begin with @samp{-} ends the options.
+
+Most Unix systems provide a C function named @code{getopt} for processing
+command line arguments.  The programmer provides a string describing the one
+letter options. If an option requires an argument, it is followed in the
+string by a colon.  @code{getopt} is also passed the
+count and values of the command line arguments, and is called in a loop.
+@code{getopt} processes the command line arguments for option letters.
+Each time around the loop, it returns a single character representing the
+next option letter that it found, or @samp{?} if it found an invalid option.
+When it returns @minus{}1, there are no options left on the command line.
+
+When using @code{getopt}, options that do not take arguments can be
+grouped together.  Furthermore, options that take arguments require that the
+argument be present.  The argument can immediately follow the option letter,
+or it can be a separate command line argument.
+
+Given a hypothetical program that takes
+three command line options, @samp{-a}, @samp{-b}, and @samp{-c}, and
+@samp{-b} requires an argument, all of the following are valid ways of
+invoking the program:
+
+@example
+@c @group
+prog -a -b foo -c data1 data2 data3
+prog -ac -bfoo -- data1 data2 data3
+prog -acbfoo data1 data2 data3
+@c @end group
+@end example
+
+Notice that when the argument is grouped with its option, the rest of
+the command line argument is considered to be the option's argument.
+In the above example, @samp{-acbfoo} indicates that all of the
+@samp{-a}, @samp{-b}, and @samp{-c} options were supplied,
+and that @samp{foo} is the argument to the @samp{-b} option.
+
+@code{getopt} provides four external variables that the programmer can use.
+
+@table @code
+@item optind
+The index in the argument value array (@code{argv}) where the first
+non-option command line argument can be found.
+
+@item optarg
+The string value of the argument to an option.
+
+@item opterr
+Usually @code{getopt} prints an error message when it finds an invalid
+option.  Setting @code{opterr} to zero disables this feature.  (An
+application might wish to print its own error message.)
+
+@item optopt
+The letter representing the command line option.
+While not usually documented, most versions supply this variable.
+@end table
+
+The following C fragment shows how @code{getopt} might process command line
+arguments for @code{awk}.
+
+@example
+@group
+int
+main(int argc, char *argv[])
+@{
+    @dots{}
+    /* print our own message */
+    opterr = 0;
+@end group
+@group
+    while ((c = getopt(argc, argv, "v:f:F:W:")) != -1) @{
+        switch (c) @{
+        case 'f':    /* file */
+            @dots{}
+            break;
+        case 'F':    /* field separator */
+            @dots{}
+            break;
+        case 'v':    /* variable assignment */
+            @dots{}
+            break;
+        case 'W':    /* extension */
+            @dots{}
+            break;
+        case '?':
+        default:
+            usage();
+            break;
+        @}
+    @}
+    @dots{}
+@}
+@end group
+@end example
+
+As a side point, @code{gawk} actually uses the GNU @code{getopt_long}
+function to process both normal and GNU-style long options
+(@pxref{Options, ,Command Line Options}).
+
+The abstraction provided by @code{getopt} is very useful, and would be quite
+handy in @code{awk} programs as well.  Here is an @code{awk} version of
+@code{getopt}.  This function highlights one of the greatest weaknesses in
+@code{awk}, which is that it is very poor at manipulating single characters.
+Repeated calls to @code{substr} are necessary for accessing individual
+characters (@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+
+The discussion walks through the code a bit at a time.
+
+@example
+@c @group
+@c file eg/lib/getopt.awk
+# getopt --- do C library getopt(3) function in awk
+#
+# arnold@@gnu.ai.mit.edu
+# Public domain
+#
+# Initial version: March, 1991
+# Revised: May, 1993
+
+# External variables:
+#    Optind -- index of ARGV for first non-option argument
+#    Optarg -- string value of argument to current option
+#    Opterr -- if non-zero, print our own diagnostic
+#    Optopt -- current option letter
+
+# Returns
+#    -1     at end of options
+#    ?      for unrecognized option
+#    <c>    a character representing the current option
+
+# Private Data
+#    _opti  index in multi-flag option, e.g., -abc
+@c endfile
+@c @end group
+@end example
+
+The function starts out with some documentation: who wrote the code,
+and when it was revised, followed by a list of the global variables it uses,
+what the return values are and what they mean, and any global variables that
+are ``private'' to this library function.  Such documentation is essential
+for any program, and particularly for library functions.
+
+@findex getopt
+@example
+@c @group
+@c file eg/lib/getopt.awk
+function getopt(argc, argv, options,    optl, thisopt, i)
+@{
+    optl = length(options)
+    if (optl == 0)        # no options given
+        return -1
+
+    if (argv[Optind] == "--") @{  # all done
+        Optind++
+        _opti = 0
+        return -1
+    @} else if (argv[Optind] !~ /^-[^: \t\n\f\r\v\b]/) @{
+        _opti = 0
+        return -1
+    @}
+@c endfile
+@c @end group
+@end example
+
+The function first checks that it was indeed called with a string of options
+(the @code{options} parameter).  If @code{options} has a zero length,
+@code{getopt} immediately returns @minus{}1.
+
+The next thing to check for is the end of the options.  A @samp{--} ends the
+command line options, as does any command line argument that does not begin
+with a @samp{-}.  @code{Optind} is used to step through the array of command
+line arguments; it retains its value across calls to @code{getopt}, since it
+is a global variable.
+
+The regexp used, @code{@w{/^-[^: \t\n\f\r\v\b]/}}, is
+perhaps a bit of overkill; it checks for a @samp{-} followed by anything
+that is not whitespace and not a colon.
+If the current command line argument does not match this pattern,
+it is not an option, and it ends option processing.
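+
+As a quick check of what this regexp accepts, here is a minimal sketch.
+The helper name @code{looks_like_option} and the sample strings are
+made up for illustration; they are not part of @file{getopt.awk}.
+
```awk
# Hypothetical helper: reports whether a string looks like an option,
# using the same regexp as the getopt function above.
function looks_like_option(arg)
{
    return arg ~ /^-[^: \t\n\f\r\v\b]/
}

BEGIN {
    n = split("-a -cbARG - -: data1", samples, " ")
    for (k = 1; k <= n; k++)
        printf("<%s> %s\n", samples[k],
               looks_like_option(samples[k]) ? "option" : "not an option")
}
```
+
+Only @samp{-a} and @samp{-cbARG} are reported as options; a lone
+@samp{-}, a @samp{-} followed by a colon, and a plain word are not.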
+
+@example
+@group
+@c file eg/lib/getopt.awk
+    if (_opti == 0)
+        _opti = 2
+    thisopt = substr(argv[Optind], _opti, 1)
+    Optopt = thisopt
+    i = index(options, thisopt)
+    if (i == 0) @{
+        if (Opterr)
+            printf("%c -- invalid option\n",
+                                  thisopt) > "/dev/stderr"
+        if (_opti >= length(argv[Optind])) @{
+            Optind++
+            _opti = 0
+        @} else
+            _opti++
+        return "?"
+    @}
+@c endfile
+@end group
+@end example
+
+The @code{_opti} variable tracks the position in the current command line
+argument (@code{argv[Optind]}).  In the case that multiple options were
+grouped together with one @samp{-} (e.g., @samp{-abx}), it is necessary
+to return them to the user one at a time.
+
+If @code{_opti} is equal to zero, it is set to two, the index in the string
+of the next character to look at (we skip the @samp{-}, which is at position
+one).  The variable @code{thisopt} holds the character, obtained with
+@code{substr}.  It is saved in @code{Optopt} for the main program to use.
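+
+The character-at-a-time stepping can be sketched on its own.  The
+helper @code{letters_in} below is hypothetical and deliberately
+simplified: it ignores option arguments and only shows how
+@code{substr} walks a grouped argument starting at position two.
+
```awk
# Hypothetical helper: returns the option letters found in one grouped
# argument such as "-abx", separated by spaces.
function letters_in(arg,    pos, out)
{
    # position 1 holds the "-", so start at position 2
    for (pos = 2; pos <= length(arg); pos++)
        out = out (out == "" ? "" : " ") substr(arg, pos, 1)
    return out
}

BEGIN { print letters_in("-abx") }    # prints "a b x"
```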
+
+If @code{thisopt} is not in the @code{options} string, then it is an
+invalid option.  If @code{Opterr} is non-zero, @code{getopt} prints an error
+message on the standard error that is similar to the message from the C
+version of @code{getopt}.
+
+Since the option is invalid, it is necessary to skip it and move on to the
+next option character.  If @code{_opti} is greater than or equal to the
+length of the current command line argument, then it is necessary to move on
+to the next one, so @code{Optind} is incremented and @code{_opti} is reset
+to zero. Otherwise, @code{Optind} is left alone and @code{_opti} is merely
+incremented.
+
+In any case, since the option was invalid, @code{getopt} returns @samp{?}.
+The main program can examine @code{Optopt} if it needs to know what the
+invalid option letter actually was.
+
+@example
+@group
+@c file eg/lib/getopt.awk
+    if (substr(options, i + 1, 1) == ":") @{
+        # get option argument
+        if (length(substr(argv[Optind], _opti + 1)) > 0)
+            Optarg = substr(argv[Optind], _opti + 1)
+        else
+            Optarg = argv[++Optind]
+        _opti = 0
+    @} else
+        Optarg = ""
+@c endfile
+@end group
+@end example
+
+If the option requires an argument, the option letter is followed by a colon
+in the @code{options} string.  If there are remaining characters in the
+current command line argument (@code{argv[Optind]}), then the rest of that
+string is assigned to @code{Optarg}.  Otherwise, the next command line
+argument is used (@samp{-xFOO} vs. @samp{@w{-x FOO}}). In either case,
+@code{_opti} is reset to zero, since there are no more characters left to
+examine in the current command line argument.
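+
+The two forms can be compared side by side with a small sketch.  The
+function @code{option_argument} and the arrays @code{a} and @code{b}
+are assumptions made for this illustration only, and the sketch
+assumes the option letter is the first one after the @samp{-}.
+
```awk
# Hypothetical helper, not part of getopt.awk.  args[idx] holds an option
# that takes an argument; the option letter is at position 2.
function option_argument(args, idx,    rest)
{
    rest = substr(args[idx], 3)   # whatever follows the option letter
    if (length(rest) > 0)
        return rest               # attached form: -xFOO
    return args[idx + 1]          # separate form: -x FOO
}

BEGIN {
    a[1] = "-xFOO"
    b[1] = "-x"; b[2] = "FOO"
    print option_argument(a, 1)
    print option_argument(b, 1)
}
```
+
+Both calls print @samp{FOO}.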
+
+@example
+@c @group
+@c file eg/lib/getopt.awk
+    if (_opti == 0 || _opti >= length(argv[Optind])) @{
+        Optind++
+        _opti = 0
+    @} else
+        _opti++
+    return thisopt
+@}
+@c endfile
+@c @end group
+@end example
+
+Finally, if @code{_opti} is either zero or greater than or equal to the
+length of the current command line argument, it means this element in
+@code{argv} is
+through being processed, so @code{Optind} is incremented to point to the
+next element in @code{argv}.  If neither condition is true, then only
+@code{_opti} is incremented, so that the next option letter can be processed
+on the next call to @code{getopt}.
+
+@example
+@c @group
+@c file eg/lib/getopt.awk
+BEGIN @{
+    Opterr = 1    # default is to diagnose
+    Optind = 1    # skip ARGV[0]
+
+    # test program
+    if (_getopt_test) @{
+        while ((_go_c = getopt(ARGC, ARGV, "ab:cd")) != -1)
+            printf("c = <%c>, optarg = <%s>\n",
+                                       _go_c, Optarg)
+        printf("non-option arguments:\n")
+        for (; Optind < ARGC; Optind++)
+            printf("\tARGV[%d] = <%s>\n",
+                                    Optind, ARGV[Optind])
+    @}
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{BEGIN} rule initializes both @code{Opterr} and @code{Optind} to one.
+@code{Opterr} is set to one, since the default behavior is for @code{getopt}
+to print a diagnostic message upon seeing an invalid option.  @code{Optind}
+is set to one, since there's no reason to look at the program name, which is
+in @code{ARGV[0]}.
+
+The rest of the @code{BEGIN} rule is a simple test program.  Here is the
+result of two sample runs of the test program.
+
+@example
+@group
+$ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x
+@print{} c = <a>, optarg = <>
+@print{} c = <c>, optarg = <>
+@print{} c = <b>, optarg = <ARG>
+@print{} non-option arguments:
+@print{}         ARGV[3] = <bax>
+@print{}         ARGV[4] = <-x>
+@end group
+
+@group
+$ awk -f getopt.awk -v _getopt_test=1 -- -a -x -- xyz abc
+@print{} c = <a>, optarg = <>
+@error{} x -- invalid option
+@print{} c = <?>, optarg = <>
+@print{} non-option arguments:
+@print{}         ARGV[4] = <xyz>
+@print{}         ARGV[5] = <abc>
+@end group
+@end example
+
+The first @samp{--} terminates the arguments to @code{awk}, so that it does
+not try to interpret the @samp{-a} etc. as its own options.
+
+Several of the sample programs presented in
+@ref{Sample Programs, ,Practical @code{awk} Programs},
+use @code{getopt} to process their arguments.
+
+@node Passwd Functions, Group Functions, Getopt Function, Library Functions
+@section Reading the User Database
+
+@cindex @file{/dev/user}
+The @file{/dev/user} special file
+(@pxref{Special Files, ,Special File Names in @code{gawk}})
+provides access to the current user's real and effective user and group id
+numbers, and if available, the user's supplementary group set.
+However, since these are numbers, they do not provide very useful
+information to the average user.  There needs to be some way to find the
+user information associated with the user and group numbers.  This
+section presents a suite of functions for retrieving information from the
+user database.  @xref{Group Functions, ,Reading the Group Database},
+for a similar suite that retrieves information from the group database.
+
+@cindex @code{getpwent}, C version
+@cindex user information
+@cindex login information
+@cindex account information
+@cindex password file
+The POSIX standard does not define the file where user information is
+kept.  Instead, it provides the @code{<pwd.h>} header file
+and several C language subroutines for obtaining user information.
+The primary function is @code{getpwent}, for ``get password entry.''
+The ``password'' comes from the original user database file,
+@file{/etc/passwd}, which kept user information, along with the
+encrypted passwords (hence the name).
+
+While an @code{awk} program could simply read @file{/etc/passwd} directly
+(the format is well known), because of the way password
+files are handled on networked systems,
+this file may not contain complete information about the system's set of users.
+
+@cindex @code{pwcat} program
+To be sure of being
+able to produce a readable, complete version of the user database, it is
+necessary to write a small C program that calls @code{getpwent}.
+@code{getpwent} is defined to return a pointer to a @code{struct passwd}.
+Each time it is called, it returns the next entry in the database.
+When there are no more entries, it returns @code{NULL}, the null pointer.
+When this happens, the C program should call @code{endpwent} to close the
+database.
+Here is @code{pwcat}, a C program that ``cats'' the password database.
+
+@findex pwcat.c
+@example
+@c @group
+@c file eg/lib/pwcat.c
+/*
+ * pwcat.c
+ *
+ * Generate a printable version of the password database
+ *
+ * Arnold Robbins
+ * arnold@@gnu.ai.mit.edu
+ * May 1993
+ * Public Domain
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <pwd.h>
+
+int
+main(argc, argv)
+int argc;
+char **argv;
+@{
+    struct passwd *p;
+
+    while ((p = getpwent()) != NULL)
+        printf("%s:%s:%d:%d:%s:%s:%s\n",
+            p->pw_name, p->pw_passwd, p->pw_uid,
+            p->pw_gid, p->pw_gecos, p->pw_dir, p->pw_shell);
+
+    endpwent();
+    exit(0);
+@}
+@c endfile
+@c @end group
+@end example
+
+If you don't understand C, don't worry about it.
+The output from @code{pwcat} is the user database, in the traditional
+@file{/etc/passwd} format of colon-separated fields.  The fields are:
+
+@table @asis
+@item Login name
+The user's login name.
+
+@item Encrypted password
+The user's encrypted password.  This may not be available on some systems.
+
+@item User-ID
+The user's numeric user-id number.
+
+@item Group-ID
+The user's numeric group-id number.
+
+@item Full name
+The user's full name, and perhaps other information associated with the
+user.
+
+@item Home directory
+The user's login, or ``home'' directory (familiar to shell programmers as
+@code{$HOME}).
+
+@item Login shell
+The program that will be run when the user logs in.  This is usually a
+shell, such as Bash (the GNU Bourne-Again shell).
+@end table
+
+Here are a few lines representative of @code{pwcat}'s output.
+
+@example
+@c @group
+$ pwcat
+@print{} root:3Ov02d5VaUPB6:0:1:Operator:/:/bin/sh
+@print{} nobody:*:65534:65534::/:
+@print{} daemon:*:1:1::/:
+@print{} sys:*:2:2::/:/bin/csh
+@print{} bin:*:3:3::/bin:
+@print{} arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
+@print{} miriam:yxaay:112:10:Miriam Robbins:/home/miriam:/bin/sh
+@dots{}
+@c @end group
+@end example
+
+With that introduction, here is a group of functions for getting user
+information.  There are several functions here, corresponding to the C
+functions of the same name.
+
+@findex _pw_init
+@example
+@c file eg/lib/passwdawk.in
+@group
+# passwd.awk --- access password file information
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+BEGIN @{
+    # tailor this to suit your system
+    _pw_awklib = "/usr/local/libexec/awk/"
+@}
+@end group
+
+function _pw_init(    oldfs, oldrs, olddol0, pwcat)
+@{
+    if (_pw_inited)
+        return
+    oldfs = FS
+    oldrs = RS
+    olddol0 = $0
+    FS = ":"
+    RS = "\n"
+    pwcat = _pw_awklib "pwcat"
+    while ((pwcat | getline) > 0) @{
+        _pw_byname[$1] = $0
+        _pw_byuid[$3] = $0
+        _pw_bycount[++_pw_total] = $0
+    @}
+    close(pwcat)
+    _pw_count = 0
+    _pw_inited = 1
+    FS = oldfs
+    RS = oldrs
+    $0 = olddol0
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{BEGIN} rule sets a private variable to the directory where
+@code{pwcat} is stored.  Since it is used to help out an @code{awk} library
+routine, we have chosen to put it in @file{/usr/local/libexec/awk}.
+You might want it to be in a different directory on your system.
+
+The function @code{_pw_init} keeps three copies of the user information
+in three associative arrays.  The arrays are indexed by user name
+(@code{_pw_byname}), by user-id number (@code{_pw_byuid}), and by order of
+occurrence (@code{_pw_bycount}).
+
+The variable @code{_pw_inited} is used for efficiency; the database only
+needs to be read once, no matter how many times @code{_pw_init} is called.
+
+Since this function uses @code{getline} to read information from
+@code{pwcat}, it first saves the values of @code{FS}, @code{RS}, and
+@code{$0}.  Doing so is necessary, since these functions could be called
+from anywhere within a user's program, and the user may have his or her
+own values for @code{FS} and @code{RS}.
+@ignore
+Problem, what if FIELDWIDTHS is in use? Sigh.
+@end ignore
+
+The main part of the function uses a loop to read database lines, split
+the line into fields, and then store the line into each array as necessary.
+When the loop is done, @code{@w{_pw_init}} cleans up by closing the pipeline,
+setting @code{@w{_pw_inited}} to one, and restoring @code{FS}, @code{RS}, and
+@code{$0}.  The use of @code{@w{_pw_count}} will be explained below.
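+
+The save-and-restore pattern can be tried in isolation.  The following
+is only a sketch: the function name @code{first_field_of} is invented,
+and it assumes that an @code{echo} command is available to stand in
+for @code{pwcat}.
+
```awk
# Hypothetical helper: runs cmd, reads one colon-separated line from it,
# and returns the first field, leaving FS, RS, and $0 as it found them.
function first_field_of(cmd,    oldfs, oldrs, olddol0, result)
{
    oldfs = FS
    oldrs = RS
    olddol0 = $0
    FS = ":"
    RS = "\n"
    if ((cmd | getline) > 0)
        result = $1
    close(cmd)
    FS = oldfs
    RS = oldrs
    $0 = olddol0      # re-split the caller's record with the caller's FS
    return result
}

BEGIN {
    FS = ","
    $0 = "x,y,z"
    print first_field_of("echo root:0:0")
    print $2          # the caller's record and FS are intact
}
```
+
+The caller's comma-separated record survives the call, so the second
+@code{print} still sees @samp{y} in @code{$2}.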
+
+@findex getpwnam
+@example
+@group
+@c file eg/lib/passwdawk.in
+function getpwnam(name)
+@{
+    _pw_init()
+    if (name in _pw_byname)
+        return _pw_byname[name]
+    return ""
+@}
+@c endfile
+@end group
+@end example
+
+The @code{getpwnam} function takes a user name as a string argument. If that
+user is in the database, it returns the appropriate line. Otherwise it
+returns the null string.
+
+@findex getpwuid
+@example
+@group
+@c file eg/lib/passwdawk.in
+function getpwuid(uid)
+@{
+    _pw_init()
+    if (uid in _pw_byuid)
+        return _pw_byuid[uid]
+    return ""
+@}
+@c endfile
+@end group
+@end example
+
+Similarly,
+the @code{getpwuid} function takes a user-id number argument. If that
+user number is in the database, it returns the appropriate line. Otherwise it
+returns the null string.
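+
+Since both functions return a whole database line, the caller usually
+breaks it apart with @code{split}.  Here is a minimal sketch; the
+helper @code{pw_field} is hypothetical, and the record is a literal
+stand-in for what @code{getpwnam("arnold")} might return.
+
```awk
# Hypothetical helper: returns field number n of a passwd-style record.
function pw_field(record, n,    parts)
{
    split(record, parts, ":")
    return parts[n]
}

BEGIN {
    # stand-in for the string that getpwnam("arnold") would return
    record = "arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh"
    printf("%s has uid %d and home directory %s\n",
           pw_field(record, 1), pw_field(record, 3), pw_field(record, 6))
}
```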
+
+@findex getpwent
+@example
+@c @group
+@c file eg/lib/passwdawk.in
+function getpwent()
+@{
+    _pw_init()
+    if (_pw_count < _pw_total)
+        return _pw_bycount[++_pw_count]
+    return ""
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{getpwent} function simply steps through the database, one entry at
+a time.  It uses @code{_pw_count} to track its current position in the
+@code{_pw_bycount} array.
+
+@findex endpwent
+@example
+@c @group
+@c file eg/lib/passwdawk.in
+function endpwent()
+@{
+    _pw_count = 0
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{@w{endpwent}} function resets @code{@w{_pw_count}} to zero, so that
+subsequent calls to @code{getpwent} will start over again.
+
+A conscious design decision in this suite is that each subroutine calls
+@code{@w{_pw_init}} to initialize the database arrays.  The overhead of running
+a separate process to generate the user database, and the I/O to scan it,
+will only be incurred if the user's main program actually calls one of these
+functions.  If this library file is loaded along with a user's program, but
+none of the routines are ever called, then there is no extra run-time overhead.
+(The alternative would be to move the body of @code{@w{_pw_init}} into a
+@code{BEGIN} rule, which would always run @code{pwcat}.  This simplifies the
+code but runs an extra process that may never be needed.)
+
+In turn, calling @code{_pw_init} is not too expensive, since the
+@code{_pw_inited} variable keeps the program from reading the data more than
+once.  If you are worried about squeezing every last cycle out of your
+@code{awk} program, the check of @code{_pw_inited} could be moved out of
+@code{_pw_init} and duplicated in all the other functions.  In practice,
+this is not necessary, since most @code{awk} programs are I/O bound, and it
+would clutter up the code.
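+
+The effect of the guard variable can be seen in a stripped-down
+sketch.  All of the names here (@code{_demo_init}, @code{_demo_inited},
+@code{_demo_loads}, @code{demo_lookup}) are invented; the counter
+merely stands in for the expensive step of running @code{pwcat}.
+
```awk
function _demo_init()
{
    if (_demo_inited)
        return
    _demo_loads++          # stands in for the expensive pwcat run
    _demo_inited = 1
}

function demo_lookup(key)
{
    _demo_init()           # cheap after the first call
    return "value for " key
}

BEGIN {
    demo_lookup("a"); demo_lookup("b"); demo_lookup("c")
    print "expensive work ran " _demo_loads " time(s)"
}
```
+
+Three lookups trigger the expensive step only once, and a program that
+never calls @code{demo_lookup} never triggers it at all.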
+
+The @code{id} program in @ref{Id Program, ,Printing Out User Information},
+uses these functions.
+
+@node Group Functions, Library Names, Passwd Functions, Library Functions
+@section Reading the Group Database
+
+@cindex @code{getgrent}, C version
+@cindex group information
+@cindex account information
+@cindex group file
+Much of the discussion presented in
+@ref{Passwd Functions, ,Reading the User Database},
+applies to the group database as well.  Although there has traditionally
+been a well known file, @file{/etc/group}, in a well known format, the POSIX
+standard only provides a set of C library routines
+(@code{<grp.h>} and @code{getgrent})
+for accessing the information.
+Even though this file may exist, it likely does not have
+complete information.  Therefore, as with the user database, it is necessary
+to have a small C program that generates the group database as its output.
+
+@cindex @code{grcat} program
+Here is @code{grcat}, a C program that ``cats'' the group database.
+
+@findex grcat.c
+@example
+@c @group
+@c file eg/lib/grcat.c
+/*
+ * grcat.c
+ *
+ * Generate a printable version of the group database
+ *
+ * Arnold Robbins, arnold@@gnu.ai.mit.edu
+ * May 1993
+ * Public Domain
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <grp.h>
+
+@group
+int
+main(argc, argv)
+int argc;
+char **argv;
+@{
+    struct group *g;
+    int i;
+@end group
+
+    while ((g = getgrent()) != NULL) @{
+        printf("%s:%s:%d:", g->gr_name, g->gr_passwd,
+                                            g->gr_gid);
+        for (i = 0; g->gr_mem[i] != NULL; i++) @{
+            printf("%s", g->gr_mem[i]);
+            if (g->gr_mem[i+1] != NULL)
+                putchar(',');
+        @}
+        putchar('\n');
+    @}
+    endgrent();
+    exit(0);
+@}
+@c endfile
+@c @end group
+@end example
+
+Each line in the group database represents one group.  The fields are
+separated with colons, and represent the following information.
+
+@table @asis
+@item Group Name
+The name of the group.
+
+@item Group Password
+The encrypted group password. In practice, this field is never used. It is
+usually empty, or set to @samp{*}.
+
+@item Group ID Number
+The numeric group-id number. This number should be unique within the file.
+
+@item Group Member List
+A comma-separated list of user names.  These users are members of the group.
+Most Unix systems allow users to be members of several groups
+simultaneously.  If your system does, then reading @file{/dev/user} will
+return those group-id numbers in @code{$5} through @code{$NF}.
+(Note that @file{/dev/user} is a @code{gawk} extension;
+@pxref{Special Files, ,Special File Names in @code{gawk}}.)
+@end table
+
+@iftex
+@page
+@end iftex
+Here is what running @code{grcat} might produce:
+
+@example
+@group
+$ grcat
+@print{} wheel:*:0:arnold
+@print{} nogroup:*:65534:
+@print{} daemon:*:1:
+@print{} kmem:*:2:
+@print{} staff:*:10:arnold,miriam,andy
+@print{} other:*:20:
+@dots{}
+@end group
+@end example
+
+Here are the functions for obtaining information from the group database.
+There are several, modeled after the C library functions of the same names.
+
+@findex _gr_init
+@example
+@group
+@c file eg/lib/groupawk.in
+# group.awk --- functions for dealing with the group file
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+BEGIN    \
+@{
+    # Change to suit your system
+    _gr_awklib = "/usr/local/libexec/awk/"
+@}
+@c endfile
+@end group
+
+@group
+@c file eg/lib/groupawk.in
+function _gr_init(    oldfs, oldrs, olddol0, grcat, n, a, i)
+@{
+    if (_gr_inited)
+        return
+@end group
+
+@group
+    oldfs = FS
+    oldrs = RS
+    olddol0 = $0
+    FS = ":"
+    RS = "\n"
+@end group
+
+@group
+    grcat = _gr_awklib "grcat"
+    while ((grcat | getline) > 0) @{
+        if ($1 in _gr_byname)
+            _gr_byname[$1] = _gr_byname[$1] "," $4
+        else
+            _gr_byname[$1] = $0
+        if ($3 in _gr_bygid)
+            _gr_bygid[$3] = _gr_bygid[$3] "," $4
+        else
+            _gr_bygid[$3] = $0
+
+        n = split($4, a, "[ \t]*,[ \t]*")
+@end group
+@group
+        for (i = 1; i <= n; i++)
+            if (a[i] in _gr_groupsbyuser)
+                _gr_groupsbyuser[a[i]] = \
+                    _gr_groupsbyuser[a[i]] " " $1
+            else
+                _gr_groupsbyuser[a[i]] = $1
+@end group
+
+@group
+        _gr_bycount[++_gr_count] = $0
+    @}
+@end group
+@group
+    close(grcat)
+    _gr_count = 0
+    _gr_inited++
+    FS = oldfs
+    RS = oldrs
+    $0 = olddol0
+@}
+@c endfile
+@end group
+@end example
+
+The @code{BEGIN} rule sets a private variable to the directory where
+@code{grcat} is stored.  Since it is used to help out an @code{awk} library
+routine, we have chosen to put it in @file{/usr/local/libexec/awk}.  You might
+want it to be in a different directory on your system.
+
+These routines follow the same general outline as the user database routines
+(@pxref{Passwd Functions, ,Reading the User Database}).
+The @code{@w{_gr_inited}} variable is used to
+ensure that the database is scanned no more than once.
+The @code{@w{_gr_init}} function first saves @code{FS}, @code{RS}, and
+@code{$0}, and then sets @code{FS} and @code{RS} to the correct values for
+scanning the group information.
+
+The group information is stored in several associative arrays.
+The arrays are indexed by group name (@code{@w{_gr_byname}}), by group-id number
+(@code{@w{_gr_bygid}}), and by position in the database (@code{@w{_gr_bycount}}).
+There is an additional array indexed by user name (@code{@w{_gr_groupsbyuser}}),
+whose values are space-separated lists of the groups that each user belongs to.
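+
+The way the per-user list is built can be sketched separately from the
+rest of @code{_gr_init}.  The function @code{add_group} and the array
+@code{_demo_groups} are hypothetical names, and the group data is taken
+from the sample @code{grcat} output shown earlier.
+
```awk
# Hypothetical helper: records that each user named in the
# comma-separated list is a member of the group.
function add_group(name, members,    n, a, i)
{
    n = split(members, a, "[ \t]*,[ \t]*")
    for (i = 1; i <= n; i++)
        if (a[i] in _demo_groups)
            _demo_groups[a[i]] = _demo_groups[a[i]] " " name
        else
            _demo_groups[a[i]] = name
}

BEGIN {
    add_group("wheel", "arnold")
    add_group("staff", "arnold,miriam,andy")
    print "arnold: " _demo_groups["arnold"]
    print "miriam: " _demo_groups["miriam"]
}
```
+
+The first line printed lists both @samp{wheel} and @samp{staff} for
+@samp{arnold}; the second lists only @samp{staff} for @samp{miriam}.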
+
+Unlike the user database, it is possible to have multiple records in the
+database for the same group.  This is common when a group has a large number
+of members.  Such a pair of entries might look like:
+
+@example
+tvpeople:*:101:johny,jay,arsenio
+tvpeople:*:101:david,conan,tom,joan
+@end example
+
+For this reason, @code{_gr_init} looks to see if a group name or
+group-id number has already been seen.  If it has, then the user names are
+simply concatenated onto the previous list of users.  (There is actually a
+subtle problem with the code presented above.  Suppose that
+the first time there were no names. This code adds the names with
+a leading comma. It also doesn't check that there is a @code{$4}.)
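+
+One possible way to avoid the stray comma is sketched below.  This is
+an illustration only, not the code in @file{group.awk}; the function
+name @code{append_members} is made up.
+
```awk
# Hypothetical helper: appends a comma-separated member list to a stored
# group record without producing an empty list element.
function append_members(record, members)
{
    if (members == "")      # nothing to add
        return record
    if (record ~ /:$/)      # stored record has no members yet
        return record members
    return record "," members
}

BEGIN {
    print append_members("tvpeople:*:101:johny,jay", "david,conan")
    print append_members("tvpeople:*:101:", "david,conan")
    print append_members("tvpeople:*:101:johny,jay", "")
}
```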
+
+Finally, @code{_gr_init} closes the pipeline to @code{grcat}, restores
+@code{FS}, @code{RS}, and @code{$0}, initializes @code{_gr_count} to zero
+(it is used later), and makes @code{_gr_inited} non-zero.
+
+@findex getgrnam
+@example
+@c @group
+@c file eg/lib/groupawk.in
+function getgrnam(group)
+@{
+    _gr_init()
+    if (group in _gr_byname)
+        return _gr_byname[group]
+    return ""
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{getgrnam} function takes a group name as its argument, and if that
+group exists, it is returned. Otherwise, @code{getgrnam} returns the null
+string.
+
+@findex getgrgid
+@example
+@c @group
+@c file eg/lib/groupawk.in
+function getgrgid(gid)
+@{
+    _gr_init()
+    if (gid in _gr_bygid)
+        return _gr_bygid[gid]
+    return ""
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{getgrgid} function is similar; it takes a numeric group-id, and
+looks up the information associated with that group-id.
+
+@findex getgruser
+@example
+@group
+@c file eg/lib/groupawk.in
+function getgruser(user)
+@{
+    _gr_init()
+    if (user in _gr_groupsbyuser)
+        return _gr_groupsbyuser[user]
+    return ""
+@}
+@c endfile
+@end group
+@end example
+
+The @code{getgruser} function does not have a C counterpart. It takes a
+user name, and returns the list of groups that have the user as a member.
+
+@findex getgrent
+@example
+@c @group
+@c file eg/lib/groupawk.in
+function getgrent()
+@{
+    _gr_init()
+    if (++_gr_count in _gr_bycount)
+        return _gr_bycount[_gr_count]
+    return ""
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{getgrent} function steps through the database one entry at a time.
+It uses @code{_gr_count} to track its position in the list.
+
+@findex endgrent
+@example
+@group
+@c file eg/lib/groupawk.in
+function endgrent()
+@{
+    _gr_count = 0
+@}
+@c endfile
+@end group
+@end example
+
+@code{endgrent} resets @code{_gr_count} to zero so that @code{getgrent} can
+start over again.
+
+As with the user database routines, each function calls @code{_gr_init} to
+initialize the arrays.  Doing so only incurs the extra overhead of running
+@code{grcat} if these functions are used (as opposed to moving the body of
+@code{_gr_init} into a @code{BEGIN} rule).
+
+Most of the work is in scanning the database and building the various
+associative arrays.  The functions that the user calls are themselves very
+simple, relying on @code{awk}'s associative arrays to do the work.
+
+The @code{id} program in @ref{Id Program, ,Printing Out User Information},
+uses these functions.
+
+@node Library Names,  , Group Functions, Library Functions
+@section Naming Library Function Global Variables
+
+@cindex namespace issues in @code{awk}
+@cindex documenting @code{awk} programs
+@cindex programs, documenting
+Due to the way the @code{awk} language evolved, variables are either
+@dfn{global} (usable by the entire program), or @dfn{local} (usable just by
+a specific function).  There is no intermediate state analogous to
+@code{static} variables in C.
+
+Library functions often need to have global variables that they can use to
+preserve state information between calls to the function. Examples are
+@code{getopt}'s variable @code{_opti}
+(@pxref{Getopt Function, ,Processing Command Line Options}),
+and the @code{_tm_months} array used by @code{mktime}
+(@pxref{Mktime Function, ,Turning Dates Into Timestamps}).
+Such variables are called @dfn{private}, since the only functions that need to
+use them are the ones in the library.
+
+When writing a library function, you should try to choose names for your
+private variables so that they will not conflict with any variables used by
+either another library function or a user's main program.  For example, a
+name like @samp{i} or @samp{j} is not a good choice, since user programs
+often use variable names like these for their own purposes.
+
+The example programs shown in this chapter all start the names of their
+private variables with an underscore (@samp{_}).  Users generally don't use
+leading underscores in their variable names, so this convention immediately
+decreases the chances that the variable name will be accidentally shared
+with the user's program.
+
+In addition, several of the library functions use a prefix that helps
+indicate what function or set of functions uses the variables. Examples are
+@code{_tm_months} in @code{mktime}
+(@pxref{Mktime Function, ,Turning Dates Into Timestamps}), and
+@code{_pw_byname} in the user database routines
+(@pxref{Passwd Functions, ,Reading the User Database}).
+This convention is recommended, since it even further decreases the chance
+of inadvertent conflict among variable names.
+Note that this convention can be used equally well for variable names
+and for private function names.
+
+While I could have re-written all the library routines to use this
+convention, I did not do so, in order to show how my own @code{awk}
+programming style has evolved, and to provide some basis for this
+discussion.
+
+As a final note on variable naming, if a function makes global variables
+available for use by a main program, it is a good convention to start that
+variable's name with a capital letter.
+Examples are @code{getopt}'s @code{Opterr} and @code{Optind} variables
+(@pxref{Getopt Function, ,Processing Command Line Options}).
+The leading capital letter indicates that it is global, while the fact that
+the variable name is not all capital letters indicates that the variable is
+not one of @code{awk}'s built-in variables, like @code{FS}.
+
+It is also important that @emph{all} variables in library functions
+that do not need to save state are in fact declared local.  If this is
+not done, the variable could accidentally be used in the user's program,
+leading to bugs that are very difficult to track down.
+
+@example
+function lib_func(x, y,    l1, l2)
+@{
+    @dots{}
+    @var{use variable} some_var  # some_var could be local
+    @dots{}                   # but is not by oversight
+@}
+@end example
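+
+As a minimal sketch, the naming conventions above might be combined as
+follows.  The @code{ex_lookup} function and the names @code{_ex_cache} and
+@code{Ex_misses} are hypothetical; they are not part of the library in
+this chapter.
+
+@example
+# _ex_cache is private to the hypothetical "ex" library;
+# Ex_misses is global, for use by the main program
+function ex_lookup(key,    val)    # val is local
+@{
+    if (! (key in _ex_cache)) @{
+        Ex_misses++
+        return ""
+    @}
+    val = _ex_cache[key]
+    return val
+@}
+@end example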
+
+@cindex Tcl
+A different convention, common in the Tcl community, is to use a single
+associative array to hold the values needed by the library function(s), or
+``package.''  This significantly decreases the number of actual global names
+in use.  For example, the functions described in
+@ref{Passwd Functions, , Reading the User Database},
+might have used @code{@w{PW_data["inited"]}}, @code{@w{PW_data["awklib"]}},
+@code{@w{PW_data["total"]}} and @code{@w{PW_data["count"]}}, instead of
+@code{@w{_pw_inited}}, @code{@w{_pw_awklib}}, @code{@w{_pw_total}},
+and @code{@w{_pw_count}}.
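+
+A minimal sketch of this style follows, using a hypothetical stack package
+whose entire state lives in the single array @code{Stk_data}.  The function
+and array names are assumptions made for illustration only.
+
+@example
+function stk_push(val)
+@{
+    Stk_data["top"]++
+    Stk_data["val", Stk_data["top"]] = val
+@}
+
+function stk_pop(    top, val)
+@{
+    top = Stk_data["top"]
+    if (top < 1)
+        return ""    # stack is empty
+    val = Stk_data["val", top]
+    delete Stk_data["val", top]
+    Stk_data["top"] = top - 1
+    return val
+@}
+@end example
+
+@noindent
+Only one global variable name, @code{Stk_data}, is visible to the user's
+program.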
+
+The conventions presented in this section are exactly that: conventions.  You
+are not required to write your programs this way; we merely recommend that
+you do so.
+
+@node Sample Programs, Language History, Library Functions, Top
+@chapter Practical @code{awk} Programs
+
+This chapter presents a potpourri of @code{awk} programs for your reading
+enjoyment.
+@iftex
+There are two sections.  The first presents @code{awk}
+versions of several common POSIX utilities.
+The second is a grab-bag of interesting programs.
+@end iftex
+
+Many of these programs use the library functions presented in
+@ref{Library Functions, ,A Library of @code{awk} Functions}.
+
+@menu
+* Clones::                    Clones of common utilities.
+* Miscellaneous Programs::    Some interesting @code{awk} programs.
+@end menu
+
+@node Clones, Miscellaneous Programs, Sample Programs, Sample Programs
+@section Re-inventing Wheels for Fun and Profit
+
+This section presents a number of POSIX utilities that are implemented in
+@code{awk}.  Re-inventing these programs in @code{awk} is often enjoyable,
+since the algorithms can be very clearly expressed, and usually the code is
+very concise and simple.  This is true because @code{awk} does so much for you.
+
+Note that these programs are not necessarily intended to
+replace the installed versions on your system.  Instead, their
+purpose is to illustrate @code{awk} language programming for ``real world''
+tasks.
+
+The programs are presented in alphabetical order.
+
+@menu
+* Cut Program::             The @code{cut} utility.
+* Egrep Program::           The @code{egrep} utility.
+* Id Program::              The @code{id} utility.
+* Split Program::           The @code{split} utility.
+* Tee Program::             The @code{tee} utility.
+* Uniq Program::            The @code{uniq} utility.
+* Wc Program::              The @code{wc} utility.
+@end menu
+
+@node Cut Program, Egrep Program, Clones, Clones
+@subsection Cutting Out Fields and Columns
+
+@cindex @code{cut} utility
+The @code{cut} utility selects, or ``cuts,'' either characters or fields
+from its standard
+input and sends them to its standard output.  @code{cut} can cut out either
+a list of characters, or a list of fields.  By default, fields are separated
+by tabs, but you may supply a command line option to change the field
+@dfn{delimiter}, i.e.@: the field separator character. @code{cut}'s definition
+of fields is less general than @code{awk}'s.
+
+A common use of @code{cut} might be to pull out just the login name of
+logged-on users from the output of @code{who}.  For example, the following
+pipeline generates a sorted, unique list of the logged-on users:
+
+@example
+who | cut -c1-8 | sort | uniq
+@end example
+
+The options for @code{cut} are:
+
+@table @code
+@item -c @var{list}
+Use @var{list} as the list of characters to cut out.  Items within the list
+may be separated by commas, and ranges of characters can be separated with
+dashes.  The list @samp{1-8,15,22-35} specifies characters one through
+eight, 15, and 22 through 35.
+
+@item -f @var{list}
+Use @var{list} as the list of fields to cut out.
+
+@item -d @var{delim}
+Use @var{delim} as the field separator character instead of the tab
+character.
+
+@item -s
+Suppress printing of lines that do not contain the field delimiter.
+@end table
+
+The @code{awk} implementation of @code{cut} uses the @code{getopt} library
+function (@pxref{Getopt Function, ,Processing Command Line Options}),
+and the @code{join} library function
+(@pxref{Join Function, ,Merging an Array Into a String}).
+
+The program begins with a comment describing the options and a @code{usage}
+function which prints out a usage message and exits.  @code{usage} is called
+if invalid arguments are supplied.
+
+@findex cut.awk
+@example
+@c @group
+@c file eg/prog/cut.awk
+# cut.awk --- implement cut in awk
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+# Options:
+#    -f list        Cut fields
+#    -d c           Field delimiter character
+#    -c list        Cut characters
+#
+#    -s        Suppress lines without the delimiter character
+
+function usage(    e1, e2)
+@{
+    e1 = "usage: cut [-f list] [-d c] [-s] [files...]"
+    e2 = "usage: cut [-c list] [files...]"
+    print e1 > "/dev/stderr"
+    print e2 > "/dev/stderr"
+    exit 1
+@}
+@c endfile
+@c @end group
+@end example
+
+@noindent
+The variables @code{e1} and @code{e2} are used so that the function
+fits nicely on the
+@iftex
+page.
+@end iftex
+@ifinfo
+screen.
+@end ifinfo
+
+Next comes a @code{BEGIN} rule that parses the command line options.
+It sets @code{FS} to a single tab character, since that is @code{cut}'s
+default field separator.  The output field separator is also set to be the
+same as the input field separator.  Then @code{getopt} is used to step
+through the command line options.  One or the other of the variables
+@code{by_fields} or @code{by_chars} is set to true, to indicate that
+processing should be done by fields or by characters respectively.
+When cutting by characters, the output field separator is set to the null
+string.
+
+@example
+@c @group
+@c file eg/prog/cut.awk
+BEGIN    \
+@{
+    FS = "\t"    # default
+    OFS = FS
+    while ((c = getopt(ARGC, ARGV, "sf:c:d:")) != -1) @{
+        if (c == "f") @{
+            by_fields = 1
+            fieldlist = Optarg
+        @} else if (c == "c") @{
+            by_chars = 1
+            fieldlist = Optarg
+            OFS = ""
+        @} else if (c == "d") @{
+            if (length(Optarg) > 1) @{
+                printf("Using first character of %s" \
+                " for delimiter\n", Optarg) > "/dev/stderr"
+                Optarg = substr(Optarg, 1, 1)
+            @}
+            FS = Optarg
+            OFS = FS
+            if (FS == " ")    # defeat awk semantics
+                FS = "[ ]"
+        @} else if (c == "s")
+            suppress++
+        else
+            usage()
+    @}
+
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+@c endfile
+@c @end group
+@end example
+
+Special care is taken when the field delimiter is a space.  Using
+@code{@w{" "}} (a single space) for the value of @code{FS} is
+incorrect: @code{awk} would
+separate fields with runs of spaces and/or tabs, and we want them to be
+separated with individual spaces.  Also, note that after @code{getopt} is
+through, we have to clear out all the elements of @code{ARGV} from one up to,
+but not including, @code{Optind}, so that @code{awk} will not try to process
+the command line options as file names.
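+
+The difference is easy to see from the shell.  In this small illustration,
+the input has two spaces between @samp{a} and @samp{b}:
+
+@example
+$ echo 'a  b' | awk 'BEGIN @{ FS = " " @} @{ print NF @}'
+@print{} 2
+$ echo 'a  b' | awk 'BEGIN @{ FS = "[ ]" @} @{ print NF @}'
+@print{} 3
+@end example
+
+@noindent
+With @code{"[ ]"}, each space is a separator, so there is an empty field
+between the two spaces.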
+
+After dealing with the command line options, the program verifies that the
+options make sense.  Only one or the other of @samp{-c} and @samp{-f} should
+be used, and both require a field list.  Then either @code{set_fieldlist} or
+@code{set_charlist} is called to pull apart the list of fields or
+characters.
+
+@example
+@c @group
+@c file eg/prog/cut.awk
+    if (by_fields && by_chars)
+        usage()
+
+    if (by_fields == 0 && by_chars == 0)
+        by_fields = 1    # default
+
+    if (fieldlist == "") @{
+        print "cut: needs list for -c or -f" > "/dev/stderr"
+        exit 1
+    @}
+
+@group
+    if (by_fields)
+        set_fieldlist()
+    else
+        set_charlist()
+@}
+@c endfile
+@end group
+@end example
+
+Here is @code{set_fieldlist}.  It first splits the field list apart
+at the commas, into an array.  Then, for each element of the array, it
+looks to see if it is actually a range, and if so splits it apart. The range
+is verified to make sure the first number is smaller than the second.
+Each number in the list is added to the @code{flist} array, which simply
+lists the fields that will be printed.
+Normal field splitting is used; the program lets @code{awk}
+handle the job of splitting each record into fields.
+
+@example
+@c @group
+@c file eg/prog/cut.awk
+function set_fieldlist(        n, m, i, j, k, f, g)
+@{
+    n = split(fieldlist, f, ",")
+    j = 1    # index in flist
+    for (i = 1; i <= n; i++) @{
+        if (index(f[i], "-") != 0) @{ # a range
+            m = split(f[i], g, "-")
+            if (m != 2 || g[1] >= g[2]) @{
+                printf("bad field list: %s\n",
+                                  f[i]) > "/dev/stderr"
+                exit 1
+            @}
+            for (k = g[1]; k <= g[2]; k++)
+                flist[j++] = k
+        @} else
+            flist[j++] = f[i]
+    @}
+    nfields = j - 1
+@}
+@c endfile
+@c @end group
+@end example
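+
+The range handling can be tried on its own.  The following stand-alone
+sketch is not part of @file{cut.awk}; it expands a fixed list in the same
+way and prints the result:
+
+@example
+BEGIN @{
+    n = split("1-3,7", f, ",")
+    for (i = 1; i <= n; i++) @{
+        if (split(f[i], g, "-") == 2) @{    # a range
+            for (k = g[1]; k <= g[2]; k++)
+                list = list " " k
+        @} else
+            list = list " " f[i]
+    @}
+    print "fields:" list
+@}
+@end example
+
+@noindent
+This prints @samp{fields: 1 2 3 7}.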
+
+The @code{set_charlist} function is more complicated than @code{set_fieldlist}.
+The idea here is to use @code{gawk}'s @code{FIELDWIDTHS} variable
+(@pxref{Constant Size, ,Reading Fixed-width Data}),
+which describes constant width input.  When using a character list, that is
+exactly what we have.
+
+Setting up @code{FIELDWIDTHS} is more complicated than simply listing the
+fields that need to be printed.  We have to keep track of the fields to be
+printed, and also the intervening characters that have to be skipped.
+For example, suppose you wanted characters one through eight, 15, and
+22 through 35.  You would use @samp{-c 1-8,15,22-35}.  The necessary value
+for @code{FIELDWIDTHS} would be @code{@w{"8 6 1 6 14"}}.  This gives us five
+fields, and what should be printed are @code{$1}, @code{$3}, and @code{$5}.
+The intermediate fields are ``filler,'' stuff in between the desired data.
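+
+As a quick check of this layout, the value can be tried directly from the
+shell with a made-up 36-character input line (this requires @code{gawk}):
+
+@example
+$ echo 'abcdefghijklmnopqrstuvwxyz0123456789' |
+> gawk 'BEGIN @{ FIELDWIDTHS = "8 6 1 6 14" @}
+>             @{ print $1, $3, $5 @}'
+@print{} abcdefgh o vwxyz012345678
+@end example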
+
+@code{flist} lists the fields to be printed, and @code{t} tracks the
+complete field list, including filler fields.
+
+@example
+@c @group
+@c file eg/prog/cut.awk
+function set_charlist(    field, i, j, f, g, n, m, t,
+                          filler, last, len)
+@{
+    field = 1   # count total fields
+    n = split(fieldlist, f, ",")
+    j = 1       # index in flist
+    for (i = 1; i <= n; i++) @{
+        if (index(f[i], "-") != 0) @{ # range
+            m = split(f[i], g, "-")
+            if (m != 2 || g[1] >= g[2]) @{
+                printf("bad character list: %s\n",
+                               f[i]) > "/dev/stderr"
+                exit 1
+            @}
+            len = g[2] - g[1] + 1
+            if (g[1] > 1)  # compute length of filler
+                filler = g[1] - last - 1
+            else
+                filler = 0
+            if (filler)
+                t[field++] = filler
+            t[field++] = len  # length of field
+            last = g[2]
+            flist[j++] = field - 1
+        @} else @{
+            if (f[i] > 1)
+                filler = f[i] - last - 1
+            else
+                filler = 0
+            if (filler)
+                t[field++] = filler
+            t[field++] = 1
+            last = f[i]
+            flist[j++] = field - 1
+        @}
+    @}
+@group
+    FIELDWIDTHS = join(t, 1, field - 1)
+    nfields = j - 1
+@}
+@end group
+@c endfile
+@end example
+
+Here is the rule that actually processes the data.  If the @samp{-s} option
+was given, then @code{suppress} will be true.  The first @code{if} statement
+makes sure that the input record does have the field separator.  If
+@code{cut} is processing fields, @code{suppress} is true, and the field
+separator character is not in the record, then the record is skipped.
+
+If the record is valid, then at this point, @code{gawk} has split the data
+into fields, either using the character in @code{FS} or using fixed-length
+fields and @code{FIELDWIDTHS}.  The loop goes through the list of fields
+that should be printed.  If the corresponding field has data in it, it is
+printed.  If the next field also has data, then the separator character is
+written out in between the fields.
+
+@c 2e: Could use `index($0, FS) != 0' instead of `$0 !~ FS', below
+
+@example
+@c @group
+@c file eg/prog/cut.awk
+@{
+    if (by_fields && suppress && $0 !~ FS)
+        next
+
+    for (i = 1; i <= nfields; i++) @{
+        if ($flist[i] != "") @{
+            printf "%s", $flist[i]
+            if (i < nfields && $flist[i+1] != "")
+                printf "%s", OFS
+        @}
+    @}
+    print ""
+@}
+@c endfile
+@c @end group
+@end example
+
+This version of @code{cut} relies on @code{gawk}'s @code{FIELDWIDTHS}
+variable to do the character-based cutting.  While it would be possible in
+other @code{awk} implementations to use @code{substr}
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}),
+it would also be extremely painful to do so.
+The @code{FIELDWIDTHS} variable supplies an elegant solution to the problem
+of picking the input line apart by characters.
+
+@node Egrep Program, Id Program, Cut Program, Clones
+@subsection Searching for Regular Expressions in Files
+
+@cindex @code{egrep} utility
+The @code{egrep} utility searches files for patterns.  It uses regular
+expressions that are almost identical to those available in @code{awk}
+(@pxref{Regexp Constants, ,Regular Expression Constants}).  It is used this way:
+
+@example
+egrep @r{[} @var{options} @r{]} '@var{pattern}' @var{files} @dots{}
+@end example
+
+The @var{pattern} is a regexp.
+In typical usage, the regexp is quoted to prevent the shell from expanding
+any of the special characters as file name wildcards.
+Normally, @code{egrep} prints the
+lines that matched.  If multiple file names are provided on the command
+line, each output line is preceded by the name of the file and a colon.
+
+The options are:
+
+@table @code
+@item -c
+Print out a count of the lines that matched the pattern, instead of the
+lines themselves.
+
+@item -s
+Be silent.  No output is produced, and the exit value indicates whether
+or not the pattern was matched.
+
+@item -v
+Invert the sense of the test. @code{egrep} prints the lines that do
+@emph{not} match the pattern, and exits successfully if the pattern was not
+matched.
+
+@item -i
+Ignore case distinctions in both the pattern and the input data.
+
+@item -l
+Only print the names of the files that matched, not the lines that matched.
+
+@item -e @var{pattern}
+Use @var{pattern} as the regexp to match.  The purpose of the @samp{-e}
+option is to allow patterns that start with a @samp{-}.
+@end table
+
+This version uses the @code{getopt} library function
+(@pxref{Getopt Function, ,Processing Command Line Options}),
+and the file transition library program
+(@pxref{Filetrans Function, ,Noting Data File Boundaries}).
+
+The program begins with a descriptive comment, and then a @code{BEGIN} rule
+that processes the command line arguments with @code{getopt}.  The @samp{-i}
+(ignore case) option is particularly easy with @code{gawk}; we just use the
+@code{IGNORECASE} built-in variable
+(@pxref{Built-in Variables}).
+
+@findex egrep.awk
+@example
+@c @group
+@c file eg/prog/egrep.awk
+# egrep.awk --- simulate egrep in awk
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+# Options:
+#    -c    count of lines
+#    -s    silent - use exit value
+#    -v    invert test, success if no match
+#    -i    ignore case
+#    -l    print filenames only
+#    -e    argument is pattern
+
+BEGIN @{
+    while ((c = getopt(ARGC, ARGV, "ce:svil")) != -1) @{
+        if (c == "c")
+            count_only++
+        else if (c == "s")
+            no_print++
+        else if (c == "v")
+            invert++
+        else if (c == "i")
+            IGNORECASE = 1
+        else if (c == "l")
+            filenames_only++
+        else if (c == "e")
+            pattern = Optarg
+        else
+            usage()
+    @}
+@c endfile
+@c @end group
+@end example
+
+Next comes the code that handles the @code{egrep} specific behavior. If no
+pattern was supplied with @samp{-e}, the first non-option on the command
+line is used.  The @code{awk} command line arguments up to @code{ARGV[Optind]}
+are cleared, so that @code{awk} won't try to process them as files.  If no
+files were specified, the standard input is used, and if multiple files were
+specified, we make sure to note this so that the file names can precede the
+matched lines in the output.
+
+The last two lines are commented out, since they are not needed in
+@code{gawk}.  They should be uncommented if you have to use another version
+of @code{awk}.
+
+@example
+@c @group
+@c file eg/prog/egrep.awk
+    if (pattern == "")
+        pattern = ARGV[Optind++]
+
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+    if (Optind >= ARGC) @{
+        ARGV[1] = "-"
+        ARGC = 2
+    @} else if (ARGC - Optind > 1)
+        do_filenames++
+
+#    if (IGNORECASE)
+#        pattern = tolower(pattern)
+@}
+@c endfile
+@c @end group
+@end example
+
+The next set of lines should be uncommented if you are not using
+@code{gawk}.  This rule translates all the characters in the input line
+into lower-case if the @samp{-i} option was specified.  The rule is
+commented out since it is not necessary with @code{gawk}.
+@c bug: if a match happens, we output the translated line, not the original
+
+@example
+@c @group
+@c file eg/prog/egrep.awk
+#@{
+#    if (IGNORECASE)
+#        $0 = tolower($0)
+#@}
+@c endfile
+@c @end group
+@end example
+
+The @code{beginfile} function is called by the rule in @file{ftrans.awk}
+when each new file is processed.  In this case, it is very simple; all it
+does is initialize a variable @code{fcount} to zero. @code{fcount} tracks
+how many lines in the current file matched the pattern.
+
+@example
+@c @group
+@c file eg/prog/egrep.awk
+function beginfile(junk)
+@{
+    fcount = 0
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{endfile} function is called after each file has been processed.
+It produces output only when the user wants a count of the number of lines
+that matched.  @code{no_print} will be true only if the exit status is desired.
+@code{count_only} will be true if line counts are desired.  @code{egrep}
+will therefore only print line counts if printing and counting are enabled.
+The output format must be adjusted depending upon the number of files to be
+processed.  Finally, @code{fcount} is added to @code{total}, so that we
+know how many lines altogether matched the pattern.
+
+@example
+@c @group
+@c file eg/prog/egrep.awk
+function endfile(file)
+@{
+    if (! no_print && count_only)
+        if (do_filenames)
+            print file ":" fcount
+        else
+            print fcount
+
+    total += fcount
+@}
+@c endfile
+@c @end group
+@end example
+
+This rule does most of the work of matching lines. The variable
+@code{matches} will be true if the line matched the pattern. If the user
+wants lines that did not match, the sense of @code{matches} is inverted
+using the @samp{!} operator. @code{fcount} is incremented with the value of
+@code{matches}, which will be either one or zero, depending upon a
+successful or unsuccessful match.  If the line did not match, the
+@code{next} statement just moves on to the next record.
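+
+That a match expression yields one or zero can be seen with a short
+illustration (the input and regexp here are arbitrary):
+
+@example
+$ echo foo | awk '@{ m = ($0 ~ /o+/); print m, ! m @}'
+@print{} 1 0
+@end example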
+
+There are several optimizations for performance in the following few lines
+of code. If the user only wants exit status (@code{no_print} is true), and
+we don't have to count lines, then it is enough to know that one line in
+this file matched, and we can skip on to the next file with @code{nextfile}.
+Along similar lines, if we are only printing file names, and we
+don't need to count lines, we can print the file name, and then skip to the
+next file with @code{nextfile}.
+
+Finally, each line is printed, with a leading filename and colon if
+necessary.
+
+@ignore
+2e: note, probably better to recode the last few lines as
+    if (! count_only) @{
+        if (no_print)
+            nextfile
+
+        if (filenames_only) @{
+            print FILENAME
+            nextfile
+        @}
+
+        if (do_filenames)
+            print FILENAME ":" $0
+        else
+            print
+    @}
+@end ignore
+
+@example
+@c @group
+@c file eg/prog/egrep.awk
+@{
+    matches = ($0 ~ pattern)
+    if (invert)
+        matches = ! matches
+
+    fcount += matches    # 1 or 0
+
+    if (! matches)
+        next
+
+    if (no_print && ! count_only)
+        nextfile
+
+    if (filenames_only && ! count_only) @{
+        print FILENAME
+        nextfile
+    @}
+
+    if (do_filenames && ! count_only)
+        print FILENAME ":" $0
+    else if (! count_only)
+        print
+@}
+@c endfile
+@c @end group
+@end example
+
+@c @strong{Exercise}: rearrange the code inside @samp{if (! count_only)}.
+
+The @code{END} rule takes care of producing the correct exit status. If
+there were no matches, the exit status is one; otherwise it is zero.
+
+@example
+@c @group
+@c file eg/prog/egrep.awk
+END    \
+@{
+    if (total == 0)
+        exit 1
+    exit 0
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{usage} function prints a usage message in case of invalid options
+and then exits.
+
+@example
+@c @group
+@c file eg/prog/egrep.awk
+function usage(    e)
+@{
+    e = "Usage: egrep [-csvil] [-e pat] [files ...]"
+    print e > "/dev/stderr"
+    exit 1
+@}
+@c endfile
+@c @end group
+@end example
+
+The variable @code{e} is used so that the function fits nicely
+on the printed page.
+
+@node Id Program, Split Program, Egrep Program, Clones
+@subsection Printing Out User Information
+
+@cindex @code{id} utility
+The @code{id} utility lists a user's real and effective user-id numbers,
+real and effective group-id numbers, and the user's group set, if any.
+@code{id} will only print the effective user-id and group-id if they are
+different from the real ones.  If possible, @code{id} will also supply the
+corresponding user and group names.  The output might look like this:
+
+@example
+$ id
+@print{} uid=2076(arnold) gid=10(staff) groups=10(staff),4(tty)
+@end example
+
+This information is exactly what is provided by @code{gawk}'s
+@file{/dev/user} special file (@pxref{Special Files, ,Special File Names in @code{gawk}}).
+However, the @code{id} utility provides a more palatable output than just a
+string of numbers.
+
+Here is a simple version of @code{id} written in @code{awk}.
+It uses the user database library functions
+(@pxref{Passwd Functions, ,Reading the User Database}),
+and the group database library functions
+(@pxref{Group Functions, ,Reading the Group Database}).
+
+The program is fairly straightforward.  All the work is done in the
+@code{BEGIN} rule.  The user and group id numbers are obtained from
+@file{/dev/user}.  If there is no support for @file{/dev/user}, the program
+gives up.
+
+The code is repetitive.  The entry in the user database for the real user-id
+number is split into parts at the @samp{:}. The name is the first field.
+Similar code is used for the effective user-id number, and the group
+numbers.
+
+@findex id.awk
+@example
+@c @group
+@c file eg/prog/id.awk
+# id.awk --- implement id in awk
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+# output is:
+# uid=12(foo) euid=34(bar) gid=3(baz) \
+#             egid=5(blat) groups=9(nine),2(two),1(one)
+
+BEGIN    \
+@{
+    if ((getline < "/dev/user") < 0) @{
+        err = "id: no /dev/user support - cannot run"
+        print err > "/dev/stderr"
+        exit 1
+    @}
+    close("/dev/user")
+
+    uid = $1
+    euid = $2
+    gid = $3
+    egid = $4
+
+    printf("uid=%d", uid)
+    pw = getpwuid(uid)
+@group
+    if (pw != "") @{
+        split(pw, a, ":")
+        printf("(%s)", a[1])
+    @}
+@end group
+
+    if (euid != uid) @{
+        printf(" euid=%d", euid)
+        pw = getpwuid(euid)
+        if (pw != "") @{
+            split(pw, a, ":")
+            printf("(%s)", a[1])
+        @}
+    @}
+
+    printf(" gid=%d", gid)
+    pw = getgrgid(gid)
+    if (pw != "") @{
+        split(pw, a, ":")
+        printf("(%s)", a[1])
+    @}
+
+    if (egid != gid) @{
+        printf(" egid=%d", egid)
+        pw = getgrgid(egid)
+        if (pw != "") @{
+            split(pw, a, ":")
+            printf("(%s)", a[1])
+        @}
+    @}
+
+    if (NF > 4) @{
+        printf(" groups=")
+        for (i = 5; i <= NF; i++) @{
+            printf("%d", $i)
+            pw = getgrgid($i)
+            if (pw != "") @{
+                split(pw, a, ":")
+                printf("(%s)", a[1])
+            @}
+            if (i < NF)
+                printf(",")
+        @}
+    @}
+    print ""
+@}
+@c endfile
+@c @end group
+@end example
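+
+The repeated step, splitting a database entry at the colons and using the
+first piece as the name, can be sketched on its own.  The entry below is
+made up for illustration; the real program obtains it from @code{getpwuid}.
+
+@example
+BEGIN @{
+    pw = "arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh"
+    split(pw, a, ":")
+    printf("uid=%d(%s)\n", a[3], a[1])
+@}
+@end example
+
+@noindent
+This prints @samp{uid=2076(arnold)}.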
+
+@c exercise!!!
+@ignore
+The POSIX version of @code{id} takes arguments that control which
+information is printed.  Modify this version to accept the same
+arguments and perform in the same way.
+@end ignore
+
+@node Split Program, Tee Program, Id Program, Clones
+@subsection Splitting a Large File Into Pieces
+
+@cindex @code{split} utility
+The @code{split} program splits large text files into smaller pieces. By default,
+the output files are named @file{xaa}, @file{xab}, and so on. Each file has
+1000 lines in it, with the likely exception of the last file. To change the
+number of lines in each file, you supply a number on the command line
+preceded with a minus, e.g., @samp{-500} for files with 500 lines in them
+instead of 1000.  To change the name of the output files to something like
+@file{myfileaa}, @file{myfileab}, and so on, you supply an additional
+argument that specifies the filename.
+
+Here is a version of @code{split} in @code{awk}. It uses the @code{ord} and
+@code{chr} functions presented in
+@ref{Ordinal Functions, ,Translating Between Characters and Numbers}.
+
+The program first sets its defaults, and then tests to make sure there are
+not too many arguments.  It then looks at each argument in turn.  The
+first argument could be a minus followed by a number. If it is, this happens
+to look like a negative number, so it is made positive, and that is the
+count of lines.  The data file name is skipped over, and the final argument
+is used as the prefix for the output file names.
+
+@findex split.awk
+@example
+@c @group
+@c file eg/prog/split.awk
+# split.awk --- do split in awk
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+# usage: split [-num] [file] [outname]
+
+BEGIN    \
+@{
+    outfile = "x"    # default
+    count = 1000
+    if (ARGC > 4)
+        usage()
+
+    i = 1
+    if (ARGV[i] ~ /^-[0-9]+$/) @{
+        count = -ARGV[i]
+        ARGV[i] = ""
+        i++
+    @}
+    # test argv in case reading from stdin instead of file
+    if (i in ARGV)
+        i++    # skip data file name
+    if (i in ARGV) @{
+        outfile = ARGV[i]
+        ARGV[i] = ""
+    @}
+
+    s1 = s2 = "a"
+    out = (outfile s1 s2)
+@}
+@c endfile
+@c @end group
+@end example
+
+The next rule does most of the work. @code{tcount} (temporary count) tracks
+how many lines have been printed to the output file so far. If it is greater
+than @code{count}, it is time to close the current file and start a new one.
+@code{s1} and @code{s2} track the current suffixes for the file name. If
+they are both @samp{z}, the file is just too big.  Otherwise, if @code{s2}
+is @samp{z}, @code{s1} moves to the next letter in the alphabet and
+@code{s2} starts over again at @samp{a}.  In all other cases, only
+@code{s2} moves to the next letter.
+
+@example
+@c @group
+@c file eg/prog/split.awk
+@{
+    if (++tcount > count) @{
+        close(out)
+        if (s2 == "z") @{
+            if (s1 == "z") @{
+                printf("split: %s is too large to split\n", \
+                       FILENAME) > "/dev/stderr"
+                exit 1
+            @}
+            s1 = chr(ord(s1) + 1)
+            s2 = "a"
+        @} else
+            s2 = chr(ord(s2) + 1)
+        out = (outfile s1 s2)
+        tcount = 1
+    @}
+    print > out
+@}
+@c endfile
+@c @end group
+@end example
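+
+The suffix arithmetic can also be written without @code{ord} and
+@code{chr}.  The following sketch uses a hypothetical helper,
+@code{next_suffix}, that looks letters up in a string instead; it is not
+part of @file{split.awk}.
+
+@example
+# next_suffix --- return the suffix after s, or "" if s is "zz"
+function next_suffix(s,    alpha, p1, p2)
+@{
+    alpha = "abcdefghijklmnopqrstuvwxyz"
+    p1 = index(alpha, substr(s, 1, 1))
+    p2 = index(alpha, substr(s, 2, 1))
+    if (p2 < 26)
+        p2++            # just advance the second letter
+    else @{
+        p1++            # carry into the first letter
+        p2 = 1
+    @}
+    if (p1 > 26)
+        return ""       # ran out of suffixes
+    return substr(alpha, p1, 1) substr(alpha, p2, 1)
+@}
+
+BEGIN @{ print next_suffix("aa"), next_suffix("az") @}
+@end example
+
+@noindent
+The @code{BEGIN} rule prints @samp{ab ba}.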
+
+The @code{usage} function simply prints an error message and exits.
+
+@example
+@c @group
+@c file eg/prog/split.awk
+function usage(   e)
+@{
+    e = "usage: split [-num] [file] [outname]"
+    print e > "/dev/stderr"
+    exit 1
+@}
+@c endfile
+@c @end group
+@end example
+
+@noindent
+The variable @code{e} is used so that the function
+fits nicely on the
+@iftex
+page.
+@end iftex
+@ifinfo
+screen.
+@end ifinfo
+
+This program is a bit sloppy; it relies on @code{awk} to close the last file
+for it automatically, instead of doing it in an @code{END} rule.
+
+@node Tee Program, Uniq Program, Split Program, Clones
+@subsection Duplicating Output Into Multiple Files
+
+@cindex @code{tee} utility
+The @code{tee} program is known as a ``pipe fitting.''  @code{tee} copies
+its standard input to its standard output, and also duplicates it to the
+files named on the command line.  Its usage is:
+
+@example
+tee @r{[}-a@r{]} file @dots{}
+@end example
+
+The @samp{-a} option tells @code{tee} to append to the named files, instead of
+truncating them and starting over.
+
+The @code{BEGIN} rule first makes a copy of all the command line arguments,
+into an array named @code{copy}.
+@code{ARGV[0]} is not copied, since it is not needed.
+@code{tee} cannot use @code{ARGV} directly, since @code{awk} will attempt to
+process each file named in @code{ARGV} as input data.
+
+If the first argument is @samp{-a}, then the flag variable
+@code{append} is set to true, and both @code{ARGV[1]} and
+@code{copy[1]} are deleted. If @code{ARGC} is less than two, then no file
+names were supplied, and @code{tee} prints a usage message and exits.
+Finally, @code{awk} is forced to read the standard input by setting
+@code{ARGV[1]} to @code{"-"}, and @code{ARGC} to two.
+
+@c 2e: the `ARGC--' in the `if (ARGV[1] == "-a")' isn't needed.
+
+@findex tee.awk
+@example
+@c @group
+@c file eg/prog/tee.awk
+# tee.awk --- tee in awk
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+# Revised December 1995
+
+BEGIN    \
+@{
+    for (i = 1; i < ARGC; i++)
+        copy[i] = ARGV[i]
+
+    if (ARGV[1] == "-a") @{
+        append = 1
+        delete ARGV[1]
+        delete copy[1]
+        ARGC--
+    @}
+    if (ARGC < 2) @{
+        print "usage: tee [-a] file ..." > "/dev/stderr"
+        exit 1
+    @}
+    ARGV[1] = "-"
+    ARGC = 2
+@}
+@c endfile
+@c @end group
+@end example
+
+The single rule does all the work.  Since there is no pattern, it is
+executed for each line of input.  The body of the rule simply prints the
+line into each file on the command line, and then to the standard output.
+
+@example
+@group
+@c file eg/prog/tee.awk
+@{
+    # moving the if outside the loop makes it run faster
+    if (append)
+        for (i in copy)
+            print >> copy[i]
+    else
+        for (i in copy)
+            print > copy[i]
+    print
+@}
+@c endfile
+@end group
+@end example
+
+It would have been possible to code the loop this way:
+
+@example
+for (i in copy)
+    if (append)
+        print >> copy[i]
+    else
+        print > copy[i]
+@end example
+
+@noindent
+This is more concise, but it is also less efficient.  The @samp{if} is
+tested for each record and for each output file.  By duplicating the loop
+body, the @samp{if} is only tested once for each input record.  If there are
+@var{N} input records and @var{M} output files, the first method only
+executes @var{N} @samp{if} statements, while the second would execute
+@var{N}@code{*}@var{M} @samp{if} statements.
+
+Finally, the @code{END} rule cleans up, by closing all the output files.
+
+@example
+@c @group
+@c file eg/prog/tee.awk
+END    \
+@{
+    for (i in copy)
+        close(copy[i])
+@}
+@c endfile
+@c @end group
+@end example
+
+@node Uniq Program, Wc Program, Tee Program, Clones
+@subsection Printing Non-duplicated Lines of Text
+
+@cindex @code{uniq} utility
+The @code{uniq} utility reads sorted lines of data on its standard input,
+and (by default) removes duplicate lines.  In other words, only unique lines
+are printed, hence the name.  @code{uniq} has a number of options. The usage is:
+
+@example
+uniq @r{[}-udc @r{[}-@var{n}@r{]]} @r{[}+@var{n}@r{]} @r{[} @var{input file} @r{[} @var{output file} @r{]]}
+@end example
+
+The option meanings are:
+
+@table @code
+@item -d
+Only print repeated lines.
+
+@item -u
+Only print non-repeated lines.
+
+@item -c
+Count lines. This option overrides @samp{-d} and @samp{-u}.  Both repeated
+and non-repeated lines are counted.
+
+@item -@var{n}
+Skip @var{n} fields before comparing lines.  The definition of fields is the
+same as @code{awk}'s default: non-whitespace characters separated by runs of
+spaces and/or tabs.
+
+@item +@var{n}
+Skip @var{n} characters before comparing lines.  Any fields specified with
+@samp{-@var{n}} are skipped first.
+
+@item @var{input file}
+Data is read from the input file named on the command line, instead of from
+the standard input.
+
+@item @var{output file}
+The generated output is sent to the named output file, instead of to the
+standard output.
+@end table
+
+Normally @code{uniq} behaves as if both the @samp{-d} and @samp{-u} options
+had been provided.
+
+Here is an @code{awk} implementation of @code{uniq}. It uses the
+@code{getopt} library function
+(@pxref{Getopt Function, ,Processing Command Line Options}),
+and the @code{join} library function
+(@pxref{Join Function, ,Merging an Array Into a String}).
+
+The program begins with a @code{usage} function and then a brief outline of
+the options and their meanings in a comment.
+
+The @code{BEGIN} rule deals with the command line arguments and options. It
+uses a trick to get @code{getopt} to handle options of the form @samp{-25},
+treating such an option as the option letter @samp{2} with an argument of
+@samp{5}. If indeed two or more digits were supplied (@code{Optarg} looks
+like a number), @code{Optarg} is
+concatenated with the option digit, and then the result is added to zero to make
+it into a number.  If there is only one digit in the option, then
+@code{Optarg} is not needed, and @code{Optind} must be decremented so that
+@code{getopt} will process it next time.  This code is admittedly a bit
+tricky.
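+
+As a minimal sketch of the digit handling just described (not part of
+@file{uniq.awk} itself), the following fragment assumes that @code{getopt}
+has returned @code{"2"} and set @code{Optarg} to @code{"5"}, as it would
+for the option @samp{-25}:
+
+@example
+BEGIN @{
+    c = "2"         # assumed return value from getopt
+    Optarg = "5"    # assumed remainder of the option
+    if (Optarg ~ /^[0-9]+$/)
+        fcount = (c Optarg) + 0    # concatenate, then force numeric
+    print fcount    # prints 25
+@}
+@end example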
+
+If no options were supplied, then the default is taken, to print both
+repeated and non-repeated lines.  The output file, if provided, is assigned
+to @code{outputfile}.  Earlier, @code{outputfile} was initialized to the
+standard output, @file{/dev/stdout}.
+
+@findex uniq.awk
+@example
+@c @group
+@c file eg/prog/uniq.awk
+# uniq.awk --- do uniq in awk
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+function usage(    e)
+@{
+    e = "Usage: uniq [-udc [-n]] [+n] [ in [ out ]]"
+    print e > "/dev/stderr"
+    exit 1
+@}
+
+# -c    count lines. overrides -d and -u
+# -d    only repeated lines
+# -u    only non-repeated lines
+# -n    skip n fields
+# +n    skip n characters, skip fields first
+
+BEGIN    \
+@{
+    count = 1
+    outputfile = "/dev/stdout"
+    opts = "udc0:1:2:3:4:5:6:7:8:9:"
+    while ((c = getopt(ARGC, ARGV, opts)) != -1) @{
+        if (c == "u")
+            non_repeated_only++
+        else if (c == "d")
+            repeated_only++
+        else if (c == "c")
+            do_count++
+        else if (index("0123456789", c) != 0) @{
+            # getopt requires args to options
+            # this messes us up for things like -5
+            if (Optarg ~ /^[0-9]+$/)
+                fcount = (c Optarg) + 0
+            else @{
+                fcount = c + 0
+                Optind--
+            @}
+        @} else
+            usage()
+    @}
+
+    if (ARGV[Optind] ~ /^\+[0-9]+$/) @{
+        charcount = substr(ARGV[Optind], 2) + 0
+        Optind++
+    @}
+
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+
+    if (repeated_only == 0 && non_repeated_only == 0)
+        repeated_only = non_repeated_only = 1
+
+    if (ARGC - Optind == 2) @{
+        outputfile = ARGV[ARGC - 1]
+        ARGV[ARGC - 1] = ""
+    @}
+@}
+@c endfile
+@c @end group
+@end example
+
+The following function, @code{are_equal}, compares the current line,
+@code{$0}, to the
+previous line, @code{last}.  It handles skipping fields and characters.
+
+If no field count and no character count were specified, @code{are_equal}
+simply returns one or zero depending upon the result of a simple string
+comparison of @code{last} and @code{$0}.  Otherwise, things get more
+complicated.
+
+If fields have to be skipped, each line is broken into an array using
+@code{split}
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}),
+and then the desired fields are joined back into a line using @code{join}.
+The joined lines are stored in @code{clast} and @code{cline}.
+If no fields are skipped, @code{clast} and @code{cline} are set to
+@code{last} and @code{$0} respectively.
+
+Finally, if characters are skipped, @code{substr} is used to strip off the
+leading @code{charcount} characters in @code{clast} and @code{cline}.  The
+two strings are then compared, and @code{are_equal} returns the result.
+
+@example
+@c @group
+@c file eg/prog/uniq.awk
+function are_equal(    n, m, clast, cline, alast, aline)
+@{
+    if (fcount == 0 && charcount == 0)
+        return (last == $0)
+
+    if (fcount > 0) @{
+        n = split(last, alast)
+        m = split($0, aline)
+        clast = join(alast, fcount+1, n)
+        cline = join(aline, fcount+1, m)
+    @} else @{
+        clast = last
+        cline = $0
+    @}
+    if (charcount) @{
+        clast = substr(clast, charcount + 1)
+        cline = substr(cline, charcount + 1)
+    @}
+
+    return (clast == cline)
+@}
+@c endfile
+@c @end group
+@end example
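+
+The character-skipping step can be tried in isolation.  This is only an
+illustrative sketch; the sample strings and the value of @code{charcount}
+are made up for the example:
+
+@example
+BEGIN @{
+    charcount = 3
+    last = "abcXYZ"
+    line = "defXYZ"    # stands in for $0
+    clast = substr(last, charcount + 1)
+    cline = substr(line, charcount + 1)
+    print (clast == cline)    # prints 1; both are "XYZ"
+@}
+@end example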
+
+The following two rules are the body of the program.  The first one is
+executed only for the very first line of data.  It sets @code{last} equal to
+@code{$0}, so that subsequent lines of text have something to be compared to.
+
+The second rule does the work. The variable @code{equal} will be one or zero
+depending upon the results of @code{are_equal}'s comparison. If @code{uniq}
+is counting lines (@samp{-c}), then the @code{count} variable is incremented
+if the lines are equal. Otherwise the previous line is printed with its
+count, and @code{count} is reset, since the two lines are not equal.
+
+If @code{uniq} is not counting, @code{count} is still incremented if the
+lines are equal. Otherwise, the previous line is printed if @code{uniq} is
+printing repeated lines and more than one copy has been seen, or if
+@code{uniq} is printing non-repeated lines and only one copy has been seen.
+In either case, @code{count} is then reset.
+
+Finally, similar logic is used in the @code{END} rule to print the final
+line of input data.
+
+@example
+@c @group
+@c file eg/prog/uniq.awk
+@group
+NR == 1 @{
+    last = $0
+    next
+@}
+@end group
+    
+@{
+    equal = are_equal()
+
+    if (do_count) @{    # overrides -d and -u
+        if (equal)
+            count++
+        else @{
+            printf("%4d %s\n", count, last) > outputfile
+            last = $0
+            count = 1    # reset
+        @}
+        next
+    @}
+
+    if (equal)
+        count++
+    else @{
+        if ((repeated_only && count > 1) ||
+            (non_repeated_only && count == 1))
+                print last > outputfile
+        last = $0
+        count = 1
+    @}
+@}
+
+@group
+END @{
+    if (do_count)
+        printf("%4d %s\n", count, last) > outputfile
+    else if ((repeated_only && count > 1) ||
+            (non_repeated_only && count == 1))
+        print last > outputfile
+@}
+@end group
+@c endfile
+@c @end group
+@end example
+
+@node Wc Program,  , Uniq Program, Clones
+@subsection Counting Things
+
+@cindex @code{wc} utility
+The @code{wc} (word count) utility counts lines, words, and characters in
+one or more input files. Its usage is:
+
+@example
+wc @r{[}-lwc@r{]} @r{[} @var{files} @dots{} @r{]}
+@end example
+
+If no files are specified on the command line, @code{wc} reads its standard
+input. If there are multiple files, it will also print total counts for all
+the files.  The options and their meanings are:
+
+@table @code
+@item -l
+Only count lines.
+
+@item -w
+Only count words.
+A ``word'' is a contiguous sequence of non-whitespace characters, separated
+by spaces and/or tabs.  Happily, this is the normal way @code{awk} separates
+fields in its input data.
+
+@item -c
+Only count characters.
+@end table
+
+Implementing @code{wc} in @code{awk} is particularly elegant, since
+@code{awk} does a lot of the work for us; it splits lines into words (i.e.@:
+fields) and counts them, it counts lines (i.e.@: records) for us, and it can
+easily tell us how long a line is.
+
+This version uses the @code{getopt} library function
+(@pxref{Getopt Function, ,Processing Command Line Options}),
+and the file transition functions
+(@pxref{Filetrans Function, ,Noting Data File Boundaries}).
+
+This version has one major difference from traditional versions of @code{wc}.
+Our version always prints the counts in the order lines, words,
+and characters.  Traditional versions note the order of the @samp{-l},
+@samp{-w}, and @samp{-c} options on the command line, and print the counts
+in that order.
+
+The @code{BEGIN} rule does the argument processing.
+The variable @code{print_total} will
+be true if more than one file was named on the command line.
+
+@findex wc.awk
+@example
+@c @group
+@c file eg/prog/wc.awk
+# wc.awk --- count lines, words, characters
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+# Options:
+#    -l    only count lines
+#    -w    only count words
+#    -c    only count characters
+#
+# Default is to count lines, words, characters
+
+BEGIN @{
+    # let getopt print a message about
+    # invalid options. we ignore them
+    while ((c = getopt(ARGC, ARGV, "lwc")) != -1) @{
+        if (c == "l")
+            do_lines = 1
+        else if (c == "w")
+            do_words = 1
+        else if (c == "c")
+            do_chars = 1
+    @}
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+
+    # if no options, do all
+    if (! do_lines && ! do_words && ! do_chars)
+        do_lines = do_words = do_chars = 1
+
+    print_total = (ARGC - i > 1)    # more than one file named
+@}
+@c endfile
+@c @end group
+@end example
+
+The @code{beginfile} function is simple; it just resets the counts of lines,
+words, and characters to zero, and saves the current file name in
+@code{fname}.
+
+The @code{endfile} function adds the current file's numbers to the running
+totals of lines, words, and characters.  It then prints out those numbers
+for the file that was just read. It relies on @code{beginfile} to reset the
+numbers for the following data file.
+
+@example
+@c @group
+@c file eg/prog/wc.awk
+function beginfile(file)
+@{
+    chars = lines = words = 0
+    fname = FILENAME
+@}
+
+function endfile(file)
+@{
+    tchars += chars
+    tlines += lines
+    twords += words
+@group
+    if (do_lines)
+        printf "\t%d", lines
+@end group
+    if (do_words)
+        printf "\t%d", words
+    if (do_chars)
+        printf "\t%d", chars
+    printf "\t%s\n", fname
+@}
+@c endfile
+@c @end group
+@end example
+
+There is one rule that is executed for each line. It adds the length of the
+record to @code{chars}.  It has to add one, since the newline character
+separating records (the value of @code{RS}) is not part of the record
+itself.  @code{lines} is incremented for each line read, and @code{words} is
+incremented by the value of @code{NF}, the number of ``words'' on this
+line.@footnote{Examine the code in
+@ref{Filetrans Function, ,Noting Data File Boundaries}.
+Why must @code{wc} use a separate @code{lines} variable, instead of using
+the value of @code{FNR} in @code{endfile}?}
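+
+The per-line arithmetic is easy to check by hand.  In this small sketch
+(the input text is just an example), the record is 11 characters long,
+so one more is added for the newline, and @code{NF} is two:
+
+@example
+echo "hello world" | awk '@{ print length($0) + 1, NF @}'
+@end example
+
+@noindent
+This prints @samp{12 2}.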
+
+Finally, the @code{END} rule simply prints the totals for all the files.
+
+@example
+@c @group
+@c file eg/prog/wc.awk
+# do per line
+@{
+    chars += length($0) + 1    # get newline
+    lines++
+    words += NF
+@}
+
+END @{
+    if (print_total) @{
+        if (do_lines)
+            printf "\t%d", tlines
+        if (do_words)
+            printf "\t%d", twords
+        if (do_chars)
+            printf "\t%d", tchars
+        print "\ttotal"
+    @}
+@}
+@c endfile
+@c @end group
+@end example
+
+@node Miscellaneous Programs,  , Clones, Sample Programs
+@section A Grab Bag of @code{awk} Programs
+
+This section is a large ``grab bag'' of miscellaneous programs.
+We hope you find them both interesting and enjoyable.
+
+@menu
+* Dupword Program::         Finding duplicated words in a document.
+* Alarm Program::           An alarm clock.
+* Translate Program::       A program similar to the @code{tr} utility.
+* Labels Program::          Printing mailing labels.
+* Word Sorting::            A program to produce a word usage count.
+* History Sorting::         Eliminating duplicate entries from a history
+                            file.
+* Extract Program::         Pulling out programs from Texinfo source
+                            files.
+* Simple Sed::              A Simple Stream Editor.
+* Igawk Program::           A wrapper for @code{awk} that includes files.
+@end menu
+
+@node Dupword Program, Alarm Program, Miscellaneous Programs, Miscellaneous Programs
+@subsection Finding Duplicated Words in a Document
+
+A common error when writing large amounts of prose is to accidentally
+duplicate words.  Often you will see this in text as something like ``the
+the program does the following @dots{}.''  When the text is on-line, often
+the duplicated words occur at the end of one line and the beginning of
+another, making them very difficult to spot.
+@c as here!
+
+This program, @file{dupword.awk}, scans through a file one line at a time,
+and looks for adjacent occurrences of the same word.  It also saves the last
+word on a line (in the variable @code{prev}) for comparison with the first
+word on the next line.
+
+The first statement makes sure that the line is all lower-case, so that,
+for example,
+``The'' and ``the'' compare equal to each other.  The second statement
+removes all non-alphanumeric and non-whitespace characters from the line, so
+that punctuation does not affect the comparison either.  This sometimes
+leads to reports of duplicated words that really are different, but this is
+unusual.
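+
+To see the effect of these two statements on their own, here is a short
+sketch using a made-up input line:
+
+@example
+echo "The the, program" | awk '@{
+    $0 = tolower($0)
+    gsub(/[^A-Za-z0-9 \t]/, "")
+    print
+@}'
+@end example
+
+@noindent
+The output is @samp{the the program}; with the case and punctuation gone,
+the first two fields now compare equal.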
+
+@findex dupword.awk
+@example
+@group
+@c file eg/prog/dupword.awk
+# dupword --- find duplicate words in text
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# December 1991
+
+@{
+    $0 = tolower($0)
+    gsub(/[^A-Za-z0-9 \t]/, "");
+    if ($1 == prev)
+        printf("%s:%d: duplicate %s\n",
+            FILENAME, FNR, $1)
+    for (i = 2; i <= NF; i++)
+        if ($i == $(i-1))
+            printf("%s:%d: duplicate %s\n",
+                FILENAME, FNR, $i)
+    prev = $NF
+@}
+@c endfile
+@end group
+@end example
+
+@node Alarm Program, Translate Program, Dupword Program, Miscellaneous Programs
+@subsection An Alarm Clock Program
+
+The following program is a simple ``alarm clock'' program.
+You give it a time of day, and an optional message.  At the given time,
+it prints the message on the standard output. In addition, you can give it
+the number of times to repeat the message, and also a delay between
+repetitions.
+
+This program uses the @code{gettimeofday} function from
+@ref{Gettimeofday Function, ,Managing the Time of Day}.
+
+All the work is done in the @code{BEGIN} rule.  The first part is argument
+checking and setting of defaults; the delay, the count, and the message to
+print.  If the user supplied a message, but it does not contain the ASCII BEL
+character (known as the ``alert'' character, @samp{\a}), then it is added to
+the message.  (On many systems, printing the ASCII BEL generates some sort
+of audible alert. Thus, when the alarm goes off, the system calls attention
+to itself, in case the user is not looking at their computer or terminal.)
+
+@findex alarm.awk
+@example
+@c @group
+@c file eg/prog/alarm.awk
+# alarm --- set an alarm
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+# usage: alarm time [ "message" [ count [ delay ] ] ]
+
+BEGIN    \
+@{
+    # Initial argument sanity checking
+    usage1 = "usage: alarm time ['message' [count [delay]]]"
+    usage2 = sprintf("\t(%s) time ::= hh:mm", ARGV[1])
+
+    if (ARGC < 2) @{
+        print usage1 > "/dev/stderr"
+        print usage2 > "/dev/stderr"
+        exit 1
+    @} else if (ARGC == 5) @{
+        delay = ARGV[4] + 0
+        count = ARGV[3] + 0
+        message = ARGV[2]
+    @} else if (ARGC == 4) @{
+        count = ARGV[3] + 0
+        message = ARGV[2]
+    @} else if (ARGC == 3) @{
+        message = ARGV[2]
+    @} else if (ARGV[1] !~ /[0-9]?[0-9]:[0-9][0-9]/) @{
+        print usage1 > "/dev/stderr"
+        print usage2 > "/dev/stderr"
+        exit 1
+    @}
+
+    # set defaults for once we reach the desired time
+    if (delay == 0)
+        delay = 180    # 3 minutes
+    if (count == 0)
+        count = 5
+@group
+    if (message == "")
+        message = sprintf("\aIt is now %s!\a", ARGV[1])
+    else if (index(message, "\a") == 0)
+        message = "\a" message "\a"
+@end group
+@c endfile
+@end example
+
+The next section of code turns the alarm time into hours and minutes,
+and converts it if necessary to a 24-hour clock.  Then it turns that
+time into a count of the seconds since midnight.  Next it turns the current
+time into a count of seconds since midnight.  The difference between the two
+is how long to wait before setting off the alarm.
+
+@example
+@c @group
+@c file eg/prog/alarm.awk
+    # split up dest time
+    split(ARGV[1], atime, ":")
+    hour = atime[1] + 0    # force numeric
+    minute = atime[2] + 0  # force numeric
+
+    # get current broken down time
+    gettimeofday(now)
+
+    # if time given is 12-hour hours and it's after that
+    # hour, e.g., `alarm 5:30' at 9 a.m. means 5:30 p.m.,
+    # then add 12 to real hour
+    if (hour < 12 && now["hour"] > hour)
+        hour += 12
+
+    # set target time in seconds since midnight
+    target = (hour * 60 * 60) + (minute * 60)
+
+    # get current time in seconds since midnight
+    current = (now["hour"] * 60 * 60) + \
+               (now["minute"] * 60) + now["second"]
+
+    # how long to sleep for
+    naptime = target - current
+    if (naptime <= 0) @{
+        print "time is in the past!" > "/dev/stderr"
+        exit 1
+    @}
+@c endfile
+@c @end group
+@end example
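+
+As a worked check of the arithmetic, consider an alarm time of
+@samp{5:30}, and assume (for this sketch only) that it is already past
+5 a.m., so that 12 is added to the hour:
+
+@example
+BEGIN @{
+    split("5:30", atime, ":")
+    hour = atime[1] + 0
+    minute = atime[2] + 0
+    hour += 12    # assumed: current hour is later than 5
+    target = (hour * 60 * 60) + (minute * 60)
+    print target    # prints 63000, i.e., 17:30
+@}
+@end example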
+
+Finally, the program uses the @code{system} function
+(@pxref{I/O Functions, ,Built-in Functions for Input/Output})
+to call the @code{sleep} utility.  The @code{sleep} utility simply pauses
+for the given number of seconds.  If the exit status is not zero,
+the program assumes that @code{sleep} was interrupted, and exits. If
+@code{sleep} exited with an OK status (zero), then the program prints the
+message in a loop, again using @code{sleep} to delay for however many
+seconds are necessary.
+
+@example
+@c @group
+@c file eg/prog/alarm.awk
+    # zzzzzz..... go away if interrupted
+    if (system(sprintf("sleep %d", naptime)) != 0)
+        exit 1
+
+    # time to notify!
+    command = sprintf("sleep %d", delay)
+    for (i = 1; i <= count; i++) @{
+        print message
+        # if sleep command interrupted, go away
+        if (system(command) != 0)
+            break
+    @}
+
+    exit 0
+@}
+@c endfile
+@c @end group
+@end example
+
+@node Translate Program, Labels Program, Alarm Program, Miscellaneous Programs
+@subsection Transliterating Characters
+
+The system @code{tr} utility transliterates characters.  For example, it is
+often used to map upper-case letters into lower-case, for further
+processing.
+
+@example
+@var{generate data} | tr '[A-Z]' '[a-z]' | @var{process data} @dots{}
+@end example
+
+You give @code{tr} two lists of characters enclosed in square brackets.
+Usually, the lists are quoted to keep the shell from attempting to do a
+filename expansion.@footnote{On older, non-POSIX systems, @code{tr} often
+does not require that the lists be enclosed in square brackets and quoted.
+This is a feature.}  When processing the input, the
+first character in the first list is replaced with the first character in the
+second list, the second character in the first list is replaced with the
+second character in the second list, and so on.
+If there are more characters in the ``from'' list than in the ``to'' list,
+the last character of the ``to'' list is used for the remaining characters
+in the ``from'' list.
+
+Some time ago,
+@c early or mid-1989!
+a user proposed to us that we add a transliteration function to @code{gawk}.
+Being opposed to ``creeping featurism,'' I wrote the following program to
+prove that character transliteration could be done with a user-level
+function.  This program is not as complete as the system @code{tr} utility,
+but it will do most of the job.
+
+The @code{translate} program demonstrates one of the few weaknesses of
+standard
+@code{awk}: dealing with individual characters is very painful, requiring
+repeated use of the @code{substr}, @code{index}, and @code{gsub} built-in
+functions
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).@footnote{This
+program was written before @code{gawk} acquired the ability to
+split each character in a string into separate array elements.
+How might this ability simplify the program?}
+
+There are two functions.  The first, @code{stranslate}, takes three
+arguments.
+
+@table @code
+@item from
+A list of characters to translate from.
+
+@item to
+A list of characters to translate to.
+
+@item target
+The string to do the translation on.
+@end table
+
+Associative arrays make the translation part fairly easy. @code{t_ar} holds
+the ``to'' characters, indexed by the ``from'' characters.  Then a simple
+loop goes through @code{from}, one character at a time.  For each character
+in @code{from}, if the character appears in @code{target}, @code{gsub}
+is used to change it to the corresponding @code{to} character.
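+
+The following sketch shows just the table-building step, including the
+padding rule for a short ``to'' list.  The lists @samp{abc} and @samp{xy}
+are invented for the example, and the loop is condensed from the one in
+the program:
+
+@example
+BEGIN @{
+    from = "abc"; to = "xy"
+    lf = length(from); lt = length(to)
+    for (i = 1; i <= lf; i++)
+        t_ar[substr(from, i, 1)] = substr(to, (i <= lt ? i : lt), 1)
+    print t_ar["a"], t_ar["b"], t_ar["c"]    # prints x y y
+@}
+@end example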
+
+The @code{translate} function simply calls @code{stranslate} using @code{$0}
+as the target.  The main program sets two global variables, @code{FROM} and
+@code{TO}, from the command line, and then changes @code{ARGV} so that
+@code{awk} will read from the standard input.
+
+Finally, the processing rule simply calls @code{translate} for each record.
+
+@findex translate.awk
+@example
+@c @group
+@c file eg/prog/translate.awk
+# translate --- do tr like stuff
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# August 1989
+
+# bugs: does not handle things like: tr A-Z a-z, it has
+# to be spelled out. However, if `to' is shorter than `from',
+# the last character in `to' is used for the rest of `from'.
+
+function stranslate(from, to, target,     lf, lt, t_ar, i, c)
+@{
+    lf = length(from)
+    lt = length(to)
+    for (i = 1; i <= lt; i++)
+        t_ar[substr(from, i, 1)] = substr(to, i, 1)
+    if (lt < lf)
+        for (; i <= lf; i++)
+            t_ar[substr(from, i, 1)] = substr(to, lt, 1)
+    for (i = 1; i <= lf; i++) @{
+        c = substr(from, i, 1)
+        if (index(target, c) > 0)
+            gsub(c, t_ar[c], target)
+    @}
+    return target
+@}
+
+@group
+function translate(from, to)
+@{
+    return $0 = stranslate(from, to, $0)
+@}
+@end group
+
+# main program
+BEGIN @{
+    if (ARGC < 3) @{
+        print "usage: translate from to" > "/dev/stderr"
+        exit 1
+    @}
+    FROM = ARGV[1]
+    TO = ARGV[2]
+    ARGC = 2
+    ARGV[1] = "-"
+@}
+
+@{
+    translate(FROM, TO)
+    print
+@}
+@c endfile
+@c @end group
+@end example
+
+While it is possible to do character transliteration in a user-level
+function, it is not necessarily efficient, and we started to consider adding
+a built-in function.  However, shortly after writing this program, we learned
+that the System V Release 4 @code{awk} had added the @code{toupper} and
+@code{tolower} functions.  These functions handle the vast majority of the
+cases where character transliteration is necessary, and so we chose to
+simply add those functions to @code{gawk} as well, and then leave well
+enough alone.
+
+An obvious improvement to this program would be to set up the
+@code{t_ar} array only once, in a @code{BEGIN} rule. However, this
+assumes that the ``from'' and ``to'' lists
+will never change throughout the lifetime of the program.
+
+@node Labels Program, Word Sorting, Translate Program, Miscellaneous Programs
+@subsection Printing Mailing Labels
+
+Here is a ``real world''@footnote{``Real world'' is defined as
+``a program actually used to get something done.''}
+program.  This script reads lists of names and
+addresses, and generates mailing labels.  Each page of labels has 20 labels
+on it, two across and ten down.  The addresses are guaranteed to be no more
+than five lines of data.  Each address is separated from the next by a blank
+line.
+
+The basic idea is to read 20 labels worth of data.  Each line of each label
+is stored in the @code{line} array.  The single rule takes care of filling
+the @code{line} array and printing the page when 20 labels have been read.
+
+The @code{BEGIN} rule simply sets @code{RS} to the empty string, so that
+@code{awk} will split records at blank lines
+(@pxref{Records, ,How Input is Split into Records}).
+It sets @code{MAXLINES} to 100, since @code{MAXLINES} is the maximum number
+of lines on the page (20 * 5 = 100).
+
+Most of the work is done in the @code{printpage} function.
+The label lines are stored sequentially in the @code{line} array.  But they
+have to be printed horizontally; @code{line[1]} next to @code{line[6]},
+@code{line[2]} next to @code{line[7]}, and so on.  Two loops are used to
+accomplish this.  The outer loop, controlled by @code{i}, steps through
+every 10 lines of data; this is each row of labels.  The inner loop,
+controlled by @code{j}, goes through the lines within the row.
+As @code{j} goes from zero to four, @samp{i+j} is the @code{j}'th line in
+the row, and @samp{i+j+5} is the entry next to it.  The output ends up
+looking something like this:
+
+@example
+line 1          line 6
+line 2          line 7
+line 3          line 8
+line 4          line 9
+line 5          line 10
+@end example
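+
+The index arithmetic for the first row of labels can be checked with a
+small sketch (this loop only prints subscripts; it is not part of
+@file{labels.awk}):
+
+@example
+BEGIN @{
+    i = 1
+    for (j = 0; j < 5; j++)
+        printf "line[%d] next to line[%d]\n", i + j, i + j + 5
+@}
+@end example
+
+@noindent
+It pairs 1 with 6, 2 with 7, and so on up to 5 with 10.  For the next
+row, @code{i} is 11, giving 11 with 16 through 15 with 20.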
+
+As a final note, at lines 21 and 61, an extra blank line is printed, to keep
+the output lined up on the labels.  This is dependent on the particular
+brand of labels in use when the program was written.  You will also note
+that there are two blank lines at the top and two blank lines at the bottom.
+
+The @code{END} rule arranges to flush the final page of labels; there may
+not have been an even multiple of 20 labels in the data.
+
+@findex labels.awk
+@example
+@c @group
+@c file eg/prog/labels.awk
+# labels.awk
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# June 1992
+
+# Program to print labels.  Each label is 5 lines of data
+# that may have blank lines.  The label sheets have 2
+# blank lines at the top and 2 at the bottom.
+
+BEGIN    @{ RS = "" ; MAXLINES = 100 @}
+
+function printpage(    i, j)
+@{
+    if (Nlines <= 0)
+        return
+
+    printf "\n\n"        # header
+
+    for (i = 1; i <= Nlines; i += 10) @{
+        if (i == 21 || i == 61)
+            print ""
+        for (j = 0; j < 5; j++) @{
+            if (i + j > MAXLINES)
+                break
+            printf "   %-41s %s\n", line[i+j], line[i+j+5]
+        @}
+        print ""
+    @}
+
+    printf "\n\n"        # footer
+
+    for (i in line)
+        line[i] = ""
+@}
+
+# main rule
+@{
+    if (Count >= 20) @{
+        printpage()
+        Count = 0
+        Nlines = 0
+    @}
+    n = split($0, a, "\n")
+    for (i = 1; i <= n; i++)
+        line[++Nlines] = a[i]
+    for (; i <= 5; i++)
+        line[++Nlines] = ""
+    Count++
+@}
+
+END    \
+@{
+    printpage()
+@}
+@c endfile
+@c @end group
+@end example
+
+@node Word Sorting, History Sorting, Labels Program, Miscellaneous Programs
+@subsection Generating Word Usage Counts
+
+The following @code{awk} program prints
+the number of occurrences of each word in its input.  It illustrates the
+associative nature of @code{awk} arrays by using strings as subscripts.  It
+also demonstrates the @samp{for @var{x} in @var{array}} construction.
+Finally, it shows how @code{awk} can be used in conjunction with other
+utility programs to do a useful task of some complexity with a minimum of
+effort.  Some explanations follow the program listing.
+
+@example
+awk '
+# Print list of word frequencies
+@{
+    for (i = 1; i <= NF; i++)
+        freq[$i]++
+@}
+
+END @{
+    for (word in freq)
+        printf "%s\t%d\n", word, freq[word]
+@}'
+@end example
+
+The first thing to notice about this program is that it has two rules.  The
+first rule, because it has an empty pattern, is executed on every line of
+the input.  It uses @code{awk}'s field-accessing mechanism
+(@pxref{Fields, ,Examining Fields}) to pick out the individual words from
+the line, and the built-in variable @code{NF} (@pxref{Built-in Variables})
+to know how many fields are available.
+
+For each input word, an element of the array @code{freq} is incremented to
+reflect that the word has been seen an additional time.
+
+The second rule, because it has the pattern @code{END}, is not executed
+until the input has been exhausted.  It prints out the contents of the
+@code{freq} table that has been built up inside the first action.
+
+This program has several problems that would prevent it from being
+useful by itself on real text files:
+
+@itemize @bullet
+@item
+Words are detected using the @code{awk} convention that fields are
+separated by whitespace and that other characters in the input (except
+newlines) don't have any special meaning to @code{awk}.  This means that
+punctuation characters count as part of words.
+
+@item
+The @code{awk} language considers upper- and lower-case characters to be
+distinct.  Therefore, @samp{bartender} and @samp{Bartender} are not treated
+as the same word.  This is undesirable since, in normal text, words
+are capitalized if they begin sentences, and a frequency analyzer should not
+be sensitive to capitalization.
+
+@iftex
+@page
+@end iftex
+@item
+The output does not come out in any useful order.  You're more likely to be
+interested in which words occur most frequently, or having an alphabetized
+table of how frequently each word occurs.
+@end itemize
+
+The way to solve these problems is to use some of the more advanced
+features of the @code{awk} language.  First, we use @code{tolower} to remove
+case distinctions.  Next, we use @code{gsub} to remove punctuation
+characters.  Finally, we use the system @code{sort} utility to process the
+output of the @code{awk} script.  Here is the new version of
+the program:
+
+@findex wordfreq.awk
+@example
+@c file eg/prog/wordfreq.awk
+# Print list of word frequencies
+@{
+    $0 = tolower($0)    # remove case distinctions
+    gsub(/[^a-z0-9_ \t]/, "", $0)  # remove punctuation
+    for (i = 1; i <= NF; i++)
+        freq[$i]++
+@}
+@c endfile
+
+END @{
+    for (word in freq)
+        printf "%s\t%d\n", word, freq[word]
+@}
+@end example
+
+Assuming we have saved this program in a file named @file{wordfreq.awk},
+and that the data is in @file{file1}, the following pipeline
+
+@example
+awk -f wordfreq.awk file1 | sort +1 -nr
+@end example
+
+@noindent
+produces a table of the words appearing in @file{file1} in order of
+decreasing frequency.
+
+The @code{awk} program suitably massages the data and produces a word
+frequency table, which is not ordered.
+
+The @code{awk} script's output is then sorted by the @code{sort} utility and
+printed on the terminal.  The options given to @code{sort} in this example
+tell it to sort on the second field of each input line (skipping one field),
+to treat the sort keys as numeric quantities (otherwise
+@samp{15} would come before @samp{5}), and to sort
+in descending (reverse) order.
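+
+The @samp{+1} notation is the historical way to tell @code{sort} to skip
+one field.  As a hedged aside, POSIX-conforming versions of @code{sort}
+spell the same key as @samp{-k 2}, and some newer versions may not accept
+the @samp{+1} form at all.  The following sketch runs the word counting
+rules over two made-up lines and sorts the result that way; the sample
+text and the shell function name @code{wordfreq_demo} are illustrative
+assumptions, not part of the original program.
```shell
# Sketch: count words, then sort on the second (count) field, largest first.
# Assumes a POSIX awk and a sort that understands -k.
wordfreq_demo() {
    printf 'the cat\nThe dog, the bird\n' |
    awk '{
        $0 = tolower($0)                  # remove case distinctions
        gsub(/[^a-z0-9_ \t]/, "", $0)     # remove punctuation
        for (i = 1; i <= NF; i++)
            freq[$i]++
    }
    END {
        for (word in freq)
            printf "%s\t%d\n", word, freq[word]
    }' |
    sort -k 2nr
}
wordfreq_demo
```
+The first line of output is @samp{the}, a tab, and @samp{3}; the
+three words that occur once follow it.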
+
+We could have even done the @code{sort} from within the program, by
+changing the @code{END} action to:
+
+@example
+@c file eg/prog/wordfreq.awk
+END @{
+    sort = "sort +1 -nr"
+    for (word in freq)
+        printf "%s\t%d\n", word, freq[word] | sort
+    close(sort)
+@}
+@c endfile
+@end example
+
+You would have to use this way of sorting on systems that do not
+have true pipes.
+
+See the general operating system documentation for more information on how
+to use the @code{sort} program.
+
+@node History Sorting, Extract Program, Word Sorting, Miscellaneous Programs
+@subsection Removing Duplicates from Unsorted Text
+
+The @code{uniq} program
+(@pxref{Uniq Program, ,Printing Non-duplicated Lines of Text}),
+removes duplicate lines from @emph{sorted} data.
+
+Suppose, however, that you need to remove duplicate lines from a data file
+while preserving the order the lines are in.  A good example of
+this might be a shell history file.  The history file keeps a copy of all
+the commands you have entered, and it is not unusual to repeat a command
+several times in a row.  Occasionally you might wish to compact the history
+by removing duplicate entries.  Yet it is desirable to maintain the order
+of the original commands.
+
+This simple program does the job.  It uses two arrays.  The @code{data}
+array is indexed by the text of each line.
+For each line, @code{data[$0]} is incremented.
+
+If a particular line has not
+been seen before, then @code{data[$0]} will be zero.
+In that case, the text of the line is stored in @code{lines[count]}.
+Each element of @code{lines} is a unique command, and the indices of
+@code{lines} indicate the order in which those lines were encountered.
+The @code{END} rule simply prints out the lines, in order.
+
+@cindex Rakitzis, Byron
+@findex histsort.awk
+@example
+@group
+@c file eg/prog/histsort.awk
+# histsort.awk --- compact a shell history file
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+# Thanks to Byron Rakitzis for the general idea
+@{
+    if (data[$0]++ == 0)
+        lines[++count] = $0
+@}
+
+END @{
+    for (i = 1; i <= count; i++)
+        print lines[i]
+@}
+@c endfile
+@end group
+@end example
+
+This program also provides a foundation for generating other useful
+information.  For example, using the following @code{print} statement in the
+@code{END} rule would indicate how often a particular command was used.
+
+@example
+print data[lines[i]], lines[i]
+@end example
+
+This works because @code{data[$0]} was incremented each time a line was
+seen.
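+
+Putting the pieces together, a minimal sketch of the counting variant
+might look like this.  The sample history lines and the shell function
+name @code{histcount_demo} are made up for illustration.
```shell
# Sketch: histsort.awk with the alternative print statement in END.
histcount_demo() {
    printf 'ls\ncd /tmp\nls\nmake\nls\n' |
    awk '{
        if (data[$0]++ == 0)      # first time this line is seen
            lines[++count] = $0   # remember it, in order of arrival
    }
    END {
        for (i = 1; i <= count; i++)
            print data[lines[i]], lines[i]
    }'
}
histcount_demo
```
+Each distinct command is printed once, in its original order, preceded
+by the number of times it occurred.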
+
+@node Extract Program, Simple Sed, History Sorting, Miscellaneous Programs
+@subsection Extracting Programs from Texinfo Source Files
+
+@iftex
+Both this chapter and the previous chapter
+(@ref{Library Functions, ,A Library of @code{awk} Functions}),
+present a large number of @code{awk} programs.
+@end iftex
+@ifinfo
+The nodes
+@ref{Library Functions, ,A Library of @code{awk} Functions},
+and @ref{Sample Programs, ,Practical @code{awk} Programs},
+are the top level nodes for a large number of @code{awk} programs.
+@end ifinfo
+If you wish to experiment with these programs, it is tedious to have to type
+them in by hand.  Here we present a program that can extract parts of a
+Texinfo input file into separate files.
+
+This @value{DOCUMENT} is written in Texinfo, the GNU project's document
+formatting language.  A single Texinfo source file can be used to produce both
+printed and on-line documentation.
+@iftex
+Texinfo is fully documented in @cite{Texinfo---The GNU Documentation Format},
+available from the Free Software Foundation.
+@end iftex
+@ifinfo
+The Texinfo language is described fully, starting with
+@ref{Top, , Introduction, texi, Texinfo---The GNU Documentation Format}.
+@end ifinfo
+
+For our purposes, it is enough to know three things about Texinfo input
+files.
+
+@itemize @bullet
+@item
+The ``at'' symbol, @samp{@@}, is special in Texinfo, much like @samp{\} in C
+or @code{awk}.  Literal @samp{@@} symbols are represented in Texinfo source
+files as @samp{@@@@}.
+
+@item
+Comments start with either @samp{@@c} or @samp{@@comment}.
+The file extraction program will work by using special comments that start
+at the beginning of a line.
+
+@item
+Example text that should not be split across a page boundary is bracketed
+between lines containing @samp{@@group} and @samp{@@end group} commands.
+@end itemize
+
+The following program, @file{extract.awk}, reads through a Texinfo source
+file, and does two things, based on the special comments.
+Upon seeing @samp{@w{@@c system @dots{}}},
+it runs a command, by extracting the command text from the
+control line and passing it on to the @code{system} function
+(@pxref{I/O Functions, ,Built-in Functions for Input/Output}).
+Upon seeing @samp{@@c file @var{filename}}, each subsequent line is sent to
+the file @var{filename}, until @samp{@@c endfile} is encountered.
+The rules in @file{extract.awk} will match either @samp{@@c} or
+@samp{@@comment} by letting the @samp{omment} part be optional.
+Lines containing @samp{@@group} and @samp{@@end group} are simply removed.
+@file{extract.awk} uses the @code{join} library function
+(@pxref{Join Function, ,Merging an Array Into a String}).
+
+The example programs in the on-line Texinfo source for @cite{@value{TITLE}}
+(@file{gawk.texi}) have all been bracketed inside @samp{file}
+and @samp{endfile} lines.  The @code{gawk} distribution uses a copy of
+@file{extract.awk} to extract the sample
+programs and install many of them in a standard directory, where
+@code{gawk} can find them.
+
+@file{extract.awk} begins by setting @code{IGNORECASE} to one, so that
+mixed upper-case and lower-case letters in the directives won't matter.
+
+The first rule handles calling @code{system}, checking that a command was
+given (@code{NF} is at least three), and also checking that the command
+exited with a zero exit status, signifying OK.
+
+@findex extract.awk
+@example
+@c @group
+@c file eg/prog/extract.awk
+# extract.awk --- extract files and run programs
+#                 from texinfo files
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# May 1993
+
+BEGIN    @{ IGNORECASE = 1 @}
+
+@group
+/^@@c(omment)?[ \t]+system/    \
+@{
+    if (NF < 3) @{
+        e = (FILENAME ":" FNR)
+        e = (e  ": badly formed `system' line")
+        print e > "/dev/stderr"
+        next
+    @}
+    $1 = ""
+    $2 = ""
+    stat = system($0)
+    if (stat != 0) @{
+        e = (FILENAME ":" FNR)
+        e = (e ": warning: system returned " stat)
+        print e > "/dev/stderr"
+    @}
+@}
+@end group
+@c endfile
+@end example
+
+@noindent
+The variable @code{e} is used so that the rule
+fits nicely on the
+@iftex
+page.
+@end iftex
+@ifinfo
+screen.
+@end ifinfo
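+
+To see what clearing @code{$1} and @code{$2} does, here is a cut-down
+sketch of the rule, fed two made-up directive lines.  The shell function
+name @code{system_demo} and the sample commands are assumptions for
+illustration; the error reporting is reduced to a single @code{print}.
```shell
# Sketch: run the command named on an "@c system" line and check its status.
system_demo() {
    printf '@c system echo hello\n@c system exit 3\n' |
    awk '/^@c(omment)?[ \t]+system/ {
        $1 = ""                   # drop the "@c" ...
        $2 = ""                   # ... and the "system" keyword
        stat = system($0)         # $0 now holds only the command text
        if (stat != 0)
            print "warning: system returned " stat
    }'
}
system_demo
```
+Assigning the null string to a field rebuilds @code{$0}, so the command
+passed to @code{system} is just the remaining text, with some leading
+blanks that the shell ignores.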
+
+The second rule handles moving data into files.  It verifies that a file
+name was given in the directive.  If the file named is not the current file,
+then the current file is closed.  Note that @code{curfile} is not reset when
+an @samp{@@c endfile} line is seen; the previous file simply stays open
+until a directive names a different file.
+
+The @samp{for} loop does the work.  It reads lines using @code{getline}
+(@pxref{Getline, ,Explicit Input with @code{getline}}).
+For an unexpected end of file, it calls the @code{@w{unexpected_eof}}
+function.  If the line is an ``endfile'' line, then it breaks out of
+the loop.
+If the line is an @samp{@@group} or @samp{@@end group} line, then it
+ignores it, and goes on to the next line.
+
+Most of the work is in the following few lines.  If the line has no @samp{@@}
+symbols, it can be printed directly.  Otherwise, each leading @samp{@@} must be
+stripped off.
+
+To remove the @samp{@@} symbols, the line is split into separate elements of
+the array @code{a}, using the @code{split} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+Each element of @code{a} that is empty indicates two successive @samp{@@}
+symbols in the original line.  For each two empty elements (@samp{@@@@} in
+the original file), we have to add back in a single @samp{@@} symbol.
+
+When the processing of the array is finished, @code{join} is called with the
+value of @code{SUBSEP}, to rejoin the pieces back into a single
+line.  That line is then printed to the output file.
+
+@example
+@c @group
+@c file eg/prog/extract.awk
+/^@@c(omment)?[ \t]+file/    \
+@{
+@group
+    if (NF != 3) @{
+        e = (FILENAME ":" FNR ": badly formed `file' line")
+        print e > "/dev/stderr"
+        next
+    @}
+@end group
+    if ($3 != curfile) @{
+        if (curfile != "")
+            close(curfile)
+        curfile = $3
+    @}
+
+    for (;;) @{
+        if ((getline line) <= 0)
+            unexpected_eof()
+        if (line ~ /^@@c(omment)?[ \t]+endfile/)
+            break
+        else if (line ~ /^@@(end[ \t]+)?group/)
+            continue
+        if (index(line, "@@") == 0) @{
+            print line > curfile
+            continue
+        @}
+        n = split(line, a, "@@")
+@group
+        # if a[1] == "", means leading @@,
+        # don't add one back in.
+@end group
+        for (i = 2; i <= n; i++) @{
+            if (a[i] == "") @{ # was an @@@@
+                a[i] = "@@"
+                if (a[i+1] == "")
+                    i++
+            @}
+        @}
+        print join(a, 1, n, SUBSEP) > curfile
+    @}
+@}
+@c endfile
+@c @end group
+@end example
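+
+The unescaping loop can be tried in isolation.  In the sketch below, the
+function @code{glue} is a hypothetical stand-in for the library
+@code{join} function called with @code{SUBSEP} (it concatenates the
+pieces with no separator), and the two input lines are made-up Texinfo
+source.  The shell function name @code{unescape_demo} is also just for
+illustration.
```shell
# Sketch: strip single "@" characters and turn each doubled "@" into one.
unescape_demo() {
    printf '%s\n' 'mail arnold@@gnu.ai.mit.edu' '@{ print @}' |
    awk '
    # stand-in for the library join(): concatenate a[1..n], no separator
    function glue(a, n,    i, s) {
        for (i = 1; i <= n; i++)
            s = s a[i]
        return s
    }
    {
        n = split($0, a, "@")
        for (i = 2; i <= n; i++) {
            if (a[i] == "") {     # empty piece: two "@" in a row
                a[i] = "@"
                if (a[i+1] == "")
                    i++
            }
        }
        print glue(a, n)
    }'
}
unescape_demo
```
+The first line comes out with a single at-sign in the mail address, and
+the second comes out as plain braces around @samp{print}.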
+
+An important thing to note is the use of the @samp{>} redirection.
+Output done with @samp{>} only opens the file once; it stays open and
+subsequent output is appended to the file
+(@pxref{Redirection, , Redirecting Output of @code{print} and @code{printf}}).
+This allows us to easily mix program text and explanatory prose for the same
+sample source file (as has been done here!) without any hassle.  The file is
+only closed when a new data file name is encountered, or at the end of the
+input file.
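+
+This behavior is easy to check with a small experiment.  The sketch
+below is only an illustration; the temporary file name and the shell
+function name @code{redirect_demo} are assumptions.
```shell
# Sketch: within one awk run, ">" truncates the file once, then appends.
redirect_demo() {
    out="${TMPDIR:-/tmp}/redirect_demo.$$"
    echo "stale contents" > "$out"
    awk -v out="$out" 'BEGIN {
        print "first"  > out      # opens the file, discarding old contents
        print "second" > out      # same open file: goes after "first"
        close(out)
    }'
    cat "$out"
    rm -f "$out"
}
redirect_demo
```
+The file ends up holding both lines, and the stale contents are gone.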
+
+Finally, the function @code{@w{unexpected_eof}} prints an appropriate
+error message and then exits.
+
+The @code{END} rule handles the final cleanup, closing the open file.
+
+@example
+@c file eg/prog/extract.awk
+@group
+function unexpected_eof()
+@{
+    printf("%s:%d: unexpected EOF or error\n", \
+        FILENAME, FNR) > "/dev/stderr"
+    exit 1
+@}
+@end group
+
+END @{
+    if (curfile)
+        close(curfile)
+@}
+@c endfile
+@end example
+
+@node Simple Sed, Igawk Program, Extract Program, Miscellaneous Programs
+@subsection A Simple Stream Editor
+
+@cindex @code{sed} utility
+The @code{sed} utility is a ``stream editor,'' a program that reads a
+stream of data, makes changes to it, and passes the modified data on.
+It is often used to make global changes to a large file, or to a stream
+of data generated by a pipeline of commands.
+
+While @code{sed} is a complicated program in its own right, its most common
+use is to perform global substitutions in the middle of a pipeline:
+
+@example
+command1 < orig.data | sed 's/old/new/g' | command2 > result
+@end example
+
+Here, the @samp{s/old/new/g} tells @code{sed} to look for the regexp
+@samp{old} on each input line, and replace it with the text @samp{new},
+globally (i.e.@: all the occurrences on a line).  This is similar to
+@code{awk}'s @code{gsub} function
+(@pxref{String Functions, , Built-in Functions for String Manipulation}).
+
+The following program, @file{awksed.awk}, accepts at least two command line
+arguments: the pattern to look for and the text to replace it with.  Any
+additional arguments are treated as data file names to process. If none
+are provided, the standard input is used.
+
+@cindex Brennan, Michael
+@cindex @code{awksed}
+@cindex simple stream editor
+@cindex stream editor, simple
+@example
+@c @group
+@c file eg/prog/awksed.awk
+# awksed.awk --- do s/foo/bar/g using just print
+#    Thanks to Michael Brennan for the idea
+
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# August 1995
+
+function usage()
+@{
+    print "usage: awksed pat repl [files...]" > "/dev/stderr"
+    exit 1
+@}
+
+BEGIN @{
+    # validate arguments
+    if (ARGC < 3)
+        usage()
+
+    RS = ARGV[1]
+    ORS = ARGV[2]
+
+    # don't use arguments as files
+    ARGV[1] = ARGV[2] = ""
+@}
+
+# look ma, no hands!
+@{
+    if (RT == "")
+        printf "%s", $0
+    else
+        print
+@}
+@c endfile
+@c @end group
+@end example
+
+The program relies on @code{gawk}'s ability to have @code{RS} be a regexp
+and on the setting of @code{RT} to the actual text that terminated the
+record (@pxref{Records, ,How Input is Split into Records}).
+
+The idea is to have @code{RS} be the pattern to look for. @code{gawk}
+will automatically set @code{$0} to the text between matches of the pattern.
+This is text that we wish to keep, unmodified.  Then, by setting @code{ORS}
+to the replacement text, a simple @code{print} statement will output the
+text we wish to keep, followed by the replacement text.
+
+There is one wrinkle to this scheme: what should happen if the last record
+doesn't end with text that matches @code{RS}?  Using a @code{print}
+statement unconditionally prints the replacement text, which is not correct.
+
+However, if the file did not end in text that matches @code{RS}, @code{RT}
+will be set to the null string.  In this case, we can print @code{$0} using
+@code{printf}
+(@pxref{Printf, ,Using @code{printf} Statements for Fancier Printing}).
+
+The @code{BEGIN} rule handles the setup, checking for the right number
+of arguments, and calling @code{usage} if there is a problem. Then it sets
+@code{RS} and @code{ORS} from the command line arguments, and sets
+@code{ARGV[1]} and @code{ARGV[2]} to the null string, so that they will
+not be treated as file names
+(@pxref{ARGC and ARGV, , Using @code{ARGC} and @code{ARGV}}).
+
+The @code{usage} function prints an error message and exits.
+
+Finally, the single rule handles the printing scheme outlined above,
+using @code{print} or @code{printf} as appropriate, depending upon the
+value of @code{RT}.
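+
+For comparison, a more conventional approach calls @code{gsub} on each
+input line.  The sketch below is not @file{awksed.awk}; it is a hedged
+illustration that should run in any POSIX @code{awk}, since it does not
+depend on a regexp @code{RS} or on @code{RT}.  The sample input and the
+shell function name @code{awksub_demo} are assumptions.
```shell
# Sketch: do s/old/new/g with gsub() instead of RS and ORS.
awksub_demo() {
    printf 'old hat, old shoes\nnothing here\n' |
    awk 'BEGIN {
        pat = ARGV[1]
        repl = ARGV[2]
        ARGV[1] = ARGV[2] = ""    # do not treat them as file names
    }
    { gsub(pat, repl); print }' old new
}
awksub_demo
```
+Unlike the @code{RS} version, this one works a line at a time, and an
+@samp{&} in the replacement text stands for the matched text.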
+
+@ignore
+Exercise, compare the performance of this version with the more
+straightforward:
+
+BEGIN {
+    pat = ARGV[1]
+    repl = ARGV[2]
+    ARGV[1] = ARGV[2] = ""
+}
+
+{ gsub(pat, repl); print }
+
+Exercise: what are the advantages and disadvantages of this version vs. sed?
+  Advantage: egrep regexps
+             speed (?)
+  Disadvantage: no & in replacement text
+
+Others?
+@end ignore
+
+@node Igawk Program, , Simple Sed, Miscellaneous Programs
+@subsection An Easy Way to Use Library Functions
+
+Using library functions in @code{awk} can be very beneficial. It
+encourages code re-use and the writing of general functions. Programs are
+smaller, and therefore clearer.
+However, using library functions is only easy when writing @code{awk}
+programs; it is painful when running them, requiring multiple @samp{-f}
+options.  If @code{gawk} is unavailable, then so too is the @code{AWKPATH}
+environment variable and the ability to put @code{awk} functions into a
+library directory (@pxref{Options, ,Command Line Options}).
+
+It would be nice to be able to write programs like so:
+
+@example
+# library functions
+@@include getopt.awk
+@@include join.awk
+@dots{}
+
+# main program
+BEGIN @{
+    while ((c = getopt(ARGC, ARGV, "a:b:cde")) != -1)
+        @dots{}
+    @dots{}
+@}
+@end example
+
+The following program, @file{igawk.sh}, provides this service.
+It simulates @code{gawk}'s searching of the @code{AWKPATH} variable,
+and also allows @dfn{nested} includes; i.e.@: a file that has been included
+with @samp{@@include} can contain further @samp{@@include} statements.
+@code{igawk} will make an effort to only include files once, so that nested
+includes don't accidentally include a library function twice.
+
+@code{igawk} should behave externally just like @code{gawk}.  This means it
+should accept all of @code{gawk}'s command line arguments, including the
+ability to have multiple source files specified via @samp{-f}, and the
+ability to mix command line and library source files.
+
+The program is written using the POSIX Shell (@code{sh}) command language.
+The way the program works is as follows:
+
+@enumerate
+@item
+Loop through the arguments, saving anything that doesn't represent
+@code{awk} source code for later, when the expanded program is run.
+
+@item
+For any arguments that do represent @code{awk} text, put the arguments into
+a temporary file that will be expanded.  There are two cases.
+
+@enumerate a
+@item
+Literal text, provided with @samp{--source} or @samp{--source=}.  This
+text is just echoed directly.  The @code{echo} program will automatically
+supply a trailing newline.
+
+@item
+File names provided with @samp{-f}.  We use a neat trick, and echo
+@samp{@@include @var{filename}} into the temporary file.  Since the file
+inclusion program will work the way @code{gawk} does, this will get the text
+of the file included into the program at the correct point.
+@end enumerate
+
+@item
+Run an @code{awk} program (naturally) over the temporary file to expand
+@samp{@@include} statements.  The expanded program is placed in a second
+temporary file.
+
+@item
+Run the expanded program with @code{gawk} and any other original command line
+arguments that the user supplied (such as the data file names).
+@end enumerate
+
+The initial part of the program turns on shell tracing if the first
+argument was @samp{debug}.  Otherwise, a shell @code{trap} statement
+arranges to clean up any temporary files on program exit or upon an
+interrupt.
+
+@c 2e: For the temp file handling, go with Darrel's ig=${TMP:-/tmp}/igs.$$
+@c 2e: or something as similar as possible.
+
+The next part loops through all the command line arguments.
+There are several cases of interest.
+
+@table @code
+@item --
+This ends the arguments to @code{igawk}.  Anything else should be passed on
+to the user's @code{awk} program without being evaluated.
+
+@item -W
+This indicates that the next option is specific to @code{gawk}.  To make
+argument processing easier, the @samp{-W} is prepended to the
+remaining arguments and the loop continues.  (This is an @code{sh}
+programming trick.  Don't worry about it if you are not familiar with
+@code{sh}.)
+
+@item -v
+@itemx -F
+These are saved and passed on to @code{gawk}.
+
+@item -f
+@itemx --file
+@itemx --file=
+@itemx -Wfile=
+The file name is saved to the temporary file @file{/tmp/ig.s.$$} with an
+@samp{@@include} statement.
+The @code{sed} utility is used to remove the leading option part of the
+argument (e.g., @samp{--file=}).
+
+@item --source
+@itemx --source=
+@itemx -Wsource=
+The source text is echoed into @file{/tmp/ig.s.$$}.
+
+@iftex
+@page
+@end iftex
+@item --version
+@itemx -Wversion
+@code{igawk} prints its version number, and runs @samp{gawk --version}
+to get the @code{gawk} version information, and then exits.
+@end table
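+
+The @samp{-W} handling is easier to follow with a concrete argument
+list.  In this sketch the arguments and the shell function name
+@code{w_demo} are made up; only the @code{shift} and @code{set} lines
+mirror what @code{igawk} does.
```shell
# Sketch: glue a lone -W onto the argument that follows it.
w_demo() {
    set -- -W file=lib.awk data.txt    # pretend these are the arguments
    shift                              # drop the lone -W
    set -- -W"$@"                      # prefix -W to the next argument
    printf '%s\n' "$@"
}
w_demo
```
+After this step the first argument is @samp{-Wfile=lib.awk}, which one
+of the other cases in the loop can then match.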
+
+If none of @samp{-f}, @samp{--file}, @samp{-Wfile}, @samp{--source},
+or @samp{-Wsource} was supplied, then the first non-option argument
+should be the @code{awk} program.  If there are no command line
+arguments left, @code{igawk} prints an error message and exits.
+Otherwise, the first argument is echoed into @file{/tmp/ig.s.$$}.
+
+In any case, after the arguments have been processed,
+@file{/tmp/ig.s.$$} contains the complete text of the original @code{awk}
+program.
+
+The @samp{$$} in @code{sh} represents the current process ID number.
+It is often used in shell programs to generate unique temporary file
+names.  This allows multiple users to run @code{igawk} without worrying
+that the temporary file names will clash.
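+
+As a small illustration of the idiom, the following sketch builds a
+temporary file name from @samp{$$} and removes the file when the
+(sub)shell exits.  The file name, the use of @code{TMPDIR}, and the
+shell function name @code{tmp_demo} are assumptions, not what
+@code{igawk} itself uses.
```shell
# Sketch: a per-process temporary file that is cleaned up on exit.
tmp_demo() {
    (
        tmp="${TMPDIR:-/tmp}/tmp_demo.s.$$"
        trap 'rm -f "$tmp"' 0          # clean up when this subshell exits
        echo 'BEGIN { print "hi" }' > "$tmp"
        awk -f "$tmp"                  # use the temporary file
        echo "$tmp"                    # report the name that was used
    )
}
tmp_demo
```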
+
+@cindex @code{sed} utility
+Here's the program:
+
+@findex igawk.sh
+@example
+@c @group
+@c file eg/prog/igawk.sh
+#! /bin/sh
+
+# igawk --- like gawk but do @@include processing
+# Arnold Robbins, arnold@@gnu.ai.mit.edu, Public Domain
+# July 1993
+
+if [ "$1" = debug ]
+then
+    set -x
+    shift
+else
+    # cleanup on exit, hangup, interrupt, quit, termination
+    trap 'rm -f /tmp/ig.[se].$$' 0 1 2 3 15
+fi
+
+while [ $# -ne 0 ] # loop over arguments
+do
+    case $1 in
+    --)     shift; break;;
+
+    -W)     shift
+            set -- -W"$@@"
+            continue;;
+
+    -[vF])  opts="$opts $1 '$2'"
+            shift;;
+
+    -[vF]*) opts="$opts '$1'" ;;
+
+    -f)     echo @@include "$2" >> /tmp/ig.s.$$
+            shift;;
+
+    -f*)    f=`echo "$1" | sed 's/-f//'`
+            echo @@include "$f" >> /tmp/ig.s.$$ ;;
+
+    -?file=*)    # -Wfile or --file
+            f=`echo "$1" | sed 's/-.file=//'`
+            echo @@include "$f" >> /tmp/ig.s.$$ ;;
+
+    -?file)    # get arg, $2
+            echo @@include "$2" >> /tmp/ig.s.$$
+            shift;;
+
+    -?source=*)    # -Wsource or --source
+            t=`echo "$1" | sed 's/-.source=//'`
+            echo "$t" >> /tmp/ig.s.$$ ;;
+
+    -?source)  # get arg, $2
+            echo "$2" >> /tmp/ig.s.$$
+            shift;;
+
+    -?version)
+            echo igawk: version 1.0 1>&2
+            gawk --version
+            exit 0 ;;
+
+    -[W-]*)    opts="$opts '$1'" ;;
+
+    *)      break;;
+    esac
+    shift
+done
+
+if [ ! -s /tmp/ig.s.$$ ]
+then
+    if [ -z "$1" ]
+    then
+         echo igawk: no program! 1>&2
+         exit 1
+    else
+        echo "$1" > /tmp/ig.s.$$
+        shift
+    fi
+fi
+
+# at this point, /tmp/ig.s.$$ has the program
+@c endfile
+@c @end group
+@end example
+
+The @code{awk} program to process @samp{@@include} directives reads through
+the program, one line at a time using @code{getline}
+(@pxref{Getline, ,Explicit Input with @code{getline}}).
+The input file names and @samp{@@include} statements are managed using a
+stack.  As each @samp{@@include} is encountered, the current file name is
+``pushed'' onto the stack, and the file named in the @samp{@@include}
+directive becomes
+the current file name.  As each file is finished, the stack is ``popped,''
+and the previous input file becomes the current input file again.
+The process is started by making the original file the first one on the
+stack.
+
+The @code{pathto} function does the work of finding the full path to a
+file.  It simulates @code{gawk}'s behavior when searching the @code{AWKPATH}
+environment variable
+(@pxref{AWKPATH Variable, ,The @code{AWKPATH} Environment Variable}).
+If a file name has a @samp{/} in it, no path search
+is done. Otherwise, the file name is concatenated with the name of each
+directory in the path, and an attempt is made to open the generated file
+name.  The only way in @code{awk} to test if a file can be read is to go
+ahead and try to read it with @code{getline}; that is what @code{pathto}
+does.  If the file can be read, it is closed, and the file name is
+returned.
+@ignore
+An alternative way to test for the file's existence would be to call
+@samp{system("test -r " t)}, which uses the @code{test} utility to
+see if the file exists and is readable.  The disadvantage to this method
+is that it requires creating an extra process, and can thus be slightly
+slower.
+@end ignore
+
+@example
+@c @group
+@c file eg/prog/igawk.sh
+gawk -- '
+# process @@include directives
+
+function pathto(file,    i, t, junk)
+@{
+    if (index(file, "/") != 0)
+        return file
+
+    for (i = 1; i <= ndirs; i++) @{
+        t = (pathlist[i] "/" file)
+        if ((getline junk < t) > 0) @{
+            # found it
+            close(t)
+            return t
+        @}
+    @}
+    return ""
+@}
+@c endfile
+@c @end group
+@end example
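+
+The readability test that @code{pathto} relies on can be seen in a
+stand-alone sketch.  The file names and the shell function name
+@code{readable_demo} are made up.  Note that, with this technique, a
+file must contain at least one line to count as readable.
```shell
# Sketch: use getline to find out whether a file can be read.
readable_demo() {
    f="${TMPDIR:-/tmp}/readable_demo.$$"
    echo "some text" > "$f"
    awk -v good="$f" -v bad="$f.missing" 'BEGIN {
        if ((getline junk < good) > 0) {   # got a line: file is readable
            close(good)
            print "readable"
        }
        if ((getline junk < bad) < 0)      # negative: could not open it
            print "missing"
    }'
    rm -f "$f"
}
readable_demo
```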
+
+The main program is contained inside one @code{BEGIN} rule.  The first thing it
+does is set up the @code{pathlist} array that @code{pathto} uses.  After
+splitting the path on @samp{:}, null elements are replaced with @code{"."},
+which represents the current directory.
+
+@example
+@c @group
+@c file eg/prog/igawk.sh
+BEGIN @{
+    path = ENVIRON["AWKPATH"]
+    ndirs = split(path, pathlist, ":")
+    for (i = 1; i <= ndirs; i++) @{
+        if (pathlist[i] == "")
+            pathlist[i] = "."
+    @}
+@c endfile
+@c @end group
+@end example
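+
+For instance, with a made-up @code{AWKPATH} value that contains an
+empty element, the loop behaves as sketched here.  The path value and
+the shell function name @code{pathlist_demo} are assumptions for
+illustration.
```shell
# Sketch: split a search path and replace null elements with ".".
pathlist_demo() {
    AWKPATH="/usr/local/share/awk::/tmp" awk 'BEGIN {
        ndirs = split(ENVIRON["AWKPATH"], pathlist, ":")
        for (i = 1; i <= ndirs; i++) {
            if (pathlist[i] == "")
                pathlist[i] = "."     # null element: current directory
            print i, pathlist[i]
        }
    }'
}
pathlist_demo
```
+The empty element between the two colons is reported as @samp{.}.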
+
+The stack is initialized with @code{ARGV[1]}, which will be @file{/tmp/ig.s.$$}.
+The main loop comes next.  Input lines are read in succession. Lines that
+do not start with @samp{@@include} are printed verbatim.
+
+If the line does start with @samp{@@include}, the file name is in @code{$2}.
+@code{pathto} is called to generate the full path.  If it cannot, then we
+print an error message and continue.
+
+The next thing to check is if the file has been included already.  The
+@code{processed} array is indexed by the full file name of each included
+file, and it tracks this information for us.  If the file has been
+seen, a warning message is printed. Otherwise, the new file name is
+pushed onto the stack and processing continues.
+
+Finally, when @code{getline} encounters the end of the input file, the file
+is closed and the stack is popped.  When @code{stackptr} is less than zero,
+the program is done.
+
+@example
+@c @group
+@c file eg/prog/igawk.sh
+    stackptr = 0
+    input[stackptr] = ARGV[1] # ARGV[1] is first file
+
+    for (; stackptr >= 0; stackptr--) @{
+        while ((getline < input[stackptr]) > 0) @{
+            if (tolower($1) != "@@include") @{
+                print
+                continue
+            @}
+            fpath = pathto($2)
+            if (fpath == "") @{
+                printf("igawk:%s:%d: cannot find %s\n", \
+                    input[stackptr], FNR, $2) > "/dev/stderr"
+                continue
+            @}
+@group
+            if (! (fpath in processed)) @{
+                processed[fpath] = input[stackptr]
+                input[++stackptr] = fpath
+            @} else
+                print $2, "included in", input[stackptr], \
+                    "already included in", \
+                    processed[fpath] > "/dev/stderr"
+        @}
+@end group
+@group
+        close(input[stackptr])
+    @}
+@}' /tmp/ig.s.$$ > /tmp/ig.e.$$
+@end group
+@c endfile
+@c @end group
+@end example
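+
+To watch the stack at work, here is a reduced sketch of the same loop.
+It leaves out the path search, so include lines must name files by
+their full path, and it assumes the temporary directory name contains
+no blanks.  The file names, the @code{twice} function, and the shell
+function name @code{include_demo} are all made up for illustration.
```shell
# Sketch: expand @include lines with a stack of input files, then run
# the expanded program.
include_demo() {
    dir="${TMPDIR:-/tmp}/include_demo.$$"
    mkdir -p "$dir" || return 1
    printf '%s\n' 'function twice(x) { return 2 * x }' > "$dir/lib.awk"
    printf '%s\n' "@include $dir/lib.awk" \
                  "@include $dir/lib.awk" \
                  'BEGIN { print twice(21) }' > "$dir/main.awk"

    awk 'BEGIN {
        stackptr = 0
        input[stackptr] = ARGV[1]
        for (; stackptr >= 0; stackptr--) {
            while ((getline < input[stackptr]) > 0) {
                if (tolower($1) != "@include") {
                    print
                    continue
                }
                if (! ($2 in processed)) {
                    processed[$2] = input[stackptr]
                    input[++stackptr] = $2    # push: read this file next
                } else
                    print $2, "already included" > "/dev/stderr"
            }
            close(input[stackptr])            # file finished: pop it
        }
    }' "$dir/main.awk" > "$dir/expanded.awk"

    awk -f "$dir/expanded.awk"
    rm -rf "$dir"
}
include_demo
```
+The second include of the same file only produces a warning on standard
+error, so the expanded program defines @code{twice} once and prints
+@samp{42}.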
+
+The last step is to call @code{gawk} with the expanded program and the original
+options and command line arguments that the user supplied.  @code{gawk}'s
+exit status is passed back on to @code{igawk}'s calling program.
+
+@c this causes more problems than it solves, so leave it out.
+@ignore
+The special file @file{/dev/null} is passed as a data file to @code{gawk}
+to handle an interesting case. Suppose that the user's program only has
+a @code{BEGIN} rule, and there are no data files to read. The program should exit without reading any data
+files.  However, suppose that an included library file defines an @code{END}
+rule of its own. In this case, @code{gawk} will hang, reading standard
+input. In order to avoid this, @file{/dev/null} is explicitly added to the
+command line. Reading from @file{/dev/null} always returns an immediate
+end of file indication.
+
+@c Hmm. Add /dev/null if $# is 0?  Still messes up ARGV. Sigh.
+@end ignore
+
+@example
+@c @group
+@c file eg/prog/igawk.sh
+eval gawk -f /tmp/ig.e.$$ $opts -- "$@@"
+
+exit $?
+@c endfile
+@c @end group
+@end example
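+
+The @code{eval} is needed because the saved options in @code{opts}
+carry their own single quotes.  The following sketch shows the effect
+with one made-up @samp{-v} option; it calls @code{awk} rather than
+@code{gawk}, and the variable value and the shell function name
+@code{eval_demo} are assumptions for illustration.
```shell
# Sketch: options saved with embedded quotes must be re-parsed by eval.
eval_demo() {
    opts=""
    opts="$opts -v 'greeting=hello world'"    # saved the way igawk saves -v
    # eval makes the shell parse the text again, so the embedded quotes
    # group "greeting=hello world" into a single argument.
    eval awk $opts "'BEGIN { print greeting }'"
}
eval_demo
```
+Without the @code{eval}, the quote characters would be passed to
+@code{awk} literally and the value would be split at the blank.  One
+consequence of this design is that an argument which itself contains a
+single quote will not survive the second round of parsing.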
+
+This version of @code{igawk} represents my third attempt at this program.
+There are three key simplifications that made the program work better.
+
+@enumerate
+@item
+Using @samp{@@include} even for the files named with @samp{-f} makes building
+the initial collected @code{awk} program much simpler; all the
+@samp{@@include} processing can be done once.
+
+@item
+The @code{pathto} function doesn't try to save the line read with
+@code{getline} when testing for the file's accessibility.  Trying to save
+this line for use with the main program complicates things considerably.
+@c what problem does this engender though - exercise
+@c answer, reading from "-" or /dev/stdin
+
+@item
+Using a @code{getline} loop in the @code{BEGIN} rule does it all in one
+place.  It is not necessary to call out to a separate loop for processing
+nested @samp{@@include} statements.
+@end enumerate
+
+Also, this program illustrates that it is often worthwhile to combine
+@code{sh} and @code{awk} programming.  You can usually accomplish
+quite a lot, without having to resort to low-level programming in C or C++, and it
+is frequently easier to do certain kinds of string and argument manipulation
+using the shell than it is in @code{awk}.
+
+Finally, @code{igawk} shows that it is not always necessary to add new
+features to a program; they can often be layered on top.  With @code{igawk},
+there is no real reason to build @samp{@@include} processing into
+@code{gawk} itself.
+
+As an additional example of this, consider the idea of having two
+files in a directory in the search path.
+
+@table @file
+@item default.awk
+This file would contain a set of default library functions, such
+as @code{getopt} and @code{assert}.
+
+@item site.awk
+This file would contain library functions that are specific to a site or
+installation, i.e.@: locally developed functions.
+Having a separate file allows @file{default.awk} to change with
+new @code{gawk} releases, without requiring the system administrator to
+update it each time by adding the local functions.
+@end table
+
+One user
+@c Karl Berry, karl@ileaf.com, 10/95
+suggested that @code{gawk} be modified to automatically read these files
+upon startup.  Instead, it would be very simple to modify @code{igawk}
+to do this. Since @code{igawk} can process nested @samp{@@include}
+directives, @file{default.awk} could simply contain @samp{@@include}
+statements for the desired library functions.
+
+@c Exercise: make this change
+
+@node Language History, Gawk Summary, Sample Programs, Top
+@chapter The Evolution of the @code{awk} Language
+
+This @value{DOCUMENT} describes the GNU implementation of @code{awk}, which follows
+the POSIX specification.  Many @code{awk} users are only familiar
+with the original @code{awk} implementation in Version 7 Unix.
+(This implementation was the basis for @code{awk} in Berkeley Unix,
+through 4.3--Reno.  The 4.4 release of Berkeley Unix uses @code{gawk} 2.15.2
+for its version of @code{awk}.) This chapter briefly describes the
+evolution of the @code{awk} language, with cross references to other parts
+of the @value{DOCUMENT} where you can find more information.
+
+@menu
+* V7/SVR3.1::                   The major changes between V7 and System V
+                                Release 3.1.
+* SVR4::                        Minor changes between System V Releases 3.1
+                                and 4.
+* POSIX::                       New features from the POSIX standard.
+* BTL::                         New features from the AT&T Bell Laboratories
+                                version of @code{awk}.
+* POSIX/GNU::                   The extensions in @code{gawk} not in POSIX
+                                @code{awk}.
+@end menu
+
+@node V7/SVR3.1, SVR4, Language History, Language History
+@section Major Changes between V7 and SVR3.1
+
+The @code{awk} language evolved considerably between the release of
+Version 7 Unix (1978) and the new version first made generally available in
+System V Release 3.1 (1987).  This section summarizes the changes, with
+cross-references to further details.
+
+@itemize @bullet
+@item
+The requirement for @samp{;} to separate rules on a line
+(@pxref{Statements/Lines, ,@code{awk} Statements Versus Lines}).
+
+@item
+User-defined functions, and the @code{return} statement
+(@pxref{User-defined, ,User-defined Functions}).
+
+@item
+The @code{delete} statement (@pxref{Delete, ,The @code{delete} Statement}).
+
+@item
+The @code{do}-@code{while} statement
+(@pxref{Do Statement, ,The @code{do}-@code{while} Statement}).
+
+@item
+The built-in functions @code{atan2}, @code{cos}, @code{sin}, @code{rand} and
+@code{srand} (@pxref{Numeric Functions, ,Numeric Built-in Functions}).
+
+@item
+The built-in functions @code{gsub}, @code{sub}, and @code{match}
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+
+@item
+The built-in functions @code{close} and @code{system}
+(@pxref{I/O Functions, ,Built-in Functions for Input/Output}).
+
+@item
+The @code{ARGC}, @code{ARGV}, @code{FNR}, @code{RLENGTH}, @code{RSTART},
+and @code{SUBSEP} built-in variables (@pxref{Built-in Variables}).
+
+@item
+The conditional expression using the ternary operator @samp{?:}
+(@pxref{Conditional Exp, ,Conditional Expressions}).
+
+@item
+The exponentiation operator @samp{^}
+(@pxref{Arithmetic Ops, ,Arithmetic Operators}) and its assignment operator
+form @samp{^=} (@pxref{Assignment Ops, ,Assignment Expressions}).
+
+@item
+C-compatible operator precedence, which breaks some old @code{awk}
+programs (@pxref{Precedence, ,Operator Precedence (How Operators Nest)}).
+
+@item
+Regexps as the value of @code{FS}
+(@pxref{Field Separators, ,Specifying How Fields are Separated}), and as the
+third argument to the @code{split} function
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+
+@item
+Dynamic regexps as operands of the @samp{~} and @samp{!~} operators
+(@pxref{Regexp Usage, ,How to Use Regular Expressions}).
+
+@item
+The escape sequences @samp{\b}, @samp{\f}, and @samp{\r}
+(@pxref{Escape Sequences}).
+(Some vendors have updated their old versions of @code{awk} to
+recognize @samp{\r}, @samp{\b}, and @samp{\f}, but this is not
+something you can rely on.)
+
+@item
+Redirection of input for the @code{getline} function
+(@pxref{Getline, ,Explicit Input with @code{getline}}).
+
+@item
+Multiple @code{BEGIN} and @code{END} rules
+(@pxref{BEGIN/END, ,The @code{BEGIN} and @code{END} Special Patterns}).
+
+@item
+Multi-dimensional arrays
+(@pxref{Multi-dimensional, ,Multi-dimensional Arrays}).
+@end itemize
+
+@node SVR4, POSIX, V7/SVR3.1, Language History
+@section Changes between SVR3.1 and SVR4
+
+@cindex @code{awk} language, V.4 version
+The System V Release 4 version of Unix @code{awk} added these features
+(some of which originated in @code{gawk}):
+
+@itemize @bullet
+@item
+The @code{ENVIRON} variable (@pxref{Built-in Variables}).
+
+@item
+Multiple @samp{-f} options on the command line
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @samp{-v} option for assigning variables before program execution begins
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @samp{--} option for terminating command line options.
+
+@item
+The @samp{\a}, @samp{\v}, and @samp{\x} escape sequences
+(@pxref{Escape Sequences}).
+
+@item
+A defined return value for the @code{srand} built-in function
+(@pxref{Numeric Functions, ,Numeric Built-in Functions}).
+
+@item
+The @code{toupper} and @code{tolower} built-in string functions
+for case translation
+(@pxref{String Functions, ,Built-in Functions for String Manipulation}).
+
+@item
+A cleaner specification for the @samp{%c} format-control letter in the
+@code{printf} function
+(@pxref{Control Letters, ,Format-Control Letters}).
+
+@item
+The ability to dynamically pass the field width and precision (@code{"%*.*d"})
+in the argument list of the @code{printf} function
+(@pxref{Control Letters, ,Format-Control Letters}).
+
+@item
+The use of regexp constants such as @code{/foo/} as expressions, where
+they are equivalent to using the matching operator, as in @samp{$0 ~ /foo/}
+(@pxref{Using Constant Regexps, ,Using Regular Expression Constants}).
+@end itemize
+
+@node POSIX, BTL, SVR4, Language History
+@section Changes between SVR4 and POSIX @code{awk}
+
+The POSIX Command Language and Utilities standard for @code{awk}
+introduced the following changes into the language:
+
+@itemize @bullet
+@item
+The use of @samp{-W} for implementation-specific options.
+
+@item
+The use of @code{CONVFMT} for controlling the conversion of numbers
+to strings (@pxref{Conversion, ,Conversion of Strings and Numbers}).
+
+@item
+The concept of a numeric string, and tighter comparison rules to go
+with it (@pxref{Typing and Comparison, ,Variable Typing and Comparison Expressions}).
+
+@item
+More complete documentation of many of the previously undocumented
+features of the language.
+@end itemize
+
+The following common extensions are not permitted by the POSIX
+standard:
+
+@c IMPORTANT! Keep this list in sync with the one in node Options
+
+@itemize @bullet
+@item
+@code{\x} escape sequences are not recognized
+(@pxref{Escape Sequences}).
+
+@item
+The synonym @code{func} for the keyword @code{function} is not
+recognized (@pxref{Definition Syntax, ,Function Definition Syntax}).
+
+@item
+The operators @samp{**} and @samp{**=} cannot be used in
+place of @samp{^} and @samp{^=} (@pxref{Arithmetic Ops, ,Arithmetic Operators},
+and also @pxref{Assignment Ops, ,Assignment Expressions}).
+
+@item
+Specifying @samp{-Ft} on the command line does not set the value
+of @code{FS} to be a single tab character
+(@pxref{Field Separators, ,Specifying How Fields are Separated}).
+
+@item
+The @code{fflush} built-in function is not supported
+(@pxref{I/O Functions, , Built-in Functions for Input/Output}).
+@end itemize
+
+@node BTL, POSIX/GNU, POSIX, Language History
+@section Extensions in the AT&T Bell Laboratories @code{awk}
+
+@cindex Kernighan, Brian
+Brian Kernighan, one of the original designers of Unix @code{awk},
+has made his version available via anonymous @code{ftp}
+(@pxref{Other Versions, ,Other Freely Available @code{awk} Implementations}).
+This section describes extensions in his version of @code{awk} that are
+not in POSIX @code{awk}.
+
+@itemize @bullet
+@item
+The @samp{-mf=@var{NNN}} and @samp{-mr=@var{NNN}} command line options
+to set the maximum number of fields, and the maximum
+record size, respectively
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @code{fflush} built-in function for flushing buffered output
+(@pxref{I/O Functions, ,Built-in Functions for Input/Output}).
+
+@ignore
+@item
+The @code{SYMTAB} array, that allows access to the internal symbol
+table of @code{awk}. This feature is not documented, largely because
+it is somewhat shakily implemented. For instance, you cannot access arrays
+or array elements through it.
+@end ignore
+@end itemize
+
+@node POSIX/GNU, , BTL, Language History
+@section Extensions in @code{gawk} Not in POSIX @code{awk}
+
+@cindex compatibility mode
+The GNU implementation, @code{gawk}, adds a number of features.
+This section lists them in the order they were added to @code{gawk}.
+They can all be disabled with either the @samp{--traditional} or
+@samp{--posix} options
+(@pxref{Options, ,Command Line Options}).
+
+Version 2.10 of @code{gawk} introduced these features:
+
+@itemize @bullet
+@item
+The @code{AWKPATH} environment variable for specifying a path search for
+the @samp{-f} command line option
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @code{IGNORECASE} variable and its effects
+(@pxref{Case-sensitivity, ,Case-sensitivity in Matching}).
+
+@item
+The @file{/dev/stdin}, @file{/dev/stdout}, @file{/dev/stderr}, and
+@file{/dev/fd/@var{n}} file name interpretation
+(@pxref{Special Files, ,Special File Names in @code{gawk}}).
+@end itemize
+
+Version 2.13 of @code{gawk} introduced these features:
+
+@itemize @bullet
+@item
+The @code{FIELDWIDTHS} variable and its effects
+(@pxref{Constant Size, ,Reading Fixed-width Data}).
+
+@item
+The @code{systime} and @code{strftime} built-in functions for obtaining
+and printing time stamps
+(@pxref{Time Functions, ,Functions for Dealing with Time Stamps}).
+
+@item
+The @samp{-W lint} option to provide source code and run time error
+and portability checking
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @samp{-W compat} option to turn off these extensions
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @samp{-W posix} option for full POSIX compliance
+(@pxref{Options, ,Command Line Options}).
+@end itemize
+
+Version 2.14 of @code{gawk} introduced these features:
+
+@itemize @bullet
+@item
+The @code{next file} statement for skipping to the next data file
+(@pxref{Nextfile Statement, ,The @code{nextfile} Statement}).
+@end itemize
+
+Version 2.15 of @code{gawk} introduced these features:
+
+@itemize @bullet
+@item
+The @code{ARGIND} variable, which tracks the movement of @code{FILENAME}
+through @code{ARGV}  (@pxref{Built-in Variables}).
+
+@item
+The @code{ERRNO} variable, which contains the system error message when
+@code{getline} returns @minus{}1, or when @code{close} fails
+(@pxref{Built-in Variables}).
+
+@item
+The ability to use GNU-style long named options that start with @samp{--}
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @samp{--source} option for mixing command line and library
+file source code
+(@pxref{Options, ,Command Line Options}).
+
+@item
+The @file{/dev/pid}, @file{/dev/ppid}, @file{/dev/pgrpid}, and
+@file{/dev/user} file name interpretation
+(@pxref{Special Files, ,Special File Names in @code{gawk}}).
+@end itemize
+
+Version 3.0 of @code{gawk} introduced these features:
+
+@itemize @bullet
+@item
+The @code{next file} statement became @code{nextfile}
+(@pxref{Nextfile Statement, ,The @code{nextfile} Statement}).
+
+@item
+The @samp{--lint-old} option to
+warn about constructs that are not available in
+the original Version 7 Unix version of @code{awk}
+(@pxref{V7/SVR3.1, , Major Changes between V7 and SVR3.1}).
+
+@item
+The @samp{--traditional} option was added as a better name for
+@samp{--compat} (@pxref{Options, ,Command Line Options}).
+
+@item
+The ability for @code{FS} to be a null string, and for the third
+argument to @code{split} to be the null string
+(@pxref{Single Character Fields, , Making Each Character a Separate Field}).
+
+@item
+The ability for @code{RS} to be a regexp
+(@pxref{Records, , How Input is Split into Records}).
+
+@item
+The @code{RT} variable
+(@pxref{Records, , How Input is Split into Records}).
+
+@item
+The @code{gensub} function for more powerful text manipulation
+(@pxref{String Functions, , Built-in Functions for String Manipulation}).
+
+@item
+The @code{strftime} function acquired a default time format,
+allowing it to be called with no arguments
+(@pxref{Time Functions,  , Functions for Dealing with Time Stamps}).
+
+@item
+Full support for both POSIX and GNU regexps
+(@pxref{Regexp, , Regular Expressions}).
+
+@item
+The @samp{--re-interval} option to provide interval expressions in regexps
+(@pxref{Regexp Operators, , Regular Expression Operators}).
+
+@item
+@code{IGNORECASE} changed, now applying to string comparison as well
+as regexp operations
+(@pxref{Case-sensitivity, ,Case-sensitivity in Matching}).
+
+@item
+The @samp{-m} option and the @code{fflush} function from the
+Bell Labs research version of @code{awk}
+(@pxref{Options, ,Command Line Options}; also
+@pxref{I/O Functions, ,Built-in Functions for Input/Output}).
+
+@item
+The use of GNU Autoconf to control the configuration process
+(@pxref{Quick Installation, , Compiling @code{gawk} for Unix}).
+
+@item
+Amiga support
+(@pxref{Amiga Installation, ,Installing @code{gawk} on an Amiga}).
+
+@c XXX ADD MORE STUFF HERE
+
+@end itemize
+
+@node Gawk Summary, Installation, Language History, Top
+@appendix @code{gawk} Summary
+
+This appendix provides a brief summary of the @code{gawk} command line and the
+@code{awk} language.  It is designed to serve as a ``quick reference.''  It is
+therefore terse, but complete.
+
+@menu
+* Command Line Summary::        Recapitulation of the command line.
+* Language Summary::            A terse review of the language.
+* Variables/Fields::            Variables, fields, and arrays.
+* Rules Summary::               Patterns and Actions, and their component
+                                parts.
+* Actions Summary::             Quick overview of actions.
+* Functions Summary::           Defining and calling functions.
+* Historical Features::         Some undocumented but supported ``features''.
+@end menu
+
+@node Command Line Summary, Language Summary, Gawk Summary, Gawk Summary
+@appendixsec Command Line Options Summary
+
+The command line consists of options to @code{gawk} itself, the
+@code{awk} program text (if not supplied via the @samp{-f} option), and
+values to be made available in the @code{ARGC} and @code{ARGV}
+predefined @code{awk} variables:
+
+@example
+gawk @r{[@var{POSIX or GNU style options}]} -f @var{source-file} @r{[@code{--}]} @var{file} @dots{}
+gawk @r{[@var{POSIX or GNU style options}]} @r{[@code{--}]} '@var{program}' @var{file} @dots{}
+@end example
+
+The options that @code{gawk} accepts are:
+
+@table @code
+@item -F @var{fs}
+@itemx --field-separator @var{fs}
+Use @var{fs} for the input field separator (the value of the @code{FS}
+predefined variable).
+
+@item -f @var{program-file}
+@itemx --file @var{program-file}
+Read the @code{awk} program source from the file @var{program-file}, instead
+of from the first command line argument.
+
+@item -mf=@var{NNN}
+@itemx -mr=@var{NNN}
+The @samp{f} flag sets
+the maximum number of fields, and the @samp{r} flag sets the maximum
+record size.  These options are ignored by @code{gawk}, since @code{gawk}
+has no predefined limits; they are only for compatibility with the
+Bell Labs research version of Unix @code{awk}.
+
+@item -v @var{var}=@var{val}
+@itemx --assign @var{var}=@var{val}
+Assign the variable @var{var} the value @var{val} before program execution
+begins.
+
+@item -W traditional
+@itemx -W compat
+@itemx --traditional
+@itemx --compat
+Use compatibility mode, in which @code{gawk} extensions are turned
+off.
+
+@item -W copyleft
+@itemx -W copyright
+@itemx --copyleft
+@itemx --copyright
+Print the short version of the General Public License on the error
+output.  This option may disappear in a future version of @code{gawk}.
+
+@item -W help
+@itemx -W usage
+@itemx --help
+@itemx --usage
+Print a relatively short summary of the available options on the error output.
+
+@item -W lint
+@itemx --lint
+Give warnings about dubious or non-portable @code{awk} constructs.
+
+@item -W lint-old
+@itemx --lint-old
+Warn about constructs that are not available in
+the original Version 7 Unix version of @code{awk}.
+
+@item -W posix
+@itemx --posix
+Use POSIX compatibility mode, in which @code{gawk} extensions
+are turned off and additional restrictions apply.
+
+@item -W re-interval
+@itemx --re-interval
+Allow interval expressions in regexps
+(@pxref{Regexp Operators, , Regular Expression Operators}).
+
+@item -W source=@var{program-text}
+@itemx --source @var{program-text}
+Use @var{program-text} as @code{awk} program source code.  This option allows
+mixing command line source code with source code from files, and is
+particularly useful for mixing command line programs with library functions.
+
+@item -W version
+@itemx --version
+Print version information for this particular copy of @code{gawk} on the error
+output.
+
+@item --
+Signal the end of options.  This is useful to allow further arguments to the
+@code{awk} program itself to start with a @samp{-}.  This is mainly for
+consistency with POSIX argument parsing conventions.
+@end table
+
+Any other options are flagged as invalid, but are otherwise ignored.
+@xref{Options, ,Command Line Options}, for more details.
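+
+As an illustration only, the following invocation combines several of
+these options.  The variable name @code{col} is an arbitrary choice, and
+@file{/etc/passwd} is assumed to exist and to use @samp{:} as its
+separator, as it does on most Unix systems:
+
+@example
+gawk -F: -v col=1 '@{ print $col @}' /etc/passwd
+@end example
+
+@noindent
+Here @samp{-F:} sets @code{FS} to a colon, and @samp{-v col=1} assigns
+@code{col} before the program starts, so the program prints the first
+field of each record.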
+
+@node Language Summary, Variables/Fields, Command Line Summary, Gawk Summary
+@appendixsec Language Summary
+
+An @code{awk} program consists of a sequence of zero or more pattern-action
+statements and optional function definitions.  One or the other of the
+pattern and action may be omitted.
+
+@example
+@var{pattern}    @{ @var{action statements} @}
+@var{pattern}
+          @{ @var{action statements} @}
+
+function @var{name}(@var{parameter list})     @{ @var{action statements} @}
+@end example
+
+@code{gawk} first reads the program source from the
+@var{program-file}(s), if specified, or from the first non-option
+argument on the command line.  The @samp{-f} option may be used multiple
+times on the command line.  @code{gawk} reads the program text from all
+the @var{program-file} files, effectively concatenating them in the
+order they are specified.  This is useful for building libraries of
+@code{awk} functions, without having to include them in each new
+@code{awk} program that uses them.  To use a library function in a file
+from a program typed in on the command line, specify
+@samp{--source '@var{program}'}, and type your program in between the single
+quotes.
+@xref{Options, ,Command Line Options}.
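+
+For example, assuming a hypothetical library file @file{mylib.awk} that
+defines a function named @code{double}, a program typed on the command
+line could call that function like this:
+
+@example
+gawk -f mylib.awk --source 'BEGIN @{ print double(21) @}'
+@end example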
+
+The environment variable @code{AWKPATH} specifies a search path to use
+when finding source files named with the @samp{-f} option.  The default
+path, which is
+@samp{.:/usr/local/share/awk}@footnote{The path may use a directory
+other than @file{/usr/local/share/awk}, depending upon how @code{gawk}
+was built and installed.} is used if @code{AWKPATH} is not set.
+If a file name given to the @samp{-f} option contains a @samp{/} character,
+no path search is performed.
+@xref{AWKPATH Variable, ,The @code{AWKPATH} Environment Variable}.
+
+@code{gawk} compiles the program into an internal form, and then proceeds to
+read each file named in the @code{ARGV} array.
+The initial values of @code{ARGV} come from the command line arguments.
+If there are no files named
+on the command line, @code{gawk} reads the standard input.
+
+If a ``file'' named on the command line has the form
+@samp{@var{var}=@var{val}}, it is treated as a variable assignment: the
+variable @var{var} is assigned the value @var{val}.
+If any of the files have a value that is the null string, that
+element in the list is skipped.
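+
+For instance, in this sketch (the file names are placeholders), the
+variable @code{n} is 1 while @file{file1} is read and 2 while
+@file{file2} is read, because each assignment takes effect when
+@code{gawk} reaches it in the argument list:
+
+@example
+gawk '@{ print n, $0 @}' n=1 file1 n=2 file2
+@end example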
+
+For each record in the input, @code{gawk} tests to see if it matches any
+@var{pattern} in the @code{awk} program.  For each pattern that the record
+matches, the associated @var{action} is executed.
+
+@node Variables/Fields, Rules Summary, Language Summary, Gawk Summary
+@appendixsec Variables and Fields
+
+@code{awk} variables are not declared; they come into existence when they are
+first used.  Their values are either floating-point numbers or strings.
+@code{awk} also has one-dimensional arrays; multi-dimensional arrays
+may be simulated.  There are several predefined variables that
+@code{awk} sets as a program runs; these are summarized below.
+
+@menu
+* Fields Summary::              Input field splitting.
+* Built-in Summary::            @code{awk}'s built-in variables.
+* Arrays Summary::              Using arrays.
+* Data Type Summary::           Values in @code{awk} are numbers or strings.
+@end menu
+
+@node Fields Summary, Built-in Summary, Variables/Fields, Variables/Fields
+@appendixsubsec Fields
+
+As each input line is read, @code{gawk} splits the line into
+@var{fields}, using the value of the @code{FS} variable as the field
+separator.  If @code{FS} is a single character, fields are separated by
+that character.  Otherwise, @code{FS} is expected to be a full regular
+expression.  In the special case that @code{FS} is a single space,
+fields are separated by runs of spaces and/or tabs.
+If @code{FS} is the null string (@code{""}), then each individual
+character in the record becomes a separate field.
+Note that the value
+of @code{IGNORECASE} (@pxref{Case-sensitivity, ,Case-sensitivity in Matching})
+also affects how fields are split when @code{FS} is a regular expression.
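+
+The following commands sketch three of these cases: the default
+@code{FS}, a single-character @code{FS}, and a null @code{FS}.  The
+input strings are arbitrary:
+
+@example
+$ echo ' a   b ' | gawk '@{ print NF @}'
+@print{} 2
+$ echo 'a:b c' | gawk -F: '@{ print $2 @}'
+@print{} b c
+$ echo abc | gawk 'BEGIN @{ FS = "" @} @{ print NF @}'
+@print{} 3
+@end example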
+
+Each field in the input line may be referenced by its position, @code{$1},
+@code{$2}, and so on.  @code{$0} is the whole line.  The value of a field may
+be assigned to as well.  Field numbers need not be constants:
+
+@example
+n = 5
+print $n
+@end example
+
+@noindent
+prints the fifth field in the input line.  The variable @code{NF} is set to
+the total number of fields in the input line.
+
+References to non-existent fields (i.e.@: fields after @code{$NF}) return
+the null string.  However, assigning to a non-existent field (e.g.,
+@code{$(NF+2) = 5}) increases the value of @code{NF}, creates any
+intervening fields with the null string as their value, and causes the
+value of @code{$0} to be recomputed, with the fields being separated by
+the value of @code{OFS}.
+@xref{Reading Files, ,Reading Input Files}.
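+
+The following sketch shows the effect of assigning to a field past
+@code{$NF}.  The value of @code{OFS} is set to @samp{-} only to make the
+empty intervening field visible:
+
+@example
+$ echo a b c | gawk 'BEGIN @{ OFS = "-" @} @{ $(NF+2) = 5; print; print NF @}'
+@print{} a-b-c--5
+@print{} 5
+@end example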
+
+@node Built-in Summary, Arrays Summary, Fields Summary, Variables/Fields
+@appendixsubsec Built-in Variables
+
+@code{gawk}'s built-in variables are:
+
+@table @code
+@item ARGC
+The number of elements in @code{ARGV}. See below for what is actually
+included in @code{ARGV}.
+
+@item ARGIND
+The index in @code{ARGV} of the current file being processed.
+When @code{gawk} is processing the input data files,
+it is always true that @samp{FILENAME == ARGV[ARGIND]}.
+
+@item ARGV
+The array of command line arguments.  The array is indexed from zero to
+@code{ARGC} @minus{} 1.  Dynamically changing @code{ARGC} and
+the contents of @code{ARGV}
+can control the files used for data.  A null-valued element in
+@code{ARGV} is ignored. @code{ARGV} does not include the options to
+@code{awk} or the text of the @code{awk} program itself.
+
+@item CONVFMT
+The conversion format to use when converting numbers to strings.
+
+@item FIELDWIDTHS
+A space-separated list of numbers describing the fixed-width input data.
+
+@item ENVIRON
+An array of environment variable values.  The array
+is indexed by variable name, each element being the value of that
+variable.  Thus, the environment variable @code{HOME} is
+@code{ENVIRON["HOME"]}.  One possible value might be @file{/home/arnold}.
+
+Changing this array does not affect the environment seen by programs
+which @code{gawk} spawns via redirection or the @code{system} function.
+(This may change in a future version of @code{gawk}.)
+
+Some operating systems do not have environment variables.
+The @code{ENVIRON} array is empty when running on these systems.
+
+@item ERRNO
+The system error message when an error occurs using @code{getline}
+or @code{close}.
+
+@item FILENAME
+The name of the current input file.  If no files are specified on the command
+line, the value of @code{FILENAME} is the null string.
+
+@item FNR
+The input record number in the current input file.
+
+@item FS
+The input field separator, a space by default.
+
+@item IGNORECASE
+The case-sensitivity flag for string comparisons and regular expression
+operations.  If @code{IGNORECASE} has a non-zero value, then pattern
+matching in rules, record separating with @code{RS}, field splitting
+with @code{FS}, regular expression matching with @samp{~} and
+@samp{!~}, and the @code{gensub}, @code{gsub}, @code{index},
+@code{match}, @code{split} and @code{sub} built-in functions all
+ignore case when doing regular expression operations, and all string
+comparisons are done ignoring case.
+
+@item NF
+The number of fields in the current input record.
+
+@item NR
+The total number of input records seen so far.
+
+@item OFMT
+The output format for numbers for the @code{print} statement,
+@code{"%.6g"} by default.
+
+@item OFS
+The output field separator, a space by default.
+
+@item ORS
+The output record separator, by default a newline.
+
+@item RS
+The input record separator, by default a newline.
+If @code{RS} is set to the null string, then records are separated by
+blank lines.  When @code{RS} is set to the null string, then the newline
+character always acts as a field separator, in addition to whatever value
+@code{FS} may have.  If @code{RS} is set to a multi-character
+string, it denotes a regexp; input text matching the regexp
+separates records.
+
+@item RT
+The input text that matched the text denoted by @code{RS},
+the record separator.
+
+@item RSTART
+The index of the first character matched by the most recent call to
+@code{match}; zero if no match.
+
+@item RLENGTH
+The length of the string last matched by @code{match}; @minus{}1 if no match.
+
+@item SUBSEP
+The string used to separate multiple subscripts in array elements, by
+default @code{"\034"}.
+@end table
+
+@xref{Built-in Variables}, for more information.
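+
+As a small illustration of @code{ARGC} and @code{ARGV}, this program
+prints every element of @code{ARGV}.  The arguments @samp{a}, @samp{b},
+and @samp{c} are arbitrary, and the sketch assumes the program is
+invoked under the name @code{gawk}.  Note that neither the options nor
+the program text appear in the array:
+
+@example
+$ gawk 'BEGIN @{ for (i = 0; i < ARGC; i++) print i, ARGV[i] @}' a b c
+@print{} 0 gawk
+@print{} 1 a
+@print{} 2 b
+@print{} 3 c
+@end example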
+
+@node Arrays Summary, Data Type Summary, Built-in Summary, Variables/Fields
+@appendixsubsec Arrays
+
+Arrays are subscripted with an expression between square brackets
+(@samp{[} and @samp{]}).  Array subscripts are @emph{always} strings;
+numbers are converted to strings as necessary, following the standard
+conversion rules
+(@pxref{Conversion, ,Conversion of Strings and Numbers}).
+
+If you use multiple expressions separated by commas inside the square
+brackets, then the array subscript is a string consisting of the
+concatenation of the individual subscript values, converted to strings,
+separated by the subscript separator (the value of @code{SUBSEP}).
+
+The special operator @code{in} may be used in a conditional context
+to see if an array has an index consisting of a particular value.
+
+@example
+if (val in array)
+        print array[val]
+@end example
+
+If the array has multiple subscripts, use @samp{(i, j, @dots{}) in @var{array}}
+to test for existence of an element.
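+
+A brief sketch of multiple subscripts, using an arbitrary array name
+@code{grid}; both tests succeed because they refer to the same element:
+
+@example
+BEGIN @{
+    grid[1, 2] = "x"              # stored under the index 1 SUBSEP 2
+    if ((1, 2) in grid)
+        print "found"
+    if ((1 SUBSEP 2) in grid)     # the same element, spelled out
+        print "same element"
+@}
+@end example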
+
+The @code{in} construct may also be used in a @code{for} loop to iterate
+over all the elements of an array.
+@xref{Scanning an Array, ,Scanning All Elements of an Array}.
+
+You can remove an element from an array using the @code{delete} statement.
+
+You can clear an entire array using @samp{delete @var{array}}.
+
+@xref{Arrays, ,Arrays in @code{awk}}.
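+
+The following sketch combines scanning and deletion; the array name
+@code{count} and its contents are arbitrary:
+
+@example
+BEGIN @{
+    count["a"] = 1; count["b"] = 2
+    delete count["a"]             # remove one element
+    for (k in count)              # only "b" remains
+        print k, count[k]
+    delete count                  # clear the whole array
+@}
+@end example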
+
+@node Data Type Summary,  , Arrays Summary, Variables/Fields
+@appendixsubsec Data Types
+
+The value of an @code{awk} expression is always either a number
+or a string.
+
+Some contexts (such as arithmetic operators) require numeric
+values.  They convert strings to numbers by interpreting the text
+of the string as a number.  If the string does not look like a
+number, it converts to zero.
+
+Other contexts (such as concatenation) require string values.
+They convert numbers to strings by effectively printing them
+with @code{sprintf}.
+@xref{Conversion, ,Conversion of Strings and Numbers}, for the details.
+
+To force conversion of a string value to a number, simply add zero
+to it.  If the value you start with is already a number, this
+does not change it.
+
+To force conversion of a numeric value to a string, concatenate it with
+the null string.
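+
+For example (the variable names are arbitrary):
+
+@example
+BEGIN @{
+    s = "3.0"
+    n = s + 0         # n is the number 3
+    t = n ""          # t is the string "3"
+    print n, t
+@}
+@end example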
+
+Comparisons are done numerically if both operands are numeric, or if
+one is numeric and the other is a numeric string.  Otherwise one or
+both operands are converted to strings and a string comparison is
+performed.  Fields, @code{getline} input, @code{FILENAME}, @code{ARGV}
+elements, @code{ENVIRON} elements and the elements of an array created
+by @code{split} are the only items that can be numeric strings.  String
+constants, such as @code{"3.1415927"}, are not numeric strings; they are
+string constants.  The full rules for comparisons are described in
+@ref{Typing and Comparison, ,Variable Typing and Comparison Expressions}.
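+
+In this sketch, the fields @code{$1} and @code{$2} hold numeric strings
+and are therefore compared as numbers, while the two string constants
+are compared as strings.  The variable names @code{a} and @code{b} are
+arbitrary:
+
+@example
+$ echo 10 9 | gawk '@{ a = ($1 < $2); b = ("10" < "9"); print a, b @}'
+@print{} 0 1
+@end example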
+
+Uninitialized variables have the string value @code{""} (the null, or
+empty, string).  In contexts where a number is required, this is
+equivalent to zero.
+
+@xref{Variables}, for more information on variable naming and initialization;
+@pxref{Conversion, ,Conversion of Strings and Numbers}, for more information
+on how variable values are interpreted.
+
+@node Rules Summary, Actions Summary, Variables/Fields, Gawk Summary
+@appendixsec Patterns
+
+@menu
+* Pattern Summary::             Quick overview of patterns.
+* Regexp Summary::              Quick overview of regular expressions.
+@end menu
+
+An @code{awk} program is mostly composed of rules, each consisting of a
+pattern followed by an action.  The action is enclosed in @samp{@{} and
+@samp{@}}.  Either the pattern may be missing, or the action may be
+missing, but not both.  If the pattern is missing, the
+action is executed for every input record.  A missing action is
+equivalent to @samp{@w{@{ print @}}}, which prints the entire line.
+
+@c These paragraphs repeated for both patterns and actions. I don't
+@c like this, but I also don't see any way around it. Update both copies
+@c if they need fixing.
+Comments begin with the @samp{#} character, and continue until the end of the
+line.  Blank lines may be used to separate statements.  Statements normally
+end with a newline; however, this is not the case for lines ending in a
+@samp{,}, @samp{@{}, @samp{?}, @samp{:}, @samp{&&}, or @samp{||}.  Lines
+ending in @code{do} or @code{else} also have their statements automatically
+continued on the following line.  In other cases, a line can be continued by
+ending it with a @samp{\}, in which case the newline is ignored.
+
+Multiple statements may be put on one line by separating each one with
+a @samp{;}.
+This applies to both the statements within the action part of a rule (the
+usual case), and to the rule statements.
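+
+A short sketch of these layout rules follows; the patterns and variable
+names are arbitrary:
+
+@example
+# Two rules on one line, separated by a semicolon.
+/foo/ @{ nfoo++ @} ; /bar/ @{ nbar++ @}
+
+# A line ending in && is continued automatically.
+$1 > 10 &&
+$2 < 5  @{ print "both"; n++ @}
+
+END @{
+    # A backslash continues a line that would otherwise end here.
+    print nfoo + 0, nbar + 0, \
+          n + 0
+@}
+@end example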
+
+@xref{Comments, ,Comments in @code{awk} Programs}, for information on
+@code{awk}'s commenting convention;
+@pxref{Statements/Lines, ,@code{awk} Statements Versus Lines}, for a
+description of the line continuation mechanism in @code{awk}.
+
+@node Pattern Summary, Regexp Summary, Rules Summary, Rules Summary
+@appendixsubsec Pattern Summary
+
+@code{awk} patterns may be one of the following:
+
+@example
+/@var{regular expression}/
+@var{relational expression}
+@var{pattern} && @var{pattern}
+@var{pattern} || @var{pattern}
+@var{pattern} ? @var{pattern} : @var{pattern}
+(@var{pattern})
+! @var{pattern}
+@var{pattern1}, @var{pattern2}
+BEGIN
+END
+@end example
+
+@code{BEGIN} and @code{END} are two special kinds of patterns that are not
+tested against the input.  The action parts of all @code{BEGIN} rules are
+concatenated as if all the statements had been written in a single @code{BEGIN}
+rule.  They are executed before any of the input is read.  Similarly, all the
+@code{END} rules are concatenated, and executed when all the input is exhausted (or
+when an @code{exit} statement is executed).  @code{BEGIN} and @code{END}
+patterns cannot be combined with other patterns in pattern expressions.
+@code{BEGIN} and @code{END} rules cannot have missing action parts.
+
+For @code{/@var{regular-expression}/} patterns, the associated statement is
+executed for each input record that matches the regular expression.  Regular
+expressions are summarized below.
+
+A @var{relational expression} may use any of the operators defined below in
+the section on actions.  These generally test whether certain fields match
+certain regular expressions.
+
+The @samp{&&}, @samp{||}, and @samp{!} operators are logical ``and,''
+logical ``or,'' and logical ``not,'' respectively, as in C.  They do
+short-circuit evaluation, also as in C, and are used for combining more
+primitive pattern expressions.  As in most languages, parentheses may be
+used to change the order of evaluation.
+
+The @samp{?:} operator is like the same operator in C.  If the first
+pattern matches, then the second pattern is matched against the input
+record; otherwise, the third is matched.  Only one of the second and
+third patterns is matched.
+
+The @samp{@var{pattern1}, @var{pattern2}} form of a pattern is called a
+range pattern.  It matches all input lines starting with a line that
+matches @var{pattern1}, and continuing until a line that matches
+@var{pattern2}, inclusive.  A range pattern cannot be used as an operand
+of any of the pattern operators.
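+
+As a minimal sketch of a range pattern and of @code{BEGIN} and @code{END}
+rules, the following shell commands feed a made-up five-line input to
+@code{awk}.  The @samp{START} and @samp{END} marker lines and the shell
+variable names are assumptions chosen purely for illustration.
+
```shell
# Hypothetical input: the START/END marker lines are invented for this sketch.
# A range pattern with no action prints every record in the range, inclusive.
range_out=$(printf 'a\nSTART\nb\nEND\nc\n' | awk '/START/, /END/')
printf '%s\n' "$range_out"

# BEGIN runs before any input is read, END after the input is exhausted.
count_out=$(printf 'a\nSTART\nb\nEND\nc\n' |
    awk 'BEGIN { n = 0 } /START/, /END/ { n++ } END { print n }')
printf '%s\n' "$count_out"
```
+
+The first command should print the three lines from @samp{START} through
+@samp{END}; the second counts the records that fall inside the range.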
+
+@xref{Pattern Overview, ,Pattern Elements}.
+
+@node Regexp Summary, , Pattern Summary, Rules Summary
+@appendixsubsec Regular Expressions
+
+Regular expressions are based on POSIX EREs (extended regular expressions).
+The escape sequences allowed in string constants are also valid in
+regular expressions (@pxref{Escape Sequences}).
+Regexps are composed of characters as follows:
+
+@table @code
+@item @var{c}
+matches the character @var{c} (assuming @var{c} is none of the characters
+listed below).
+
+@item \@var{c}
+matches the literal character @var{c}.
+
+@item .
+matches any character, @emph{including} newline.
+In strict POSIX mode, @samp{.} does not match the @sc{nul}
+character, which is a character with all bits equal to zero.
+
+@item ^
+matches the beginning of a string.
+
+@item $
+matches the end of a string.
+
+@item [@var{abc}@dots{}]
+matches any of the characters @var{abc}@dots{} (character list).
+
+@item [[:@var{class}:]]
+matches any character in the character class @var{class}. Allowable classes
+are @code{alnum}, @code{alpha}, @code{blank}, @code{cntrl},
+@code{digit}, @code{graph}, @code{lower}, @code{print}, @code{punct},
+@code{space}, @code{upper}, and @code{xdigit}.
+
+@item [[.@var{symbol}.]]
+matches the multi-character collating symbol @var{symbol}.
+@code{gawk} does not currently support collating symbols.
+
+@item [[=@var{chars}=]]
+matches any of the equivalent characters in @var{chars}.
+@code{gawk} does not currently support equivalence classes.
+
+@item [^@var{abc}@dots{}]
+matches any character except @var{abc}@dots{} and newline (negated
+character list).
+
+@item @var{r1}|@var{r2}
+matches either @var{r1} or @var{r2} (alternation).
+
+@item @var{r1r2}
+matches @var{r1}, and then @var{r2} (concatenation).
+
+@item @var{r}+
+matches one or more @var{r}'s.
+
+@item @var{r}*
+matches zero or more @var{r}'s. 
+
+@item @var{r}?
+matches zero or one @var{r}'s. 
+
+@item (@var{r})
+matches @var{r} (grouping).
+
+@item @var{r}@{@var{n}@}
+@itemx @var{r}@{@var{n},@}
+@itemx @var{r}@{@var{n},@var{m}@}
+matches exactly @var{n}, at least @var{n}, or from @var{n} to @var{m}
+occurrences of @var{r}, respectively (interval expressions).
+
+@item \y
+matches the empty string at either the beginning or the
+end of a word.
+
+@item \B
+matches the empty string within a word.
+
+@item \<
+matches the empty string at the beginning of a word.
+
+@item \>
+matches the empty string at the end of a word.
+
+@item \w
+matches any word-constituent character (alphanumeric characters and
+the underscore).
+
+@item \W
+matches any character that is not word-constituent.
+
+@item \`
+matches the empty string at the beginning of a buffer (same as a string
+in @code{gawk}).
+
+@item \'
+matches the empty string at the end of a buffer.
+@end table
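+
+The following sketch combines anchoring, grouping, and alternation from
+the table above.  The four input words and the shell variable name are
+invented for the example; only operators common to POSIX @code{awk}
+implementations are used.
+
```shell
# Hypothetical word list; only whole-line matches of "cat" or "dog" are printed.
# ^ and $ anchor the match, ( ) groups, and | separates the alternatives.
re_out=$(printf 'cat\ncatalog\ndog\nbird\n' | awk '/^(cat|dog)$/')
printf '%s\n' "$re_out"
```
+
+Without the anchors, @samp{catalog} would match as well, because a regexp
+matches anywhere within the record.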
+
+The various command line options
+control how @code{gawk} interprets characters in regexps.
+
+@c NOTE!!! Keep this in sync with the same table in the regexp chapter!
+@table @asis
+@item No options
+In the default case, @code{gawk} provides all the facilities of
+POSIX regexps and the GNU regexp operators described above.
+However, interval expressions are not supported.
+
+@item @code{--posix}
+Only POSIX regexps are supported; the GNU operators are not special
+(e.g., @samp{\w} matches a literal @samp{w}).  Interval expressions
+are allowed.
+
+@item @code{--traditional}
+Traditional Unix @code{awk} regexps are matched. The GNU operators
+are not special, interval expressions are not available, and neither
+are the POSIX character classes (@code{[[:alnum:]]} and so on).
+Characters described by octal and hexadecimal escape sequences are
+treated literally, even if they represent regexp metacharacters.
+
+@item @code{--re-interval}
+Allow interval expressions in regexps, even if @samp{--traditional}
+has been provided.
+@end table
+
+@xref{Regexp, ,Regular Expressions}.
+
+@node Actions Summary, Functions Summary, Rules Summary, Gawk Summary
+@appendixsec Actions
+
+Action statements are enclosed in braces, @samp{@{} and @samp{@}}.
+A missing action statement is equivalent to @samp{@w{@{ print @}}}.
+
+Action statements consist of the usual assignment, conditional, and looping
+statements found in most languages.  The operators, control statements,
+and Input/Output statements available are similar to those in C.
+
+@c These paragraphs repeated for both patterns and actions. I don't
+@c like this, but I also don't see any way around it. Update both copies
+@c if they need fixing.
+Comments begin with the @samp{#} character, and continue until the end of the
+line.  Blank lines may be used to separate statements.  Statements normally
+end with a newline; however, this is not the case for lines ending in a
+@samp{,}, @samp{@{}, @samp{?}, @samp{:}, @samp{&&}, or @samp{||}.  Lines
+ending in @code{do} or @code{else} also have their statements automatically
+continued on the following line.  In other cases, a line can be continued by
+ending it with a @samp{\}, in which case the newline is ignored.
+
+Multiple statements may be put on one line by separating each one with
+a @samp{;}.
+This applies to both the statements within the action part of a rule (the
+usual case), and to the rule statements.
+
+@xref{Comments, ,Comments in @code{awk} Programs}, for information on
+@code{awk}'s commenting convention;
+@pxref{Statements/Lines, ,@code{awk} Statements Versus Lines}, for a
+description of the line continuation mechanism in @code{awk}.
+
+@menu
+* Operator Summary::            @code{awk} operators.
+* Control Flow Summary::        The control statements.
+* I/O Summary::                 The I/O statements.
+* Printf Summary::              A summary of @code{printf}.
+* Special File Summary::        Special file names interpreted internally.
+* Built-in Functions Summary::  Built-in numeric and string functions.
+* Time Functions Summary::      Built-in time functions.
+* String Constants Summary::    Escape sequences in strings.
+@end menu
+
+@node Operator Summary, Control Flow Summary, Actions Summary, Actions Summary
+@appendixsubsec Operators
+
+The operators in @code{awk}, in order of decreasing precedence, are:
+
+@table @code
+@item (@dots{})
+Grouping.
+
+@item $
+Field reference.
+
+@item ++ --
+Increment and decrement, both prefix and postfix.
+
+@item ^
+Exponentiation (@samp{**} may also be used, and @samp{**=} for the assignment
+operator, but they are not specified in the POSIX standard).
+
+@item + - !
+Unary plus, unary minus, and logical negation.
+
+@item * / %
+Multiplication, division, and modulus.
+
+@item + -
+Addition and subtraction.
+
+@item @var{space}
+String concatenation.
+
+@item < <= > >= != ==
+The usual relational operators.
+
+@item ~ !~
+Regular expression match, negated match.
+
+@item in
+Array membership.
+
+@item &&
+Logical ``and''.
+
+@item ||
+Logical ``or''.
+
+@item ?:
+A conditional expression.  This has the form @samp{@var{expr1} ?
+@var{expr2} : @var{expr3}}.  If @var{expr1} is true, the value of the
+expression is @var{expr2}; otherwise it is @var{expr3}.  Only one of
+@var{expr2} and @var{expr3} is evaluated.
+
+@item = += -= *= /= %= ^=
+Assignment.  Both absolute assignment (@code{@var{var}=@var{value}})
+and operator assignment (the other forms) are supported.
+@end table
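+
+A few one-line expressions make the precedence order above concrete.
+This is only a sketch; the shell variable name is an assumption, and the
+right-to-left grouping of @samp{^} is standard @code{awk} behavior rather
+than something stated in the table.
+
```shell
prec_out=$(awk 'BEGIN {
    print 2 ^ 3 ^ 2      # exponentiation groups right to left: 2 ^ 9
    print -2 ^ 2         # ^ binds tighter than unary minus: -(2 ^ 2)
    print 1 " " 2 + 3    # + binds tighter than concatenation: 1 " " 5
    print 2 + 3 * 4      # * binds tighter than +: 2 + 12
}')
printf '%s\n' "$prec_out"
```
+
+When in doubt, parentheses make the intended grouping explicit.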
+
+@xref{Expressions}.
+
+@node Control Flow Summary, I/O Summary, Operator Summary, Actions Summary
+@appendixsubsec Control Statements
+
+The control statements are as follows:
+
+@example
+if (@var{condition}) @var{statement} @r{[} else @var{statement} @r{]}
+while (@var{condition}) @var{statement}
+do @var{statement} while (@var{condition})
+for (@var{expr1}; @var{expr2}; @var{expr3}) @var{statement}
+for (@var{var} in @var{array}) @var{statement}
+break
+continue
+delete @var{array}[@var{index}]
+delete @var{array}
+exit @r{[} @var{expression} @r{]}
+@{ @var{statements} @}
+@end example
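+
+The short program below exercises several of these statements together.
+It is a sketch only; the array name @code{seen} and the shell variable
+name are hypothetical.
+
```shell
ctl_out=$(awk 'BEGIN {
    for (i = 1; i <= 5; i++) {
        if (i == 2) continue     # skip 2
        if (i == 5) break        # stop before storing 5
        seen[i] = 1              # stores indices 1, 3, and 4
    }
    delete seen[3]               # remove a single element
    n = 0
    for (k in seen) n++          # traversal order is unspecified, so just count
    print n
}')
printf '%s\n' "$ctl_out"
```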
+
+@xref{Statements, ,Control Statements in Actions}.
+
+@node I/O Summary, Printf Summary, Control Flow Summary, Actions Summary
+@appendixsubsec I/O Statements
+
+The Input/Output statements are as follows:
+
+@table @code
+@item getline
+Set @code{$0} from next input record; set @code{NF}, @code{NR}, @code{FNR}.
+@xref{Getline, ,Explicit Input with @code{getline}}.
+
+@item getline <@var{file}
+Set @code{$0} from next record of @var{file}; set @code{NF}.
+
+@item getline @var{var}
+Set @var{var} from next input record; set @code{NR}, @code{FNR}.
+
+@item getline @var{var} <@var{file}
+Set @var{var} from next record of @var{file}.
+
+@item @var{command} | getline
+Run @var{command}, piping its output into @code{getline}; sets @code{$0},
+@code{NF}, @code{NR}.
+
+@item @var{command} | getline @var{var}
+Run @var{command}, piping its output into @code{getline}; sets @var{var}.
+
+@item next
+Stop processing the current input record.  The next input record is read and
+processing starts over with the first pattern in the @code{awk} program.
+If the end of the input data is reached, the @code{END} rule(s), if any,
+are executed.
+@xref{Next Statement, ,The @code{next} Statement}.
+
+@item nextfile
+Stop processing the current input file.  The next input record read comes
+from the next input file.  @code{FILENAME} is updated, @code{FNR} is set to one,
+@code{ARGIND} is incremented, 
+and processing starts over with the first pattern in the @code{awk} program.
+If the end of the input data is reached, the @code{END} rule(s), if any,
+are executed.
+Earlier versions of @code{gawk} used @samp{next file}; this usage is still
+supported, but is considered to be deprecated.
+@xref{Nextfile Statement, ,The @code{nextfile} Statement}.
+
+@item print
+Prints the current record.
+@xref{Printing, ,Printing Output}.
+
+@item print @var{expr-list}
+Prints expressions.
+
+@item print @var{expr-list} > @var{file}
+Prints expressions to @var{file}. If @var{file} does not exist, it is
+created. If it does exist, its contents are deleted the first time the
+@code{print} is executed.
+
+@item print @var{expr-list} >> @var{file}
+Prints expressions to @var{file}.  The previous contents of @var{file}
+are retained, and the output of @code{print} is appended to the file.
+
+@item print @var{expr-list} | @var{command}
+Prints expressions, sending the output down a pipe to @var{command}.
+The pipeline to the command stays open until the @code{close} function
+is called.
+
+@item printf @var{fmt, expr-list}
+Format and print.
+
+@item printf @var{fmt, expr-list} > @var{file}
+Format and print to @var{file}. If @var{file} does not exist, it is
+created. If it does exist, its contents are deleted the first time the
+@code{printf} is executed.
+
+@item printf @var{fmt, expr-list} >> @var{file}
+Format and print to @var{file}.  The previous contents of @var{file}
+are retained, and the output of @code{printf} is appended to the file.
+
+@item printf @var{fmt, expr-list} | @var{command}
+Format and print, sending the output down a pipe to @var{command}.
+The pipeline to the command stays open until the @code{close} function
+is called.
+@end table
+
+@code{getline} returns one if it reads a record, zero on end of file,
+and @minus{}1 on an error.
+In the event of an error, @code{getline} will set @code{ERRNO} to
+the value of a system-dependent string that describes the error.
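+
+Because the return value may be negative, a read loop should test for a
+value greater than zero rather than for a merely non-zero value.  The
+following sketch assumes a temporary file created with @code{mktemp} and
+a path that is presumed not to exist; both are illustrative only.
+
```shell
tmp=$(mktemp)
printf 'one\ntwo\n' > "$tmp"

getline_out=$(awk -v file="$tmp" 'BEGIN {
    # Loop only while getline reports success (a value greater than zero).
    while ((getline line < file) > 0)
        n++
    close(file)
    print n
    # Reading from a file that cannot be opened yields -1, not 0.
    print (getline line < "/nonexistent/hypothetical/path")
}')
rm -f "$tmp"
printf '%s\n' "$getline_out"
```
+
+Writing the loop as @samp{while (getline line < file)} would spin forever
+on an unreadable file, since @minus{}1 counts as true.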
+
+@node Printf Summary, Special File Summary, I/O Summary, Actions Summary
+@appendixsubsec @code{printf} Summary
+
+Conversion specifications have the form
+@code{%}[@var{flag}][@var{width}][@code{.}@var{prec}]@var{format}.
+@c whew!
+Items in brackets are optional.
+
+The @code{awk} @code{printf} statement and @code{sprintf} function
+accept the following conversion specification formats:
+
+@table @code
+@item %c
+An ASCII character.  If the argument used for @samp{%c} is numeric, it is
+treated as a character and printed.  Otherwise, the argument is assumed to
+be a string, and only the first character of that string is printed.
+
+@item %d
+@itemx %i
+A decimal number (the integer part).
+
+@item %e
+@itemx %E
+A floating point number of the form
+@samp{@r{[}-@r{]}d.dddddde@r{[}+-@r{]}dd}.
+The @samp{%E} format uses @samp{E} instead of @samp{e}.
+
+@item %f
+A floating point number of the form
+@r{[}@code{-}@r{]}@code{ddd.dddddd}.
+
+@item %g
+@itemx %G
+Use either the @samp{%e} or @samp{%f} format, whichever produces a shorter
+string, with non-significant zeros suppressed.
+@samp{%G} will use @samp{%E} instead of @samp{%e}.
+
+@item %o
+An unsigned octal number (again, an integer).
+
+@item %s
+A character string.
+
+@item %x
+@itemx %X
+An unsigned hexadecimal number (an integer).
+The @samp{%X} format uses @samp{A} through @samp{F} instead of
+@samp{a} through @samp{f} for decimal 10 through 15.
+
+@item %%
+A single @samp{%} character; no argument is converted.
+@end table
+
+There are optional, additional parameters that may lie between the @samp{%}
+and the control letter:
+
+@table @code
+@item -
+The expression should be left-justified within its field.
+
+@item @var{space}
+For numeric conversions, prefix positive values with a space, and
+negative values with a minus sign.
+
+@item +
+The plus sign, used before the width modifier (see below),
+says to always supply a sign for numeric conversions, even if the data
+to be formatted is positive. The @samp{+} overrides the space modifier.
+
+@item #
+Use an ``alternate form'' for certain control letters.
+For @samp{o}, supply a leading zero.
+For @samp{x} and @samp{X}, supply a leading @samp{0x} or @samp{0X} for
+a non-zero result.
+For @samp{e}, @samp{E}, and @samp{f}, the result will always contain a
+decimal point.
+For @samp{g} and @samp{G}, trailing zeros are not removed from the result.
+
+@item 0
+A leading @samp{0} (zero) acts as a flag that indicates output should be
+padded with zeros instead of spaces.
+This applies even to non-numeric output formats.
+This flag only has an effect when the field width is wider than the
+value to be printed.
+
+@item @var{width}
+The field should be padded to this width. The field is normally padded
+with spaces.  If the @samp{0} flag has been used, it is padded with zeros.
+
+@item .@var{prec}
+A number that specifies the precision to use when printing.
+For the @samp{e}, @samp{E}, and @samp{f} formats, this specifies the
+number of digits you want printed to the right of the decimal point.
+For the @samp{g} and @samp{G} formats, it specifies the maximum number
+of significant digits.  For the @samp{d}, @samp{o}, @samp{i}, @samp{u},
+@samp{x}, and @samp{X} formats, it specifies the minimum number of
+digits to print.  For the @samp{s} format, it specifies the maximum number of
+characters from the string that should be printed.
+@end table
+
+Either or both of the @var{width} and @var{prec} values may be specified
+as @samp{*}.  In that case, the particular value is taken from the argument
+list.
+
+@xref{Printf, ,Using @code{printf} Statements for Fancier Printing}.
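+
+The flags, width, and precision can be seen working together in a single
+format string.  This is a sketch with arbitrary sample values; the shell
+variable name is an assumption.
+
```shell
fmt_out=$(awk 'BEGIN {
    # %5.2f  width 5, two digits after the decimal point
    # %-5d   left-justified in a field of width 5
    # %05d   zero-padded to width 5
    # %+d    always show a sign
    # %x     unsigned hexadecimal
    # %.3s   at most three characters of the string
    # %c     first character of a string argument
    printf "%5.2f|%-5d|%05d|%+d|%x|%.3s|%c\n", 3.14159, 42, 42, 7, 255, "abcdef", "xyz"
}')
printf '%s\n' "$fmt_out"
```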
+
+@node Special File Summary, Built-in Functions Summary, Printf Summary, Actions Summary
+@appendixsubsec Special File Names
+
+When doing I/O redirection from either @code{print} or @code{printf} into a
+file, or via @code{getline} from a file, @code{gawk} recognizes certain special
+file names internally.  These file names allow access to open file descriptors
+inherited from @code{gawk}'s parent process (usually the shell).  The
+file names are:
+
+@table @file
+@item /dev/stdin
+The standard input.
+
+@item /dev/stdout
+The standard output.
+
+@item /dev/stderr
+The standard error output.
+
+@item /dev/fd/@var{n}
+The file denoted by the open file descriptor @var{n}.
+@end table
+
+In addition, reading the following files provides process-related information
+about the running @code{gawk} program.  All returned records are terminated
+with a newline.
+
+@table @file
+@item /dev/pid
+Returns the process ID of the current process.
+
+@item  /dev/ppid
+Returns the parent process ID of the current process.
+
+@item  /dev/pgrpid
+Returns the process group ID of the current process.
+
+@item /dev/user
+At least four space-separated fields, containing the return values of
+the @code{getuid}, @code{geteuid}, @code{getgid}, and @code{getegid}
+system calls.
+If there are any additional fields, they are the group IDs returned by
+the @code{getgroups} system call.
+(Multiple groups may not be supported on all systems.)
+@end table
+
+@noindent
+These file names may also be used on the command line to name data files.
+These file names are only recognized internally if you do not
+actually have files with these names on your system.
+
+@xref{Special Files, ,Special File Names in @code{gawk}}, for a longer description that
+provides the motivation for this feature.
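+
+A common use of these names is sending diagnostics to the standard error
+output.  The sketch below discards standard error at the shell level, so
+only the line written to standard output is captured; the message texts
+and the shell variable name are invented for the example.
+
```shell
# Only "to stdout" is captured; the second line goes to standard error,
# which the shell redirects to /dev/null here.
err_out=$(awk 'BEGIN {
    print "to stdout"
    print "to stderr" > "/dev/stderr"
}' 2>/dev/null)
printf '%s\n' "$err_out"
```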
+
+@node Built-in Functions Summary, Time Functions Summary, Special File Summary, Actions Summary
+@appendixsubsec Built-in Functions
+
+@code{awk} provides a number of built-in functions for performing
+numeric operations, string related operations, and I/O related operations.
+
+The built-in arithmetic functions are:
+
+@table @code
+@item atan2(@var{y}, @var{x})
+the arctangent of @var{y/x} in radians.
+
+@item cos(@var{expr})
+the cosine of @var{expr}, which is in radians.
+
+@item exp(@var{expr})
+the exponential function (@code{e ^ @var{expr}}).
+
+@item int(@var{expr})
+truncates to integer.
+
+@item log(@var{expr})
+the natural logarithm of @var{expr}.
+
+@item rand()
+a random number between zero and one.
+
+@item sin(@var{expr})
+the sine of @var{expr}, which is in radians.
+
+@item sqrt(@var{expr})
+the square root function.
+
+@item srand(@r{[}@var{expr}@r{]})
+use @var{expr} as a new seed for the random number generator.  If no @var{expr}
+is provided, the time of day is used.  The return value is the previous
+seed for the random number generator.
+@end table
+
+@iftex
+@page
+@end iftex
+@code{awk} has the following built-in string functions:
+
+@table @code
+@item gensub(@var{regex}, @var{subst}, @var{how} @r{[}, @var{target}@r{]})
+If @var{how} is a string beginning with @samp{g} or @samp{G}, then
+replace each match of @var{regex} in @var{target} with @var{subst}.
+Otherwise, replace the @var{how}'th occurrence. If @var{target} is not
+supplied, use @code{$0}.  The return value is the changed string; the
+original @var{target} is not modified. Within @var{subst},
+@samp{\@var{n}}, where @var{n} is a digit from one to nine, can be used to
+indicate the text that matched the @var{n}'th parenthesized
+subexpression.
+
+@item gsub(@var{regex}, @var{subst} @r{[}, @var{target}@r{]})
+for each substring matching the regular expression @var{regex} in the string
+@var{target}, substitute the string @var{subst}, and return the number of
+substitutions. If @var{target} is not supplied, use @code{$0}.
+
+@item index(@var{str}, @var{search})
+returns the index of the string @var{search} in the string @var{str}, or
+zero if
+@var{search} is not present.
+
+@item length(@r{[}@var{str}@r{]})
+returns the length of the string @var{str}.  The length of @code{$0}
+is returned if no argument is supplied.
+
+@item match(@var{str}, @var{regex})
+returns the position in @var{str} where the regular expression @var{regex}
+occurs, or zero if @var{regex} is not present, and sets the values of
+@code{RSTART} and @code{RLENGTH}.
+
+@item split(@var{str}, @var{arr} @r{[}, @var{regex}@r{]})
+splits the string @var{str} into the array @var{arr} on the regular expression
+@var{regex}, and returns the number of elements.  If @var{regex} is omitted,
+@code{FS} is used instead. @var{regex} can be the null string, causing
+each character to be placed into its own array element.
+The array @var{arr} is cleared first.
+
+@item sprintf(@var{fmt}, @var{expr-list})
+prints @var{expr-list} according to @var{fmt}, and returns the resulting string.
+
+@item sub(@var{regex}, @var{subst} @r{[}, @var{target}@r{]})
+just like @code{gsub}, but only the first matching substring is replaced.
+
+@item substr(@var{str}, @var{index} @r{[}, @var{len}@r{]})
+returns the @var{len}-character substring of @var{str} starting at @var{index}.
+If @var{len} is omitted, the rest of @var{str} is used.
+
+@item tolower(@var{str})
+returns a copy of the string @var{str}, with all the upper-case characters in
+@var{str} translated to their corresponding lower-case counterparts.
+Non-alphabetic characters are left unchanged.
+
+@item toupper(@var{str})
+returns a copy of the string @var{str}, with all the lower-case characters in
+@var{str} translated to their corresponding upper-case counterparts.
+Non-alphabetic characters are left unchanged.
+@end table
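+
+The following sketch exercises several of the string functions on small
+sample strings.  It avoids @code{gensub}, which is specific to
+@code{gawk}, so that it should behave the same under other @code{awk}
+implementations; all variable names are hypothetical.
+
```shell
str_out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "0", s)            # replaces every "o"; returns the count
    print n, s

    print index(s, "w"), substr(s, 1, 5), toupper(s)

    pos = match("foobar", /o+/)      # also sets RSTART and RLENGTH
    print pos, RSTART, RLENGTH

    cnt = split("a:b:c", parts, ":") # parts[1] through parts[3]
    print cnt, parts[2]
}')
printf '%s\n' "$str_out"
```
+
+Note that @code{gsub} modifies its target in place and returns a count,
+whereas @code{substr} and @code{toupper} return new strings.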
+
+The I/O related functions are:
+
+@table @code
+@item close(@var{expr})
+Close the open file or pipe denoted by @var{expr}.
+
+@item fflush(@r{[}@var{expr}@r{]})
+Flush any buffered output for the output file or pipe denoted by @var{expr}.
+If @var{expr} is omitted, standard output is flushed.
+If @var{expr} is the null string (@code{""}), all output buffers are flushed.
+
+@item system(@var{cmd-line})
+Execute the command @var{cmd-line}, and return the exit status.
+If your operating system does not support @code{system}, calling it will
+generate a fatal error.
+
+@samp{system("")} can be used to force @code{awk} to flush any pending
+output.  This is more portable, but less obvious, than calling @code{fflush}.
+@end table
+
+@node Time Functions Summary, String Constants Summary, Built-in Functions Summary, Actions Summary
+@appendixsubsec Time Functions
+
+The following two functions are available for getting the current
+time of day, and for formatting time stamps.
+
+@table @code
+@item systime()
+returns the current time of day as the number of seconds since a particular
+epoch (Midnight, January 1, 1970 UTC, on POSIX systems).
+
+@item strftime(@r{[}@var{format}@r{[}, @var{timestamp}@r{]]})
+formats @var{timestamp} according to the specification in @var{format}.
+The current time of day is used if no @var{timestamp} is supplied.
+A default format equivalent to the output of the @code{date} utility is used if
+no @var{format} is supplied.
+@xref{Time Functions, ,Functions for Dealing with Time Stamps}, for the
+details on the conversion specifiers that @code{strftime} accepts.
+@end table
+
+@iftex
+@xref{Built-in, ,Built-in Functions}, for a description of all of
+@code{awk}'s built-in functions.
+@end iftex
+
+@node String Constants Summary,  , Time Functions Summary, Actions Summary
+@appendixsubsec String Constants
+
+String constants in @code{awk} are sequences of characters enclosed
+in double quotes (@code{"}).  Within strings, certain @dfn{escape sequences}
+are recognized, as in C.  These are:
+
+@table @code
+@item \\
+A literal backslash.
+
+@item \a
+The ``alert'' character; usually the ASCII BEL character.
+
+@item \b
+Backspace.
+
+@item \f
+Formfeed.
+
+@item \n
+Newline.
+
+@item \r
+Carriage return.
+
+@item \t
+Horizontal tab.
+
+@item \v
+Vertical tab.
+
+@item \x@var{hex digits}
+The character represented by the string of hexadecimal digits following
+the @samp{\x}.  As in ANSI C, all following hexadecimal digits are
+considered part of the escape sequence.  E.g., @code{"\x1B"} is a
+string containing the ASCII ESC (escape) character.  (The @samp{\x}
+escape sequence is not in POSIX @code{awk}.)
+
+@item \@var{ddd}
+The character represented by the one, two, or three digit sequence of octal
+digits.  Thus, @code{"\033"} is also a string containing the ASCII ESC
+(escape) character.
+
+@item \@var{c}
+The literal character @var{c}, if @var{c} is not one of the above.
+@end table
+
+The escape sequences may also be used inside constant regular expressions
+(e.g., the regexp @code{@w{/[@ \t\f\n\r\v]/}} matches whitespace
+characters).
+
+@xref{Escape Sequences}.
+
+@node Functions Summary, Historical Features, Actions Summary, Gawk Summary
+@appendixsec User-defined Functions
+
+Functions in @code{awk} are defined as follows:
+
+@example
+function @var{name}(@var{parameter list}) @{ @var{statements} @}
+@end example
+
+Actual parameters supplied in the function call are used to instantiate
+the formal parameters declared in the function.  Arrays are passed by
+reference, other variables are passed by value.
+
+If there are fewer arguments passed than there are names in @var{parameter list},
+the extra names are given the null string as their value.  Extra names have the
+effect of local variables.
+
+The open-parenthesis in a function call of a user-defined function must
+immediately follow the function name, without any intervening white space.
+This is to avoid a syntactic ambiguity with the concatenation operator.
+
+The word @code{func} may be used in place of @code{function} (but not in
+POSIX @code{awk}).
+
+Use the @code{return} statement to return a value from a function.
+
+@xref{User-defined, ,User-defined Functions}.
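+
+The points above about parameter passing and local variables can be
+sketched as follows.  The function names @code{sum}, @code{bump}, and
+@code{zap} are hypothetical, as is the convention of separating the
+``local'' parameters from the real ones with extra spaces.
+
```shell
fn_out=$(awk '
# "i" and "total" are extra parameter names; callers never pass them,
# so they start out null and act as local variables.
function sum(arr, n,    i, total) {
    for (i = 1; i <= n; i++)
        total += arr[i]
    return total
}

# Arrays are passed by reference, so this change is visible to the caller.
function bump(arr) { arr[1] = 100 }

# Scalars are passed by value, so the caller sees no change.
function zap(x) { x = 0 }

BEGIN {
    a[1] = 1; a[2] = 2; a[3] = 3
    v = 5
    bump(a); zap(v)
    print sum(a, 3), v
}')
printf '%s\n' "$fn_out"
```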
+
+@node Historical Features,  , Functions Summary, Gawk Summary
+@appendixsec Historical Features
+
+@cindex historical features
+There are two features of historical @code{awk} implementations that
+@code{gawk} supports.
+
+First, it is possible to call the @code{length} built-in function not only
+with no arguments, but even without parentheses!
+
+@example
+a = length
+@end example
+
+@noindent
+is the same as either of
+
+@example
+a = length()
+a = length($0)
+@end example
+
+@noindent
+For example:
+
+@example
+$ echo abcdef | awk '@{ print length @}'
+@print{} 6
+@end example
+
+@noindent
+This feature is marked as ``deprecated'' in the POSIX standard, and
+@code{gawk} will issue a warning about its use if @samp{--lint} is
+specified on the command line.
+(The ability to use @code{length} this way was actually an accident of the
+original Unix @code{awk} implementation.  If any built-in function used
+@code{$0} as its default argument, it was possible to call that function
+without the parentheses.  In particular, it was common practice to use
+the @code{length} function in this fashion, and this usage was documented
+in the @code{awk} manual page.)
+
+The other historical feature is the use of either the @code{break} statement,
+or the @code{continue} statement
+outside the body of a @code{while}, @code{for}, or @code{do} loop.  Traditional
+@code{awk} implementations have treated such usage as equivalent to the
+@code{next} statement.  More recent versions of Unix @code{awk} do not allow
+it. @code{gawk} supports this usage if @samp{--traditional} has been
+specified.
+
+@xref{Options, ,Command Line Options}, for more information about the
+@samp{--traditional} and @samp{--lint} options.
+
+@node Installation, Notes, Gawk Summary, Top
+@appendix Installing @code{gawk}
+
+This appendix provides instructions for installing @code{gawk} on the
+various platforms that are supported by the developers.  The primary
+developers support Unix (and one day, GNU), while the other ports were
+contributed.  The file @file{ACKNOWLEDGMENT} in the @code{gawk}
+distribution lists the electronic mail addresses of the people who did
+the respective ports, and they are also provided in
+@ref{Bugs, , Reporting Problems and Bugs}.
+
+@menu
+* Gawk Distribution::           What is in the @code{gawk} distribution.
+* Unix Installation::           Installing @code{gawk} under various versions
+                                of Unix.
+* VMS Installation::            Installing @code{gawk} on VMS.
+* PC Installation::             Installing and Compiling @code{gawk} on MS-DOS
+                                and OS/2
+* Atari Installation::          Installing @code{gawk} on the Atari ST.
+* Amiga Installation::          Installing @code{gawk} on an Amiga.
+* Bugs::                        Reporting Problems and Bugs.
+* Other Versions::              Other freely available @code{awk}
+                                implementations.
+@end menu
+
+@node Gawk Distribution, Unix Installation, Installation, Installation
+@appendixsec The @code{gawk} Distribution
+
+This section describes how to get the @code{gawk} distribution,
+how to extract it, and what is in the various files and
+subdirectories.
+
+@menu
+* Getting::                     How to get the distribution.
+* Extracting::                  How to extract the distribution.
+* Distribution contents::       What is in the distribution.
+@end menu
+
+@node Getting, Extracting, Gawk Distribution, Gawk Distribution
+@appendixsubsec Getting the @code{gawk} Distribution
+@cindex getting @code{gawk}
+@cindex anonymous @code{ftp}
+@cindex @code{ftp}, anonymous
+@cindex Free Software Foundation
+There are three ways you can get GNU software.
+
+@enumerate
+@item
+You can copy it from someone else who already has it.
+
+@cindex Free Software Foundation
+@item
+You can order @code{gawk} directly from the Free Software Foundation.
+Software distributions are available for Unix, MS-DOS, and VMS, on
+tape, CD-ROM, or floppies (MS-DOS only).  The address is:
+
+@quotation
+Free Software Foundation @*
+59 Temple Place---Suite 330 @*
+Boston, MA  02111-1307 USA @*
+Phone: +1-617-542-5942 @*
+Fax (including Japan): +1-617-542-2652 @*
+E-mail: @code{gnu@@prep.ai.mit.edu} @*
+@end quotation
+
+@noindent
+Ordering from the FSF directly contributes to the support of the foundation
+and to the production of more free software.
+
+@item
+You can get @code{gawk} by using anonymous @code{ftp} to the Internet host
+@code{ftp.gnu.ai.mit.edu}, in the directory @file{/pub/gnu}.
+
+Here is a list of alternate @code{ftp} sites from which you can obtain GNU
+software.  When a site is listed as ``@var{site}@code{:}@var{directory}'' the
+@var{directory} indicates the directory where GNU software is kept.
+You should use a site that is geographically close to you.
+
+@table @asis
+@item Asia:
+@table @code
+@item cair-archive.kaist.ac.kr:/pub/gnu
+@itemx ftp.cs.titech.ac.jp
+@itemx ftp.nectec.or.th:/pub/mirrors/gnu
+@itemx utsun.s.u-tokyo.ac.jp:/ftpsync/prep
+@end table
+
+@item Australia:
+@table @code
+@item archie.au:/gnu
+(@code{archie.oz} or @code{archie.oz.au} for ACSnet)
+@end table
+
+@item Africa:
+@table @code
+@item ftp.sun.ac.za:/pub/gnu
+@end table
+
+@item Middle East:
+@table @code
+@item ftp.technion.ac.il:/pub/unsupported/gnu
+@end table
+
+@item Europe:
+@table @code
+@item archive.eu.net
+@itemx ftp.denet.dk
+@itemx ftp.eunet.ch
+@itemx ftp.funet.fi:/pub/gnu
+@itemx ftp.ieunet.ie:pub/gnu
+@itemx ftp.informatik.rwth-aachen.de:/pub/gnu
+@itemx ftp.informatik.tu-muenchen.de
+@itemx ftp.luth.se:/pub/unix/gnu
+@itemx ftp.mcc.ac.uk
+@itemx ftp.stacken.kth.se
+@itemx ftp.sunet.se:/pub/gnu
+@itemx ftp.univ-lyon1.fr:pub/gnu
+@itemx ftp.win.tue.nl:/pub/gnu
+@itemx irisa.irisa.fr:/pub/gnu
+@itemx isy.liu.se
+@itemx nic.switch.ch:/mirror/gnu
+@itemx src.doc.ic.ac.uk:/gnu
+@itemx unix.hensa.ac.uk:/pub/uunet/systems/gnu
+@end table
+
+@item South America:
+@table @code
+@item ftp.inf.utfsm.cl:/pub/gnu
+@itemx ftp.unicamp.br:/pub/gnu
+@end table
+
+@item Western Canada:
+@table @code
+@item ftp.cs.ubc.ca:/mirror2/gnu
+@end table
+
+@item USA:
+@table @code
+@item col.hp.com:/mirrors/gnu
+@itemx f.ms.uky.edu:/pub3/gnu
+@itemx ftp.cc.gatech.edu:/pub/gnu
+@itemx ftp.cs.columbia.edu:/archives/gnu/prep
+@itemx ftp.digex.net:/pub/gnu
+@itemx ftp.hawaii.edu:/mirrors/gnu
+@itemx ftp.kpc.com:/pub/mirror/gnu
+@end table
+
+@iftex
+@page
+@end iftex
+@item USA (continued):
+@table @code
+@item ftp.uu.net:/systems/gnu
+@itemx gatekeeper.dec.com:/pub/GNU
+@itemx jaguar.utah.edu:/gnustuff
+@itemx labrea.stanford.edu
+@itemx mrcnext.cso.uiuc.edu:/pub/gnu
+@itemx vixen.cso.uiuc.edu:/gnu
+@itemx wuarchive.wustl.edu:/systems/gnu
+@end table
+@end table
+@end enumerate
+
+@node Extracting, Distribution contents, Getting, Gawk Distribution
+@appendixsubsec Extracting the Distribution
+@code{gawk} is distributed as a @code{tar} file compressed with the
+GNU Zip program, @code{gzip}.
+
+Once you have the distribution (for example,
+@file{gawk-@value{VERSION}.0.tar.gz}), first use @code{gzip} to expand the
+file, and then use @code{tar} to extract it.  You can use the following
+pipeline to unpack the @code{gawk} distribution:
+
+@example
+# Under System V, add 'o' to the tar flags
+gzip -d -c gawk-@value{VERSION}.0.tar.gz | tar -xvpf -
+@end example
+
+@noindent
+This will create a directory named @file{gawk-@value{VERSION}.0} in the current
+directory.
+
+The distribution file name is of the form
+@file{gawk-@var{V}.@var{R}.@var{n}.tar.gz}.
+The @var{V} represents the major version of @code{gawk},
+the @var{R} represents the current release of version @var{V}, and
+the @var{n} represents a @dfn{patch level}, meaning that minor bugs have
+been fixed in the release.  The current patch level is 0, but when
+retrieving distributions, you should get the one with the highest
+version, release, and patch level.  (Note that release levels greater than
+or equal to 90 denote ``beta,'' or non-production software; you may not wish
+to retrieve such a version unless you don't mind experimenting.)
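+
+As a rough illustration of this naming scheme, the following shell sketch
+splits a distribution file name into its three parts and flags release
+levels of 90 or more as beta.  The function name @code{parse_gawk_name}
+and the second file name are hypothetical; neither is part of the
+@code{gawk} distribution, and the sketch assumes a POSIX shell.
+
```shell
#!/bin/sh
# Hypothetical helper (not part of the gawk distribution): split a
# distribution file name of the form gawk-V.R.n.tar.gz into its parts.
parse_gawk_name() {
    name=${1##*/}          # drop any leading directory
    base=${name%.tar.gz}   # gawk-V.R.n
    vrn=${base#gawk-}      # V.R.n
    version=${vrn%%.*}     # V: everything before the first dot
    rest=${vrn#*.}         # R.n
    release=${rest%%.*}    # R
    patch=${rest#*.}       # n
    # Release levels of 90 or more denote beta software.
    if [ "$release" -ge 90 ]; then
        kind=beta
    else
        kind=production
    fi
    echo "version=$version release=$release patch=$patch kind=$kind"
}

parse_gawk_name gawk-3.0.0.tar.gz
parse_gawk_name /tmp/gawk-2.94.0.tar.gz   # made-up name, for illustration only
```
+
+Run as is, this should print
+@samp{version=3 release=0 patch=0 kind=production} for the first name and
+@samp{version=2 release=94 patch=0 kind=beta} for the second.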
+
+If you are not on a Unix system, you will need to make other arrangements
+for getting and extracting the @code{gawk} distribution.  You should consult
+a local expert.
+
+@node Distribution contents,  , Extracting, Gawk Distribution
+@appendixsubsec Contents of the @code{gawk} Distribution
+
+The @code{gawk} distribution has a number of C source files,
+documentation files,
+subdirectories and files related to the configuration process
+(@pxref{Unix Installation, ,Compiling and Installing @code{gawk} on Unix}),
+and several subdirectories related to different, non-Unix,
+operating systems.
+
+@table @asis
+@item various @samp{.c}, @samp{.y}, and @samp{.h} files 
+These files are the actual @code{gawk} source code.
+@end table
+
+@iftex
+@page
+@end iftex
+@table @file
+@item README
+@itemx README_d/README.*
+Descriptive files: @file{README} for @code{gawk} under Unix, and the
+rest for the various hardware and software combinations.
+
+@item INSTALL
+A file providing an overview of the configuration and installation process.
+
+@item PORTS
+A list of systems to which @code{gawk} has been ported, and which
+have successfully run the test suite.
+
+@item ACKNOWLEDGMENT
+A list of the people who contributed major parts of the code or documentation.
+
+@item ChangeLog
+A detailed list of source code changes as bugs are fixed or improvements made.
+
+@item NEWS
+A list of changes to @code{gawk} since the last release or patch.
+
+@item COPYING
+The GNU General Public License.
+
+@item FUTURES
+A brief list of features and/or changes being contemplated for future
+releases, with some indication of the time frame for the feature, based
+on its difficulty.
+
+@item LIMITATIONS
+A list of those factors that limit @code{gawk}'s performance.
+Most of these depend on the hardware or operating system software, and
+are not limits in @code{gawk} itself.
+
+@item POSIX.STD
+A description of one area where the POSIX standard for @code{awk} is
+incorrect, and how @code{gawk} handles the problem.
+
+@item PROBLEMS
+A file describing known problems with the current release.
+
+@item doc/gawk.1
+The @code{troff} source for a manual page describing @code{gawk}.
+This is distributed for the convenience of Unix users.
+
+@item doc/gawk.texi
+The Texinfo source file for this @value{DOCUMENT}.
+It should be processed with @TeX{} to produce a printed document, and
+with @code{makeinfo} to produce an Info file.
+
+@item doc/gawk.info
+The generated Info file for this @value{DOCUMENT}.
+
+@item doc/igawk.1
+The @code{troff} source for a manual page describing the @code{igawk}
+program presented in
+@ref{Igawk Program, ,An Easy Way to Use Library Functions}.
+
+@item doc/Makefile.in
+The input file used during the configuration process to generate the
+actual @file{Makefile} for creating the documentation.
+
+@item Makefile.in
+@itemx acconfig.h
+@itemx aclocal.m4
+@itemx configh.in
+@itemx configure.in
+@itemx configure
+@itemx custom.h
+@itemx missing/*
+These files and the @file{missing} subdirectory are used when configuring @code{gawk}
+for various Unix systems.  They are explained in detail in
+@ref{Unix Installation, ,Compiling and Installing @code{gawk} on Unix}.
+
+@item awklib/extract.awk
+@itemx awklib/Makefile.in
+The @file{awklib} directory contains a copy of @file{extract.awk}
+(@pxref{Extract Program, ,Extracting Programs from Texinfo Source Files}),
+which can be used to extract the sample programs from the Texinfo
+source file for this @value{DOCUMENT}, and a @file{Makefile.in} file, which
+@code{configure} uses to generate a @file{Makefile}.
+As part of the process of building @code{gawk}, the library functions from
+@ref{Library Functions, , A Library of @code{awk} Functions},
+and the @code{igawk} program from
+@ref{Igawk Program, , An Easy Way to Use Library Functions},
+are extracted into ready-to-use files.
+They are installed as part of the installation process.
+
+@item amiga/*
+Files needed for building @code{gawk} on an Amiga.
+@xref{Amiga Installation, ,Installing @code{gawk} on an Amiga}, for details.
+
+@item atari/*
+Files needed for building @code{gawk} on an Atari ST.
+@xref{Atari Installation, ,Installing @code{gawk} on the Atari ST}, for details.
+
+@item pc/*
+Files needed for building @code{gawk} under MS-DOS and OS/2.
+@xref{PC Installation, ,MS-DOS and OS/2 Installation and Compilation}, for details.
+
+@item vms/*
+Files needed for building @code{gawk} under VMS.
+@xref{VMS Installation, ,How to Compile and Install @code{gawk} on VMS}, for details.
+
+@item test/*
+A test suite for
+@code{gawk}.  You can use @samp{make check} from the top-level @code{gawk}
+directory to run your version of @code{gawk} against the test suite.
+If @code{gawk} successfully passes @samp{make check}, then you can
+be confident of a successful port.
+@end table
+
+@node Unix Installation, VMS Installation, Gawk Distribution, Installation
+@appendixsec Compiling and Installing @code{gawk} on Unix
+
+Usually, you can compile and install @code{gawk} by typing only two
+commands.  However, if you use an unusual system, you may need
+to configure @code{gawk} for your system yourself.
+
+@menu
+* Quick Installation::          Compiling @code{gawk} under Unix.
+* Configuration Philosophy::    How it's all supposed to work.
+@end menu
+
+@node Quick Installation, Configuration Philosophy, Unix Installation, Unix Installation
+@appendixsubsec Compiling @code{gawk} for Unix
+
+@cindex installation, unix
+After you have extracted the @code{gawk} distribution, @code{cd}
+to @file{gawk-@value{VERSION}.0}.  Like most GNU software,
+@code{gawk} is configured
+automatically for your Unix system by running the @code{configure} program.
+This program is a Bourne shell script that was generated automatically using
+GNU @code{autoconf}.
+@iftex
+(The @code{autoconf} software is
+described fully in
+@cite{Autoconf---Generating Automatic Configuration Scripts},
+which is available from the Free Software Foundation.)
+@end iftex
+@ifinfo
+(The @code{autoconf} software is described fully starting with
+@ref{Top, , Introduction, autoconf, Autoconf---Generating Automatic Configuration Scripts}.)
+@end ifinfo
+
+To configure @code{gawk}, simply run @code{configure}:
+
+@example
+sh ./configure
+@end example
+
+This produces a @file{Makefile} and @file{config.h} tailored to your system.
+The @file{config.h} file describes various facts about your system.
+You may wish to edit the @file{Makefile} to
+change the @code{CFLAGS} variable, which controls
+the command line options that are passed to the C compiler (such as
+optimization levels, or compiling for debugging).
+
+Alternatively, you can add your own values for most @code{make}
+variables, such as @code{CC} and @code{CFLAGS}, on the command line when
+running @code{configure}:
+
+@example
+CC=cc CFLAGS=-g sh ./configure
+@end example
+
+@noindent
+See the file @file{INSTALL} in the @code{gawk} distribution for
+all the details.
+
+After you have run @code{configure}, and possibly edited the @file{Makefile},
+type:
+
+@example
+make
+@end example
+
+@noindent
+and shortly thereafter, you should have an executable version of @code{gawk}.
+That's all there is to it!
+(If these steps do not work, please send in a bug report;
+@pxref{Bugs, ,Reporting Problems and Bugs}.)
+
+@node Configuration Philosophy, , Quick Installation, Unix Installation
+@appendixsubsec The Configuration Process
+
+@cindex configuring @code{gawk}
+(This section is of interest only if you know something about using the
+C language and the Unix operating system.)
+
+The source code for @code{gawk} generally attempts to adhere to formal
+standards wherever possible.  This means that @code{gawk} uses library
+routines that are specified by the ANSI C standard and by the POSIX
+operating system interface standard.  When using an ANSI C compiler,
+function prototypes are used to help improve the compile-time checking.
+
+Many Unix systems do not support all of either the ANSI or the
+POSIX standards.  The @file{missing} subdirectory in the @code{gawk}
+distribution contains replacement versions of those subroutines that are
+most likely to be missing.
+
+The @file{config.h} file that is created by the @code{configure} program
+contains definitions that describe features of the particular operating
+system where you are attempting to compile @code{gawk}.  The three things
+described by this file are what header files are available, so that
+they can be correctly included,
+what (supposedly) standard functions are actually available in your C
+libraries, and
+other miscellaneous facts about your
+variant of Unix.  For example, there may not be an @code{st_blksize}
+element in the @code{stat} structure.  In this case @samp{HAVE_ST_BLKSIZE}
+would be undefined.
+
+@cindex @code{custom.h} configuration file
+It is possible for your C compiler to lie to @code{configure}. It may
+do so by not exiting with an error when a library function is not
+available.  To get around this, you can edit the file @file{custom.h}.
+Use an @samp{#ifdef} that is appropriate for your system, and either
+@code{#define} any constants that @code{configure} should have defined but
+didn't, or @code{#undef} any constants that @code{configure} defined and
+should not have.  @file{custom.h} is automatically included by
+@file{config.h}.
+
+It is also possible that the @code{configure} program generated by
+@code{autoconf}
+will not work on your system in some other fashion.  If you do have a problem,
+the file
+@file{configure.in} is the input for @code{autoconf}.  You may be able to
+change this file, and generate a new version of @code{configure} that will
+work on your system.  @xref{Bugs, ,Reporting Problems and Bugs}, for
+information on how to report problems in configuring @code{gawk}.  The same
+mechanism may be used to send in updates to @file{configure.in} and/or
+@file{custom.h}.
+
+@node VMS Installation, PC Installation, Unix Installation, Installation
+@appendixsec How to Compile and Install @code{gawk} on VMS
+
+@c based on material from Pat Rankin <rankin@eql.caltech.edu>
+
+@cindex installation, vms
+This section describes how to compile and install @code{gawk} under VMS.
+
+@menu
+* VMS Compilation::             How to compile @code{gawk} under VMS.
+* VMS Installation Details::    How to install @code{gawk} under VMS.
+* VMS Running::                 How to run @code{gawk} under VMS.
+* VMS POSIX::                   Alternate instructions for VMS POSIX.
+@end menu
+
+@node VMS Compilation, VMS Installation Details, VMS Installation, VMS Installation
+@appendixsubsec Compiling @code{gawk} on VMS
+
+To compile @code{gawk} under VMS, there is a @code{DCL} command procedure that
+will issue all the necessary @code{CC} and @code{LINK} commands, and there is
+also a @file{Makefile} for use with the @code{MMS} utility.  From the source
+directory, use either
+
+@example
+$ @@[.VMS]VMSBUILD.COM
+@end example
+
+@noindent
+or
+
+@example
+$ MMS/DESCRIPTION=[.VMS]DESCRIP.MMS GAWK
+@end example
+
+Depending upon which C compiler you are using, follow one of the sets
+of instructions in this table:
+
+@table @asis
+@item VAX C V3.x
+Use either @file{vmsbuild.com} or @file{descrip.mms} as is.  These use
+@code{CC/OPTIMIZE=NOLINE}, which is essential for Version 3.0.
+
+@item VAX C V2.x
+You must have Version 2.3 or 2.4; older ones won't work.  Edit either
+@file{vmsbuild.com} or @file{descrip.mms} according to the comments in them.
+For @file{vmsbuild.com}, this just entails removing two @samp{!} delimiters.
+Also edit @file{config.h} (which is a copy of file @file{[.config]vms-conf.h})
+and comment out or delete the two lines @samp{#define __STDC__ 0} and
+@samp{#define VAXC_BUILTINS} near the end.
+
+@item GNU C
+Edit @file{vmsbuild.com} or @file{descrip.mms}; the changes are different
+from those for VAX C V2.x, but equally straightforward.  No changes to
+@file{config.h} should be needed.
+
+@item DEC C
+Edit @file{vmsbuild.com} or @file{descrip.mms} according to their comments.
+No changes to @file{config.h} should be needed.
+@end table
+
+@code{gawk} has been tested under VAX/VMS 5.5-1 using VAX C V3.2,
+GNU C 1.40 and 2.3.  It should work without modifications for VMS V4.6 and up.
+
+@node VMS Installation Details, VMS Running, VMS Compilation, VMS Installation
+@appendixsubsec Installing @code{gawk} on VMS
+
+To install @code{gawk}, all you need is a ``foreign'' command, which is
+a @code{DCL} symbol whose value begins with a dollar sign. For example:
+
+@example
+$ GAWK :== $disk1:[gnubin]GAWK
+@end example
+
+@noindent
+(Substitute the actual location of @code{gawk.exe} for
+@samp{$disk1:[gnubin]}.) The symbol should be placed in the
+@file{login.com} of any user who wishes to run @code{gawk},
+so that it will be defined every time the user logs on.
+Alternatively, the symbol may be placed in the system-wide
+@file{sylogin.com} procedure, which will allow all users
+to run @code{gawk}.
+
+Optionally, the help entry can be loaded into a VMS help library:
+
+@example
+$ LIBRARY/HELP SYS$HELP:HELPLIB [.VMS]GAWK.HLP
+@end example
+
+@noindent
+(You may want to substitute a site-specific help library rather than
+the standard VMS library @samp{HELPLIB}.)  After loading the help text,
+
+@example
+$ HELP GAWK
+@end example
+
+@noindent
+will provide information about both the @code{gawk} implementation and the
+@code{awk} programming language.
+
+The logical name @samp{AWK_LIBRARY} can designate a default location
+for @code{awk} program files.  For the @samp{-f} option, if the specified
+filename has no device or directory path information in it, @code{gawk}
+will look in the current directory first, then in the directory specified
+by the translation of @samp{AWK_LIBRARY} if the file was not found.
+If after searching in both directories, the file still is not found,
+then @code{gawk} appends the suffix @samp{.awk} to the filename and the
+file search will be re-tried.  If @samp{AWK_LIBRARY} is not defined, that
+portion of the file search will fail benignly.
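+
+The search order just described can be sketched as follows.  This is only
+an analogy written for a POSIX shell, with Unix path syntax standing in
+for VMS file specifications; the function @code{find_awk_program} and its
+second argument (a stand-in for the translation of @samp{AWK_LIBRARY})
+are hypothetical and do not reflect how @code{gawk} itself is implemented.
+
```shell
#!/bin/sh
# Sketch of the search order only.  find_awk_program is hypothetical,
# and Unix path syntax stands in for VMS file specifications.
find_awk_program() {
    file=$1
    libdir=$2    # stand-in for the translation of AWK_LIBRARY; may be empty
    # First pass uses the name as given; second pass retries with ".awk".
    for candidate in "$file" "$file.awk"; do
        # Current directory first.
        if [ -f "./$candidate" ]; then
            echo "./$candidate"
            return 0
        fi
        # Then the library directory; skipped quietly if none is defined.
        if [ -n "$libdir" ] && [ -f "$libdir/$candidate" ]; then
            echo "$libdir/$candidate"
            return 0
        fi
    done
    return 1
}
```
+
+For example, @samp{find_awk_program report /usr/local/lib/awk} would print
+the first match among @file{./report}, @file{/usr/local/lib/awk/report},
+@file{./report.awk} and @file{/usr/local/lib/awk/report.awk}, and return
+a non-zero status if none of them exists.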
+
+@node VMS Running, VMS POSIX, VMS Installation Details, VMS Installation
+@appendixsubsec Running @code{gawk} on VMS
+
+Command line parsing and quoting conventions are significantly different
+on VMS, so examples in this @value{DOCUMENT} or from other sources often need minor
+changes.  They @emph{are} minor though, and all @code{awk} programs
+should run correctly.
+
+Here are a couple of trivial tests:
+
+@example
+$ gawk -- "BEGIN @{print ""Hello, World!""@}"
+$ gawk -"W" version
+! could also be -"W version" or "-W version"
+@end example
+
+@noindent
+Note that upper-case and mixed-case text must be quoted.
+
+The VMS port of @code{gawk} includes a @code{DCL}-style interface in addition
+to the original shell-style interface (see the help entry for details).
+One side-effect of dual command line parsing is that if there is only a
+single parameter (as in the quoted string program above), the command
+becomes ambiguous.  To work around this, the normally optional @samp{--}
+flag is required to force Unix style rather than @code{DCL} parsing.  If any
+other dash-type options (or multiple parameters such as data files to be
+processed) are present, there is no ambiguity and @samp{--} can be omitted.
+
+The default search path when looking for @code{awk} program files specified
+by the @samp{-f} option is @code{"SYS$DISK:[],AWK_LIBRARY:"}.  The logical
+name @samp{AWKPATH} can be used to override this default.  The format
+of @samp{AWKPATH} is a comma-separated list of directory specifications.
+When defining it, the value should be quoted so that it retains a single
+translation, and not a multi-translation @code{RMS} searchlist.
+
+@node VMS POSIX,  , VMS Running, VMS Installation
+@appendixsubsec Building and Using @code{gawk} on VMS POSIX
+
+Ignore the instructions above, although @file{vms/gawk.hlp} should still
+be made available in a help library.  Make sure that the @code{configure}
+script is executable; use @samp{chmod +x}
+on it if necessary.  Then execute the following commands:
+
+@example
+@group
+$ POSIX
+psx> CC=vms/posix-cc.sh configure
+psx> CC=c89 make gawk
+@end group
+@end example
+
+@noindent
+The first command will construct files @file{config.h} and @file{Makefile}
+out of templates.  The second command will compile and link @code{gawk}.
+@ignore
+Due to a @code{make} bug in VMS POSIX V1.0 and V1.1,
+the file @file{awktab.c} must be given as an explicit target or it will
+not be built and the final link step will fail.
+@end ignore
+Ignore the warning
+@code{"Could not find lib m in lib list"}; it is harmless, caused by the
+explicit use of @samp{-lm} as a linker option which is not needed
+under VMS POSIX.  Under V1.1 (but not V1.0) a problem with the @code{yacc}
+skeleton @file{/etc/yyparse.c} will cause a compiler warning for
+@file{awktab.c}, followed by a linker warning about compilation warnings
+in the resulting object module.  These warnings can be ignored.
+
+Once built, @code{gawk} will work like any other shell utility.  Unlike
+the normal VMS port of @code{gawk}, no special command line manipulation is
+needed in the VMS POSIX environment.
+
+@c Rewritten by Scott Deifik <scottd@amgen.com>
+@c and Darrel Hankerson <hankedr@mail.auburn.edu>
+@node PC Installation, Atari Installation, VMS Installation, Installation
+@appendixsec MS-DOS and OS/2 Installation and Compilation
+
+@cindex installation, MS-DOS and OS/2 
+If you have received a binary distribution prepared by the DOS
+maintainers, then @code{gawk} and the necessary support files will appear
+under the @file{gnu} directory, with executables in @file{gnu/bin},
+libraries in @file{gnu/lib/awk}, and manual pages under @file{gnu/man}.
+This is designed for easy installation to a @file{/gnu} directory on your
+drive, but the files can be installed anywhere provided @code{AWKPATH} is
+set properly.  Regardless of the installation directory, the first line of
+@file{igawk.cmd} and @file{igawk.bat} (in @file{gnu/bin}) may need to be
+edited.
+
+The binary distribution will contain a separate file describing the
+contents. In particular, it may include more than one version of the
+@code{gawk} executable. OS/2 binary distributions may have a 
+different arrangement, but installation is similar.
+
+The OS/2 and MS-DOS versions of @code{gawk} search for program files as
+described in @ref{AWKPATH Variable, ,The @code{AWKPATH} Environment Variable}.
+However, semicolons (rather than colons) separate elements
+in the @code{AWKPATH} variable. If @code{AWKPATH} is not set or is empty,
+then the default search path is @code{@w{".;c:/lib/awk;c:/gnu/lib/awk"}}.
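+
+To make the separator rule concrete, here is a small sketch for an
+@code{sh}-like shell that lists the elements of a semicolon-separated
+search path, falling back to the default given above when the value is
+unset or empty.  The function name @code{list_awkpath} is hypothetical;
+the sketch only illustrates how the path is divided and says nothing
+about how @code{gawk} performs the search internally.
+
```shell
#!/bin/sh
# Hypothetical illustration: print each element of a semicolon-separated
# search path, using the documented default when the value is empty.
list_awkpath() {
    path=${1:-".;c:/lib/awk;c:/gnu/lib/awk"}
    old_ifs=$IFS
    IFS=';'      # split on semicolons rather than colons
    set -f       # disable globbing while the path is split
    for dir in $path; do
        echo "$dir"
    done
    set +f
    IFS=$old_ifs
}

list_awkpath "${AWKPATH:-}"
```
+
+With @code{AWKPATH} unset, this should print @file{.}, @file{c:/lib/awk}
+and @file{c:/gnu/lib/awk}, one per line.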
+
+An @code{sh}-like shell (as opposed to @code{command.com} under MS-DOS 
+or @code{cmd.exe} under OS/2) may be useful for @code{awk} programming.
+Ian Stewartson has written an excellent shell for MS-DOS and OS/2, and a
+@code{ksh} clone and GNU Bash are available for OS/2. The file
+@file{README_d/README.pc} in the @code{gawk} distribution contains
+information on these shells. Users of Stewartson's shell on DOS should
+examine its documentation on handling of command-lines. In particular,
+the setting for @code{gawk} in the shell configuration may need to be
+changed, and the @code{ignoretype} option may also be of interest.
+
+@code{gawk} can be compiled for MS-DOS and OS/2 using the GNU development tools
+from DJ Delorie (DJGPP, MS-DOS-only) or Eberhard Mattes (EMX, MS-DOS and OS/2).
+Microsoft C can be used to build 16-bit versions for MS-DOS and OS/2.  The file
+@file{README_d/README.pc} in the @code{gawk} distribution contains additional
+notes, and @file{pc/Makefile} contains important notes on compilation options.
+
+To build @code{gawk}, copy the files in the @file{pc} directory to the
+directory with the rest of the @code{gawk} sources. The @file{Makefile}
+contains a configuration section with comments, and may need to be
+edited in order to work with your @code{make} utility.
+
+The @file{Makefile} contains a number of targets for building various MS-DOS
+and OS/2 versions. A list of targets will be printed if the @code{make}
+command is given without a target. As an example, to build @code{gawk}
+using the DJGPP tools, enter @samp{make djgpp}.
+
+Using @code{make} to run the standard tests and to install @code{gawk}
+requires additional Unix-like tools, including @code{sh}, @code{sed}, and
+@code{cp}. In order to run the tests, the @file{test/*.ok} files may need to
+be converted so that they have the usual DOS-style end-of-line markers. Most
+of the tests will work properly with Stewartson's shell along with the
+companion utilities or appropriate GNU utilities.  However, some editing of
+@file{test/Makefile} is required. It is recommended that the file
+@file{pc/Makefile.tst} be copied to @file{test/Makefile} as a
+replacement. Details can be found in @file{README_d/README.pc}.
+
+@node Atari Installation, Amiga Installation, PC Installation, Installation
+@appendixsec Installing @code{gawk} on the Atari ST
+
+@c based on material from Michal Jaegermann <michal@gortel.phys.ualberta.ca>
+
+@cindex atari
+@cindex installation, atari
+There are no substantial differences when installing @code{gawk} on
+various Atari models.  Compiled @code{gawk} executables do not require
+a large amount of memory with most @code{awk} programs and should run on all
+Motorola processor-based models (referred to below as the ST, even if that
+is not exactly right).
+
+In order to use @code{gawk}, you need to have a shell, either text or
+graphics, that does not map all the characters of a command line to
+upper-case.  Maintaining case distinction in option flags is very
+important (@pxref{Options, ,Command Line Options}).
+These days this is the default, and it may only be a problem for some
+very old machines.  If your system does not preserve the case of option
+flags, you will need to upgrade your tools.  Support for I/O
+redirection is necessary to make it easy to import @code{awk} programs
+from other environments.  Pipes are nice to have, but not vital.
+
+@menu
+* Atari Compiling::           Compiling @code{gawk} on Atari
+* Atari Using::               Running @code{gawk} on Atari
+@end menu
+
+@node Atari Compiling, Atari Using, Atari Installation, Atari Installation
+@appendixsubsec Compiling @code{gawk} on the Atari ST
+
+A proper compilation of @code{gawk} sources when @code{sizeof(int)}
+differs from @code{sizeof(void *)} requires an ANSI C compiler. An initial
+port was done with @code{gcc}.  You may actually prefer executables
+where @code{int}s are four bytes wide, but the other variant works as well.
+
+You may need quite a bit of memory when trying to recompile the @code{gawk}
+sources, as some source files (@file{regex.c} in particular) are quite
+big.  If you run out of memory compiling such a file, try reducing the
+optimization level for this particular file; this may help.
+
+@cindex Linux
+With a reasonable shell (Bash will do), and in particular if you run
+Linux, MiNT or a similar operating system, you have a pretty good
+chance that the @code{configure} utility will succeed.  Otherwise
+sample versions of @file{config.h} and @file{Makefile.st} are given in the
+@file{atari} subdirectory and can be edited and copied to the
+corresponding files in the main source directory.  Even if
+@code{configure} produced something, it might be advisable to compare
+its results with the sample versions and possibly make adjustments.
+
+Some @code{gawk} source code fragments depend on a preprocessor define
+@samp{atarist}.  This basically assumes the TOS environment with @code{gcc}.
+Modify these sections as appropriate if they are not right for your
+environment.  Also see the remarks about @code{AWKPATH} and @code{envsep} in
+@ref{Atari Using, ,Running @code{gawk} on the Atari ST}.
+
+As shipped, the sample @file{config.h} claims that the @code{system}
+function is missing from the libraries, which is not true, and an
+alternative implementation of this function is provided in
+@file{atari/system.c}.  Depending upon your particular combination of
+shell and operating system, you may wish to change the file to indicate
+that @code{system} is available.
+
+@node Atari Using, , Atari Compiling, Atari Installation
+@appendixsubsec Running @code{gawk} on the Atari ST
+
+An executable version of @code{gawk} should be placed, as usual,
+anywhere in your @code{PATH} where your shell can find it.
+
+While executing, @code{gawk} creates a number of temporary files.  When
+using @code{gcc} libraries for TOS, @code{gawk} looks for either of
+the environment variables @code{TEMP} or @code{TMPDIR}, in that order.
+If either one is found, its value is assumed to be a directory for
+temporary files.  This directory must exist, and if you can spare the
+memory, it is a good idea to put it on a RAM drive.  If neither
+@code{TEMP} nor @code{TMPDIR} is found, then @code{gawk} uses the
+current directory for its temporary files.
+
+The ST version of @code{gawk} searches for its program files as described in
+@ref{AWKPATH Variable, ,The @code{AWKPATH} Environment Variable}.
+The default value for the @code{AWKPATH} variable is taken from
+@code{DEFPATH} defined in @file{Makefile}. The sample @code{gcc}/TOS
+@file{Makefile} for the ST in the distribution sets @code{DEFPATH} to
+@code{@w{".,c:\lib\awk,c:\gnu\lib\awk"}}.  The search path can be
+modified by explicitly setting @code{AWKPATH} to whatever you wish.
+Note that colons cannot be used on the ST to separate elements in the
+@code{AWKPATH} variable, since they have another, reserved, meaning.
+Instead, you must use a comma to separate elements in the path.  When
+recompiling, the separating character can be modified by initializing
+the @code{envsep} variable in @file{atari/gawkmisc.atr} to another
+value.
+
+Although @code{awk} allows great flexibility in doing I/O redirections
+from within a program, this facility should be used with care on the ST
+running under TOS.  In some circumstances the OS routines for file
+handle pool processing lose track of certain events, causing the
+computer to crash, and requiring a reboot.  Often a warm reboot is
+sufficient.  Fortunately, this happens infrequently, and in rather
+esoteric situations.  In particular, avoid having one part of an
+@code{awk} program using @code{print} statements explicitly redirected
+to @code{"/dev/stdout"}, while other @code{print} statements use the
+default standard output, and a calling shell has redirected standard
+output to a file.
+
+When @code{gawk} is compiled with the ST version of @code{gcc} and its
+usual libraries, it will accept both @samp{/} and @samp{\} as path separators.
+While this is convenient, it should be remembered that this removes one,
+technically valid, character (@samp{/}) from your file names, and that
+it may create problems for external programs, called via the @code{system}
+function, which may not support this convention.  Whenever it is possible
+that a file created by @code{gawk} will be used by some other program,
+use only backslashes.  Also remember that in @code{awk}, backslashes in
+strings have to be doubled in order to get literal backslashes
+(@pxref{Escape Sequences}).
+
+@node Amiga Installation, Bugs, Atari Installation, Installation
+@appendixsec Installing @code{gawk} on an Amiga
+
+@cindex amiga
+@cindex installation, amiga
+You can install @code{gawk} on an Amiga system using a Unix emulation
+environment available via anonymous @code{ftp} from
+@code{wuarchive.wustl.edu} in the directory @file{pub/aminet/dev/gcc}.
+This includes a shell based on @code{pdksh}.  The primary component of
+this environment is a Unix emulation library, @file{ixemul.lib}.
+@c could really use more background here, who wrote this, etc.
+
+A more complete distribution for the Amiga is available on 
+the FreshFish CD-ROM from:
+
+@quotation
+Amiga Library Services @*
+610 North Alma School Road, Suite 18 @*
+Chandler, AZ  85224  USA @*
+Phone: +1-602-491-0048 @*
+FAX: +1-602-491-0048 @*
+E-mail:	@code{orders@@amigalib.com}
+@end quotation
+
+Once you have the distribution, you can configure @code{gawk} simply by
+running @code{configure}:
+
+@example
+configure -v m68k-cbm-amigados
+@end example
+
+Then run @code{make}, and you should be all set!
+(If these steps do not work, please send in a bug report;
+@pxref{Bugs, ,Reporting Problems and Bugs}.)
+
+@node Bugs, Other Versions, Amiga Installation, Installation
+@appendixsec Reporting Problems and Bugs
+
+If you have problems with @code{gawk} or think that you have found a bug,
+please report it to the developers; we cannot promise to do anything
+but we might well want to fix it.
+
+Before reporting a bug, make sure you have actually found a real bug.
+Carefully reread the documentation and see if it really says you can do
+what you're trying to do.  If it's not clear whether you should be able
+to do something or not, report that too; it's a bug in the documentation!
+
+Before reporting a bug or trying to fix it yourself, try to isolate it
+to the smallest possible @code{awk} program and input data file that
+reproduces the problem.  Then send us the program and data file,
+some idea of what kind of Unix system you're using, and the exact results
+@code{gawk} gave you.  Also say what you expected to occur; this will help
+us decide whether the problem was really in the documentation.
+
+Once you have a precise problem, there are two e-mail addresses you
+can send mail to.
+
+@table @asis
+@item Internet:
+@samp{bug-gnu-utils@@prep.ai.mit.edu}
+
+@item UUCP:
+@samp{uunet!prep.ai.mit.edu!bug-gnu-utils}
+@end table
+
+Please include the
+version number of @code{gawk} you are using.  You can get this information
+with the command @samp{gawk --version}.
+You should send a carbon copy of your mail to Arnold Robbins, who can
+be reached at @samp{arnold@@gnu.ai.mit.edu}.
+
+@cindex @code{comp.lang.awk}
+@strong{Important!} Do @emph{not} try to report bugs in @code{gawk} by
+posting to the Usenet/Internet newsgroup @code{comp.lang.awk}.
+While the @code{gawk} developers do occasionally read this newsgroup,
+there is no guarantee that we will see your posting.  The steps described
+above are the official, recognized ways for reporting bugs.
+
+Non-bug suggestions are always welcome as well.  If you have questions
+about things that are unclear in the documentation or are just obscure
+features, ask Arnold Robbins; he will try to help you out, although he
+may not have the time to fix the problem.  You can send him electronic
+mail at the Internet address above.
+
+If you find bugs in one of the non-Unix ports of @code{gawk}, please send
+an electronic mail message to the person who maintains that port.  They
+are listed below, and also in the @file{README} file in the @code{gawk}
+distribution.  Information in the @file{README} file should be considered
+authoritative if it conflicts with this @value{DOCUMENT}.
+
+The people maintaining the non-Unix ports of @code{gawk} are:
+
+@cindex Deifik, Scott
+@cindex Fish, Fred
+@cindex Hankerson, Darrel
+@cindex Jaegermann, Michal
+@cindex Rankin, Pat
+@cindex Rommel, Kai Uwe
+@table @asis
+@item MS-DOS
+Scott Deifik, @samp{scottd@@amgen.com}, and
+Darrel Hankerson, @samp{hankedr@@mail.auburn.edu}.
+
+@item OS/2
+Kai Uwe Rommel, @samp{rommel@@ars.de}.
+
+@item VMS
+Pat Rankin, @samp{rankin@@eql.caltech.edu}.
+
+@item Atari ST
+Michal Jaegermann, @samp{michal@@gortel.phys.ualberta.ca}.
+
+@item Amiga
+Fred Fish, @samp{fnf@@amigalib.com}.
+@end table
+
+If your bug is also reproducible under Unix, please send copies of your
+report to the general GNU bug list, as well as to Arnold Robbins, at the
+addresses listed above.
+
+@node Other Versions, , Bugs, Installation
+@appendixsec Other Freely Available @code{awk} Implementations
+
+There are two other freely available @code{awk} implementations.
+This section briefly describes where to get them.
+
+@table @asis
+@cindex Kernighan, Brian
+@cindex anonymous @code{ftp}
+@cindex @code{ftp}, anonymous
+@item Unix @code{awk}
+Brian Kernighan has been able to make his implementation of
+@code{awk} freely available.  You can get it via anonymous @code{ftp}
+to the host @code{@w{netlib.att.com}}.  Change directory to
+@file{/netlib/research}. Use ``binary'' or ``image'' mode, and
+retrieve @file{awk.bundle.Z}.
+
+This is a shell archive that has been compressed with the @code{compress}
+utility. It can be uncompressed with either @code{uncompress} or the
+GNU @code{gunzip} utility.
+
+This version requires an ANSI C compiler; GCC (the GNU C compiler)
+works quite nicely.
+
+@cindex Brennan, Michael
+@cindex @code{mawk}
+@item @code{mawk}
+Michael Brennan has written an independent implementation of @code{awk},
+called @code{mawk}.  It is available under the GPL
+(@pxref{Copying, ,GNU GENERAL PUBLIC LICENSE}),
+just as @code{gawk} is.
+
+You can get it via anonymous @code{ftp} to the host
+@code{@w{oxy.edu}}.  Change directory to @file{/public}. Use ``binary''
+or ``image'' mode, and retrieve @file{mawk1.2.1.tar.gz} (or the latest
+version that is there).
+
+@code{gunzip} may be used to decompress this file. Installation
+is similar to @code{gawk}'s
+(@pxref{Unix Installation, , Compiling and Installing @code{gawk} on Unix}).
+@end table
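+
+A minimal sketch of unpacking a @code{gzip}-compressed @code{tar} file
+such as the @code{mawk} distribution follows.  Because the real archive
+has to be fetched first, the fragment creates a small stand-in archive;
+the names @file{demo.tar.gz} and @file{mawk-demo} are hypothetical.
+

```shell
# Build a stand-in archive (hypothetical names) so the sketch is self-contained.
work=$(mktemp -d)
mkdir "$work/mawk-demo"
printf 'placeholder\n' > "$work/mawk-demo/README"
(cd "$work" && tar cf - mawk-demo | gzip > demo.tar.gz && rm -r mawk-demo)

# Decompress to standard output and let tar unpack the stream;
# the compressed file itself is left untouched.
(cd "$work" && gunzip -c demo.tar.gz | tar xf -)

unpacked=$(cat "$work/mawk-demo/README")
echo "$unpacked"
```

+
+Piping the output of @samp{gunzip -c} into @code{tar} avoids writing an
+uncompressed copy of the archive to disk.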
+
+@node Notes, Glossary, Installation, Top
+@appendix Implementation Notes
+
+This appendix contains information mainly of interest to implementors and
+maintainers of @code{gawk}.  Everything in it applies specifically to
+@code{gawk}, and not to other implementations.
+
+@menu
+* Compatibility Mode::          How to disable certain @code{gawk} extensions.
+* Additions::                   Making Additions To @code{gawk}.
+* Future Extensions::           New features that may be implemented one day.
+* Improvements::                Suggestions for improvements by volunteers.
+@end menu
+
+@node Compatibility Mode, Additions, Notes, Notes
+@appendixsec Downward Compatibility and Debugging
+
+@xref{POSIX/GNU, ,Extensions in @code{gawk} Not in POSIX @code{awk}},
+for a summary of the GNU extensions to the @code{awk} language and program.
+All of these features can be turned off by invoking @code{gawk} with the
+@samp{--traditional} option, or with the @samp{--posix} option.
+
+If @code{gawk} is compiled for debugging with @samp{-DDEBUG}, then there
+is one more option available on the command line:
+
+@table @code
+@item -W parsedebug
+@itemx --parsedebug
+Print out the parse stack information as the program is being parsed.
+@end table
+
+This option is intended only for serious @code{gawk} developers,
+and not for the casual user.  It probably has not even been compiled into
+your version of @code{gawk}, since it slows down execution.
+
+@node Additions, Future Extensions, Compatibility Mode, Notes
+@appendixsec Making Additions to @code{gawk}
+
+If you should find that you wish to enhance @code{gawk} in a significant
+fashion, you are perfectly free to do so.  That is the point of having
+free software; the source code is available, and you are free to change
+it as you wish (@pxref{Copying, ,GNU GENERAL PUBLIC LICENSE}).
+
+This section discusses the ways you might wish to change @code{gawk},
+and any considerations you should bear in mind.
+
+@menu
+* Adding Code::             Adding code to the main body of @code{gawk}.
+* New Ports::               Porting @code{gawk} to a new operating system.
+@end menu
+
+@node Adding Code, New Ports, Additions, Additions
+@appendixsubsec Adding New Features
+
+@cindex adding new features
+@cindex features, adding
+You are free to add any new features you like to @code{gawk}.
+However, if you want your changes to be incorporated into the @code{gawk}
+distribution, there are several steps that you need to take in order to
+make it possible for me to include your changes.
+
+@enumerate 1
+@item
+Get the latest version.
+It is much easier for me to integrate changes if they are relative to
+the most recent distributed version of @code{gawk}.  If your version of
+@code{gawk} is very old, I may not be able to integrate them at all.
+@xref{Getting, ,Getting the @code{gawk} Distribution},
+for information on getting the latest version of @code{gawk}.
+
+@item
+@iftex
+Follow the @cite{GNU Coding Standards}.
+@end iftex
+@ifinfo
+See @inforef{Top, , Version, standards, GNU Coding Standards}.
+@end ifinfo
+This document describes how GNU software should be written. If you haven't
+read it, please do so, preferably @emph{before} starting to modify @code{gawk}.
+(The @cite{GNU Coding Standards} are available as part of the Autoconf
+distribution, from the FSF.)
+
+@cindex @code{gawk} coding style
+@cindex coding style used in @code{gawk}
+@item
+Use the @code{gawk} coding style.
+The C code for @code{gawk} follows the instructions in the
+@cite{GNU Coding Standards}, with minor exceptions.  The code is formatted
+using the traditional ``K&R'' style, particularly as regards the placement
+of braces and the use of tabs.  In brief, the coding rules for @code{gawk}
+are:
+
+@itemize @bullet
+@item
+Use old style (non-prototype) function headers when defining functions.
+
+@item
+Put the name of the function at the beginning of its own line.
+
+@item
+Put the return type of the function, even if it is @code{int}, on the
+line above the line with the name and arguments of the function.
+
+@item
+The declarations for the function arguments should not be indented.
+
+@item
+Put spaces around parentheses used in control structures
+(@code{if}, @code{while}, @code{for}, @code{do}, @code{switch}
+and @code{return}).
+
+@item
+Do not put spaces in front of parentheses used in function calls.
+
+@item
+Put spaces around all C operators, and after commas in function calls.
+
+@item
+Do not use the comma operator to produce multiple side-effects, except
+in @code{for} loop initialization and increment parts, and in macro bodies.
+
+@item
+Use real tabs for indenting, not spaces.
+
+@item
+Use the ``K&R'' brace layout style.
+
+@item
+Use comparisons against @code{NULL} and @code{'\0'} in the conditions of
+@code{if}, @code{while} and @code{for} statements, and in the @code{case}s
+of @code{switch} statements, instead of just the
+plain pointer or character value.
+
+@item
+Use the @code{TRUE}, @code{FALSE}, and @code{NULL} symbolic constants,
+and the character constant @code{'\0'} where appropriate, instead of @code{1}
+and @code{0}.
+
+@item
+Provide one-line descriptive comments for each function.
+
+@item
+Do not use @samp{#elif}. Many older Unix C compilers cannot handle it.
+@end itemize
+
+If I have to reformat your code to follow the coding style used in
+@code{gawk}, I may not bother.
+
+@item
+Be prepared to sign the appropriate paperwork.
+In order for the FSF to distribute your changes, you must either place
+those changes in the public domain, and submit a signed statement to that
+effect, or assign the copyright in your changes to the FSF.
+Both of these actions are easy to do, and @emph{many} people have done so
+already. If you have questions, please contact me
+(@pxref{Bugs, , Reporting Problems and Bugs}),
+or @code{gnu@@prep.ai.mit.edu}.
+
+@item
+Update the documentation.
+Along with your new code, please supply new sections and/or chapters
+for this @value{DOCUMENT}.  If at all possible, please use real
+Texinfo, instead of just supplying unformatted ASCII text (although
+even that is better than no documentation at all).
+Conventions to be followed in @cite{@value{TITLE}} are provided
+after the @samp{@@bye} at the end of the Texinfo source file. 
+If possible, please update the man page as well.
+
+You will also have to sign paperwork for your documentation changes.
+
+@item
+Submit changes as context diffs or unified diffs.
+Use @samp{diff -c -r -N} or @samp{diff -u -r -N} to compare
+the original @code{gawk} source tree with your version.
+(I find context diffs to be more readable, but unified diffs are
+more compact.)
+I recommend using the GNU version of @code{diff}.
+Send the output produced by either run of @code{diff} to me when you
+submit your changes.
+@xref{Bugs, , Reporting Problems and Bugs}, for the electronic mail
+information.
+
+Using this format makes it easy for me to apply your changes to the
+master version of the @code{gawk} source code (using @code{patch}).
+If I have to apply the changes manually, using a text editor, I may
+not do so, particularly if there are lots of changes.
+@end enumerate
+
+Although this sounds like a lot of work, please remember that while you
+may write the new code, I have to maintain it and support it, and if it
+isn't possible for me to do that with a minimum of extra work, then I
+probably will not.
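+
+As a sketch of the last step in the list above, the fragment below
+builds two tiny stand-in source trees and compares them with
+@samp{diff -u -r -N}.  The directory names @file{gawk-orig} and
+@file{gawk-new} and the file contents are invented for illustration;
+use @samp{-c} in place of @samp{-u} to get a context diff instead.
+

```shell
# Two stand-in source trees (hypothetical names and contents).
work=$(mktemp -d)
mkdir "$work/gawk-orig" "$work/gawk-new"
printf 'old line\n' > "$work/gawk-orig/example.c"
printf 'new line\n' > "$work/gawk-new/example.c"
printf 'brand new file\n' > "$work/gawk-new/added.c"   # exists only in the new tree

# -r recurses into directories; -N treats a file missing from one tree
# as empty, so that newly added files show up in the diff.
# diff exits with status 1 when it finds differences, which is expected here.
diff_status=0
(cd "$work" && diff -u -r -N gawk-orig gawk-new > changes.diff) || diff_status=$?

cat "$work/changes.diff"
```

+
+The recipient can then apply @file{changes.diff} to a pristine copy of
+the original tree with @code{patch}, typically by running
+@samp{patch -p1} from the top of that tree.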
+
+@node New Ports, , Adding Code, Additions
+@appendixsubsec Porting @code{gawk} to a New Operating System
+
+@cindex porting @code{gawk}
+If you wish to port @code{gawk} to a new operating system, there are
+several steps to follow.
+
+@enumerate 1
+@item
+Follow the guidelines in
+@ref{Adding Code, ,Adding New Features},
+concerning coding style, submission of diffs, and so on.
+
+@item
+When doing a port, bear in mind that your code must co-exist peacefully
+with the rest of @code{gawk}, and the other ports. Avoid gratuitous
+changes to the system-independent parts of the code. If at all possible,
+avoid sprinkling @samp{#ifdef}s just for your port throughout the
+code.
+
+If the changes needed for a particular system affect too much of the
+code, I probably will not accept them.  In such a case, you will, of course,
+be able to distribute your changes on your own, as long as you comply
+with the GPL
+(@pxref{Copying, ,GNU GENERAL PUBLIC LICENSE}).
+
+@item
+A number of the files that come with @code{gawk} are maintained by other
+people at the Free Software Foundation.  Thus, you should not change them
+unless it is for a very good reason. I.e.@: changes are not out of the
+question, but changes to these files will be scrutinized extra carefully.
+The files are @file{alloca.c}, @file{getopt.h}, @file{getopt.c},
+@file{getopt1.c}, @file{regex.h}, @file{regex.c}, @file{dfa.h},
+@file{dfa.c}, @file{install-sh}, and @file{mkinstalldirs}.
+
+@item
+Be willing to continue to maintain the port.
+Non-Unix operating systems are supported by volunteers who maintain
+the code needed to compile and run @code{gawk} on their systems. If no-one
+volunteers to maintain a port, that port becomes unsupported, and it may
+be necessary to remove it from the distribution.
+
+@item
+Supply an appropriate @file{gawkmisc.???} file.
+Each port has its own @file{gawkmisc.???} that implements certain
+operating system specific functions. This is cleaner than a plethora of
+@samp{#ifdef}s scattered throughout the code.  The @file{gawkmisc.c} in
+the main source directory includes the appropriate
+@file{gawkmisc.???} file from each subdirectory.
+Be sure to update it as well.
+
+Each port's @file{gawkmisc.???} file has a suffix reminiscent of the machine
+or operating system for the port. For example, @file{pc/gawkmisc.pc} and
+@file{vms/gawkmisc.vms}. The use of separate suffixes, instead of plain
+@file{gawkmisc.c}, makes it possible to move files from a port's subdirectory
+into the main subdirectory, without accidentally destroying the real
+@file{gawkmisc.c} file.  (Currently, this is only an issue for the MS-DOS
+and OS/2 ports.)
+
+@item
+Supply a @file{Makefile} and any other C source and header files that are
+necessary for your operating system.  All your code should be in a
+separate subdirectory, with a name that is the same as, or reminiscent
+of, either your operating system or the computer system.  If possible,
+try to structure things so that it is not necessary to move files out
+of the subdirectory into the main source directory.  If that is not
+possible, then be sure to avoid using names for your files that
+duplicate the names of files in the main source directory.
+
+@item
+Update the documentation.
+Please write a section (or sections) for this @value{DOCUMENT} describing the
+installation and compilation steps needed to install and/or compile
+@code{gawk} for your system.
+
+@item
+Be prepared to sign the appropriate paperwork.
+In order for the FSF to distribute your code, you must either place
+your code in the public domain, and submit a signed statement to that
+effect, or assign the copyright in your code to the FSF.
+@ifinfo
+Both of these actions are easy to do, and @emph{many} people have done so
+already. If you have questions, please contact me, or
+@code{gnu@@prep.ai.mit.edu}.
+@end ifinfo
+@end enumerate
+
+Following these steps will make it much easier to integrate your changes
+into @code{gawk}, and have them co-exist happily with the code for other
+operating systems that is already there.
+
+In the code that you supply, and that you maintain, feel free to use a
+coding style and brace layout that suits your taste.
+
+@c why should this be needed? sigh
+@iftex
+@page
+@end iftex
+@node Future Extensions, Improvements, Additions, Notes
+@appendixsec Probable Future Extensions
+
+@ignore
+From emory!scalpel.netlabs.com!lwall Tue Oct 31 12:43:17 1995
+Return-Path: <emory!scalpel.netlabs.com!lwall>
+Message-Id: <9510311732.AA28472@scalpel.netlabs.com>
+To: arnold@skeeve.atl.ga.us (Arnold D. Robbins)
+Subject: Re: May I quote you? 
+In-Reply-To: Your message of "Tue, 31 Oct 95 09:11:00 EST."
+             <m0tAHPQ-00014MC@skeeve.atl.ga.us> 
+Date: Tue, 31 Oct 95 09:32:46 -0800
+From: Larry Wall <emory!scalpel.netlabs.com!lwall>
+
+: Greetings. I am working on the release of gawk 3.0. Part of it will be a
+: thoroughly updated manual. One of the sections deals with planned future
+: extensions and enhancements.  I have the following at the beginning
+: of it:
+: 
+: @cindex PERL
+: @cindex Wall, Larry
+: @display
+: @i{AWK is a language similar to PERL, only considerably more elegant.} @*
+: Arnold Robbins
+: @sp 1
+: @i{Hey!} @*
+: Larry Wall
+: @end display
+: 
+: Before I actually release this for publication, I wanted to get your
+: permission to quote you.  (Hopefully, in the spirit of much of GNU, the
+: implied humor is visible... :-)
+
+I think that would be fine.
+
+Larry
+@end ignore
+
+@cindex PERL
+@cindex Wall, Larry
+@display
+@i{AWK is a language similar to PERL, only considerably more elegant.}
+Arnold Robbins
+
+@i{Hey!}
+Larry Wall
+@end display
+
+This section briefly lists extensions and possible improvements
+that indicate the directions we are currently considering for
+@code{gawk}.  The file @file{FUTURES} in the @code{gawk} distribution
+lists these extensions as well.
+
+This is a list of probable future changes that will be usable by the
+@code{awk} language programmer.
+
+@c these are ordered by likelihood
+@table @asis
+@item Localization
+The GNU project is starting to support multiple languages.
+It will at least be possible to make @code{gawk} print its warnings and
+error messages in languages other than English.
+It may be possible for @code{awk} programs to also use the multiple
+language facilities, separate from @code{gawk} itself.
+
+@item Databases
+It may be possible to map a GDBM/NDBM/SDBM file into an @code{awk} array.
+
+@item A @code{PROCINFO} Array
+The special files that provide process-related information
+(@pxref{Special Files, ,Special File Names in @code{gawk}})
+may be superseded by a @code{PROCINFO} array that would provide the same
+information, in an easier to access fashion.
+
+@item More @code{lint} warnings
+There are more things that could be checked for portability.
+
+@item Control of subprocess environment
+Changes made in @code{gawk} to the array @code{ENVIRON} may be
+propagated to subprocesses run by @code{gawk}.
+
+@ignore
+@item @code{RECLEN} variable for fixed length records
+Along with @code{FIELDWIDTHS}, this would speed up the processing of
+fixed-length records.
+
+@item A @code{restart} keyword
+After modifying @code{$0}, @code{restart} would restart the pattern
+matching loop, without reading a new record from the input.
+
+@item A @samp{|&} redirection
+The @samp{|&} redirection, in place of @samp{|}, would open a two-way
+pipeline for communication with a sub-process (via @code{getline} and
+@code{print} and @code{printf}).
+
+@item Function valued variables
+It would be possible to assign the name of a user-defined or built-in
+function to a regular @code{awk} variable, and then call the function
+indirectly, by using the regular variable.  This would make it possible
+to write general purpose sorting and comparing routines, for example,
+by simply passing the name of one function into another.
+
+@item A built-in @code{stat} function
+The @code{stat} function would provide an easy-to-use hook to the
+@code{stat} system call so that @code{awk} programs could determine information
+about files.
+
+@item A built-in @code{ftw} function
+Combined with function valued variables and the @code{stat} function,
+@code{ftw} (file tree walk) would make it easy for an @code{awk} program
+to walk an entire file tree.
+@end ignore
+@end table
+
+This is a list of probable improvements that will make @code{gawk}
+perform better.
+
+@table @asis
+@item An Improved Version of @code{dfa}
+The @code{dfa} pattern matcher from GNU @code{grep} has some
+problems. Either a new version or a fixed one will deal with some
+important regexp matching issues.
+
+@item Use of @code{mmap}
+On systems that support the @code{mmap} system call, its use would provide
+much faster file input, and considerably simplified input buffer management.
+
+@item Use of GNU @code{malloc}
+The GNU version of @code{malloc} could potentially speed up @code{gawk},
+since it relies heavily on the use of dynamic memory allocation.
+
+@item Use of the @code{rx} regexp library
+The @code{rx} regular expression library could potentially speed up
+all regexp operations that require knowing the exact location of matches.
+This includes record termination, field and array splitting,
+and the @code{sub}, @code{gsub}, @code{gensub} and @code{match} functions.
+@end table
+
+@node Improvements,  , Future Extensions, Notes
+@appendixsec Suggestions for Improvements
+
+Here are some projects that would-be @code{gawk} hackers might like to take
+on.  They vary in size from a few days to a few weeks of programming,
+depending on which one you choose and how fast a programmer you are.  Please
+send any improvements you write to the maintainers at the GNU project.
+@xref{Adding Code, , Adding New Features},
+for guidelines to follow when adding new features to @code{gawk}.
+@xref{Bugs, ,Reporting Problems and Bugs}, for information on
+contacting the maintainers.
+
+@enumerate
+@item
+Compilation of @code{awk} programs: @code{gawk} uses a Bison (YACC-like)
+parser to convert the script given it into a syntax tree; the syntax
+tree is then executed by a simple recursive evaluator.  This method incurs
+a lot of overhead, since the recursive evaluator performs many procedure
+calls to do even the simplest things.
+
+It should be possible for @code{gawk} to convert the script's parse tree
+into a C program which the user would then compile, using the normal
+C compiler and a special @code{gawk} library to provide all the needed
+functions (regexps, fields, associative arrays, type coercion, and so
+on).
+
+An easier possibility might be for an intermediate phase of @code{gawk} to
+convert the parse tree into a linear byte code form like the one used
+in GNU Emacs Lisp.  The recursive evaluator would then be replaced by
+a straight line byte code interpreter that would be intermediate in speed
+between running a compiled program and doing what @code{gawk} does
+now.
+
+@item
+The programs in the test suite could use documenting in this @value{DOCUMENT}.
+
+@item
+See the @file{FUTURES} file for more ideas.  Contact us if you would
+seriously like to tackle any of the items listed there.
+@end enumerate
+
+@node Glossary, Copying, Notes, Top
+@appendix Glossary
+
+@table @asis
+@item Action
+A series of @code{awk} statements attached to a rule.  If the rule's
+pattern matches an input record, @code{awk} executes the
+rule's action.  Actions are always enclosed in curly braces.
+@xref{Action Overview, ,Overview of Actions}.
+
+@item Amazing @code{awk} Assembler
+Henry Spencer at the University of Toronto wrote a retargetable assembler
+completely as @code{awk} scripts.  It is thousands of lines long, including
+machine descriptions for several eight-bit microcomputers.
+It is a good example of a
+program that would have been better written in another language.
+
+@item Amazingly Workable Formatter (@code{awf})
+Henry Spencer at the University of Toronto wrote a formatter that accepts
+a large subset of the @samp{nroff -ms} and @samp{nroff -man} formatting
+commands, using @code{awk} and @code{sh}.
+
+@item ANSI
+The American National Standards Institute.  This organization produces
+many standards, among them the standards for the C and C++ programming
+languages.
+
+@item Assignment
+An @code{awk} expression that changes the value of some @code{awk}
+variable or data object.  An object that you can assign to is called an
+@dfn{lvalue}.  The assigned values are called @dfn{rvalues}.
+@xref{Assignment Ops, ,Assignment Expressions}.
+
+@item @code{awk} Language
+The language in which @code{awk} programs are written.
+
+@item @code{awk} Program
+An @code{awk} program consists of a series of @dfn{patterns} and
+@dfn{actions}, collectively known as @dfn{rules}.  For each input record
+given to the program, the program's rules are all processed in turn.
+@code{awk} programs may also contain function definitions.
+
+@item @code{awk} Script
+Another name for an @code{awk} program.
+
+@item Bash
+The GNU version of the standard shell (the Bourne-Again shell).
+See ``Bourne Shell.''
+
+@item BBS
+See ``Bulletin Board System.''
+
+@item Boolean Expression
+Named after the English mathematician Boole. See ``Logical Expression.''
+
+@item Bourne Shell
+The standard shell (@file{/bin/sh}) on Unix and Unix-like systems,
+originally written by Steven R.@: Bourne.
+Many shells (Bash, @code{ksh}, @code{pdksh}, @code{zsh}) are
+generally upwardly compatible with the Bourne shell.
+
+@item Built-in Function
+The @code{awk} language provides built-in functions that perform various
+numerical, time stamp related, and string computations.  Examples are
+@code{sqrt} (for the square root of a number) and @code{substr} (for a
+substring of a string).  @xref{Built-in, ,Built-in Functions}.
+
+@item Built-in Variable
+@code{ARGC}, @code{ARGIND}, @code{ARGV}, @code{CONVFMT}, @code{ENVIRON},
+@code{ERRNO}, @code{FIELDWIDTHS}, @code{FILENAME}, @code{FNR}, @code{FS},
+@code{IGNORECASE}, @code{NF}, @code{NR}, @code{OFMT}, @code{OFS}, @code{ORS},
+@code{RLENGTH}, @code{RSTART}, @code{RS}, @code{RT}, and @code{SUBSEP},
+are the variables that have special meaning to @code{awk}.
+Changing some of them affects @code{awk}'s running environment.
+Several of these variables are specific to @code{gawk}.
+@xref{Built-in Variables}.
+
+@item Braces
+See ``Curly Braces.''
+
+@item Bulletin Board System
+A computer system allowing users to log in and read and/or leave messages
+for other users of the system, much like leaving paper notes on a bulletin
+board.
+
+@item C
+The system programming language that most GNU software is written in.  The
+@code{awk} programming language has C-like syntax, and this @value{DOCUMENT}
+points out similarities between @code{awk} and C when appropriate.
+
+@cindex ISO 8859-1
+@cindex ISO Latin-1
+@item Character Set
+The set of numeric codes used by a computer system to represent the
+characters (letters, numbers, punctuation, etc.) of a particular country
+or place. The most common character set in use today is ASCII (American
+Standard Code for Information Interchange).  Many European
+countries use an extension of ASCII known as ISO-8859-1 (ISO Latin-1).
+
+@item CHEM
+A preprocessor for @code{pic} that reads descriptions of molecules
+and produces @code{pic} input for drawing them.  It was written in @code{awk}
+by Brian Kernighan and Jon Bentley, and is available from
+@code{@w{netlib@@research.att.com}}.
+
+@item Compound Statement
+A series of @code{awk} statements, enclosed in curly braces.  Compound
+statements may be nested.
+@xref{Statements, ,Control Statements in Actions}.
+
+@item Concatenation
+Concatenating two strings means sticking them together, one after another,
+giving a new string.  For example, the string @samp{foo} concatenated with
+the string @samp{bar} gives the string @samp{foobar}.
+@xref{Concatenation, ,String Concatenation}.
+
+@item Conditional Expression
+An expression using the @samp{?:} ternary operator, such as
+@samp{@var{expr1} ? @var{expr2} : @var{expr3}}.  The expression
+@var{expr1} is evaluated; if the result is true, the value of the whole
+expression is the value of @var{expr2}, otherwise the value is
+@var{expr3}.  In either case, only one of @var{expr2} and @var{expr3}
+is evaluated.  @xref{Conditional Exp, ,Conditional Expressions}.
+
+@item Comparison Expression
+A relation that is either true or false, such as @samp{(a < b)}.
+Comparison expressions are used in @code{if}, @code{while}, @code{do},
+and @code{for}
+statements, and in patterns to select which input records to process.
+@xref{Typing and Comparison, ,Variable Typing and Comparison Expressions}.
+
+@item Curly Braces
+The characters @samp{@{} and @samp{@}}.  Curly braces are used in
+@code{awk} for delimiting actions, compound statements, and function
+bodies.
+
+@item Dark Corner
+An area in the language where specifications often were (or still
+are) not clear, leading to unexpected or undesirable behavior.
+Such areas are marked in this @value{DOCUMENT} with ``(d.c.)'' in the
+text, and are indexed under the heading ``dark corner.''
+
+@item Data Objects
+These are numbers and strings of characters.  Numbers are converted into
+strings and vice versa, as needed.
+@xref{Conversion, ,Conversion of Strings and Numbers}.
+
+@item Double Precision
+An internal representation of numbers that can have fractional parts.
+Double precision numbers keep track of more digits than do single precision
+numbers, but operations on them are more expensive.  This is the way
+@code{awk} stores numeric values.  It is the C type @code{double}.
+
+@item Dynamic Regular Expression
+A dynamic regular expression is a regular expression written as an
+ordinary expression.  It could be a string constant, such as
+@code{"foo"}, but it may also be an expression whose value can vary.
+@xref{Computed Regexps, , Using Dynamic Regexps}.
+
+@item Environment
+A collection of strings, of the form @var{name@code{=}val}, that each
+program has available to it. Users generally place values into the
+environment in order to provide information to various programs. Typical
+examples are the environment variables @code{HOME} and @code{PATH}.
+
+@item Empty String
+See ``Null String.''
+
+@item Escape Sequences
+A special sequence of characters used for describing non-printing
+characters, such as @samp{\n} for newline, or @samp{\033} for the ASCII
+ESC (escape) character.  @xref{Escape Sequences}.
+
+@item Field
+When @code{awk} reads an input record, it splits the record into pieces
+separated by whitespace (or by a separator regexp which you can
+change by setting the built-in variable @code{FS}).  Such pieces are
+called fields.  If the pieces are of fixed length, you can use the built-in
+variable @code{FIELDWIDTHS} to describe their lengths.
+@xref{Field Separators, ,Specifying How Fields are Separated},
+and also
+@ref{Constant Size, , Reading Fixed-width Data}.
+
+@item Floating Point Number
+Often referred to in mathematical terms as a ``rational'' number, this is
+just a number that can have a fractional part.
+See ``Double Precision'' and ``Single Precision.''
+
+@item Format
+Format strings are used to control the appearance of output in the
+@code{printf} statement.  Also, data conversions from numbers to strings
+are controlled by the format string contained in the built-in variable
+@code{CONVFMT}.  @xref{Control Letters, ,Format-Control Letters}.
+
+@item Function
+A specialized group of statements used to encapsulate general
+or program-specific tasks.  @code{awk} has a number of built-in
+functions, and also allows you to define your own.
+@xref{Built-in, ,Built-in Functions},
+and @ref{User-defined, ,User-defined Functions}.
+
+@item FSF
+See ``Free Software Foundation.''
+
+@item Free Software Foundation
+A non-profit organization dedicated
+to the production and distribution of freely distributable software.
+It was founded by Richard M.@: Stallman, the author of the original
+Emacs editor.  GNU Emacs is the most widely used version of Emacs today.
+
+@item @code{gawk}
+The GNU implementation of @code{awk}.
+
+@item General Public License
+This document describes the terms under which @code{gawk} and its source
+code may be distributed (@pxref{Copying, ,GNU GENERAL PUBLIC LICENSE}).
+
+@item GNU
+``GNU's not Unix''.  An on-going project of the Free Software Foundation
+to create a complete, freely distributable, POSIX-compliant computing
+environment.
+
+@item GPL
+See ``General Public License.''
+
+@item Hexadecimal
+Base 16 notation, where the digits are @code{0}-@code{9} and
+@code{A}-@code{F}, with @samp{A}
+representing 10, @samp{B} representing 11, and so on up to @samp{F} for 15.
+Hexadecimal numbers are written in C using a leading @samp{0x},
+to indicate their base.  Thus, @code{0x12} is 18 (one times 16 plus 2).
+
+@item I/O
+Abbreviation for ``Input/Output,'' the act of moving data into and/or
+out of a running program.
+
+@item Input Record
+A single chunk of data read in by @code{awk}.  Usually, an @code{awk} input
+record consists of one line of text.
+@xref{Records, ,How Input is Split into Records}.
+
+@item Integer
+A whole number, i.e.@: a number that does not have a fractional part.
+
+@item Keyword
+In the @code{awk} language, a keyword is a word that has special
+meaning.  Keywords are reserved and may not be used as variable names.
+
+@code{gawk}'s keywords are:
+@code{BEGIN},
+@code{END},
+@code{if},
+@code{else},
+@code{while},
+@code{do@dots{}while},
+@code{for},
+@code{for@dots{}in},
+@code{break},
+@code{continue},
+@code{delete},
+@code{next},
+@code{nextfile},
+@code{function},
+@code{func},
+and @code{exit}.
+
+@item Logical Expression
+An expression using the operators for logic, AND, OR, and NOT, written
+@samp{&&}, @samp{||}, and @samp{!} in @code{awk}. Often called Boolean
+expressions, after the mathematician who pioneered this kind of
+mathematical logic.
+
+@item Lvalue
+An expression that can appear on the left side of an assignment
+operator.  In most languages, lvalues can be variables or array
+elements.  In @code{awk}, a field designator can also be used as an
+lvalue.
+
+@item Null String
+A string with no characters in it.  It is represented explicitly in
+@code{awk} programs by placing two double-quote characters next to
+each other (@code{""}).  It can appear in input data by having two successive
+occurrences of the field separator appear next to each other.
+
+@item Number
+A numeric valued data object.  The @code{gawk} implementation uses double
+precision floating point to represent numbers.
+Very old @code{awk} implementations use single precision floating
+point.
+
+@item Octal
+Base-eight notation, where the digits are @code{0}-@code{7}.
+Octal numbers are written in C using a leading @samp{0},
+to indicate their base.  Thus, @code{013} is 11 (one times 8 plus 3).
+
+@item Pattern
+Patterns tell @code{awk} which input records are interesting to which
+rules.
+
+A pattern is an arbitrary conditional expression against which input is
+tested.  If the condition is satisfied, the pattern is said to @dfn{match}
+the input record.  A typical pattern might compare the input record against
+a regular expression.  @xref{Pattern Overview, ,Pattern Elements}.
+
+@item POSIX
+The name for a series of standards being developed by the IEEE
+that specify a Portable Operating System interface.  The ``IX'' denotes
+the Unix heritage of these standards.  The main standard of interest for
+@code{awk} users is
+@cite{IEEE Standard for Information Technology, Standard 1003.2-1992,
+Portable Operating System Interface (POSIX) Part 2: Shell and Utilities}.
+Informally, this standard is often referred to as simply ``P1003.2.''
+
+@item Private
+Variables and/or functions that are meant for use exclusively by library
+functions, and not for the main @code{awk} program. Special care must be
+taken when naming such variables and functions.
+@xref{Library Names,  ,  Naming Library Function Global Variables}.
+
+@item Range (of input lines)
+A sequence of consecutive lines from the input file.  A pattern
+can specify ranges of input lines for @code{awk} to process, or it can
+specify single lines.  @xref{Pattern Overview, ,Pattern Elements}.
+
+@item Recursion
+When a function calls itself, either directly or indirectly.
+If this isn't clear, refer to the entry for ``recursion.''
+
+@item Redirection
+Redirection means performing input from other than the standard input
+stream, or output to other than the standard output stream.
+
+You can redirect the output of the @code{print} and @code{printf} statements
+to a file or a system command, using the @samp{>}, @samp{>>}, and @samp{|}
+operators.  You can redirect input to the @code{getline} statement using
+the @samp{<} and @samp{|} operators.
+@xref{Redirection, ,Redirecting Output of @code{print} and @code{printf}},
+and @ref{Getline, ,Explicit Input with @code{getline}}.
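+
+As a rough sketch of the operators named above (the function name
+@code{redirection_demo} and the scratch file name are made up for this
+illustration, and a @code{sort} utility is assumed to be available), the
+following shell fragment runs an @code{awk} program that uses each kind
+of redirection once:
+

```shell
# Hypothetical demo: exercises >, >>, < and | from a single awk program.
redirection_demo() {
    tmp="${TMPDIR:-/tmp}/redirection_demo.$$"   # scratch file (hypothetical name)
    awk -v out="$tmp" 'BEGIN {
        print "banana" > out        # ">" creates or truncates the file
        close(out)
        print "apple" >> out        # ">>" appends to the existing file
        close(out)
        while ((getline line < out) > 0)    # "<" reads records from the file
            print line | "sort"             # "|" pipes output into a command
        close(out)
        "echo cherry" | getline fruit       # "|" also feeds command output to getline
        print fruit | "sort"
        close("sort")                       # sort emits its output when the pipe closes
    }'
    rm -f "$tmp"
}
redirection_demo
```

+
+Closing each file or pipe with @code{close} before reusing it keeps the
+order of reads and writes predictable.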
+
+@item Regexp
+Short for @dfn{regular expression}.  A regexp is a pattern that denotes a
+set of strings, possibly an infinite set.  For example, the regexp
+@samp{R.*xp} matches any string starting with the letter @samp{R}
+and ending with the letters @samp{xp}.  In @code{awk}, regexps are
+used in patterns and in conditional expressions.  Regexps may contain
+escape sequences.  @xref{Regexp, ,Regular Expressions}.
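+
+As an informal illustration of the example regexp (the function name
+@code{regexp_demo} and the sample records are made up for this sketch),
+the standard @code{match} function can show which part of each record
+@samp{R.*xp} matches:
+

```shell
# Hypothetical demo: report the text matched by R.*xp in each input record.
regexp_demo() {
    printf '%s\n' 'Regexp' 'a Rxp b' 'regexp' |
    awk '{
        # match() sets RSTART and RLENGTH to the position and length of the match
        if (match($0, /R.*xp/))
            print $0 ": matched [" substr($0, RSTART, RLENGTH) "]"
        else
            print $0 ": no match"
    }'
}
regexp_demo
```

+
+Because the regexp is not anchored, it can match in the middle of a
+record; writing @samp{^R.*xp$} instead would require the whole record to
+start with @samp{R} and end with @samp{xp}.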
+
+@item Regular Expression
+See ``regexp.''
+
+@item Regular Expression Constant
+A regular expression constant is a regular expression written within
+slashes, such as @code{/foo/}.  This regular expression is chosen
+when you write the @code{awk} program, and cannot be changed during
+its execution.  @xref{Regexp Usage, ,How to Use Regular Expressions}.
+
+@item Rule
+A segment of an @code{awk} program that specifies how to process single
+input records.  A rule consists of a @dfn{pattern} and an @dfn{action}.
+@code{awk} reads an input record; then, for each rule, if the input record
+satisfies the rule's pattern, @code{awk} executes the rule's action.
+Otherwise, the rule does nothing for that input record.
+
+@item Rvalue
+A value that can appear on the right side of an assignment operator.
+In @code{awk}, essentially every expression has a value. These values
+are rvalues.
+
+@item @code{sed}
+See ``Stream Editor.''
+
+@item Short-Circuit
+The nature of the @code{awk} logical operators @samp{&&} and @samp{||}.
+If the value of the entire expression can be deduced from evaluating just
+the left-hand side of these operators, the right-hand side will not
+be evaluated
+(@pxref{Boolean Ops, ,Boolean Expressions}).
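+
+A small sketch of this behavior follows. The function @code{bump} is
+hypothetical: it counts its own calls, so that a skipped right-hand side
+shows up as a missing call.
+

```shell
# Hypothetical demo: bump() counts how many times it is actually called.
short_circuit_demo() {
    awk '
    function bump() { calls++; return 1 }
    BEGIN {
        calls = 0
        if (0 && bump()) print "unreachable"   # left side false: bump() is skipped
        if (1 || bump()) print "taken"         # left side true: bump() is skipped
        if (1 && bump()) print "called"        # left side true: bump() must run
        print "calls=" calls
    }'
}
short_circuit_demo
```

+
+Only the third test needs the right-hand side to decide the result, so
+the counter ends at one.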
+
+@item Side Effect
+A side effect occurs when an expression has an effect aside from merely
+producing a value.  Assignment expressions, increment and decrement
+expressions and function calls have side effects.
+@xref{Assignment Ops, ,Assignment Expressions}.
+
+@item Single Precision
+An internal representation of numbers that can have fractional parts.
+Single precision numbers keep track of fewer digits than do double precision
+numbers, but operations on them are less expensive in terms of CPU time.
+This is the type used by some very old versions of @code{awk} to store
+numeric values.  It is the C type @code{float}.
+
+@item Space
+The character generated by hitting the space bar on the keyboard.
+
+@item Special File
+A file name interpreted internally by @code{gawk}, instead of being handed
+directly to the underlying operating system.  For example, @file{/dev/stderr}.
+@xref{Special Files, ,Special File Names in @code{gawk}}.
+
+@item Stream Editor
+A program that reads records from an input stream and processes them one
+or more at a time.  This is in contrast with batch programs, which may
+expect to read their input files in entirety before starting to do
+anything, and with interactive programs, which require input from the
+user.
+
+@item String
+A datum consisting of a sequence of characters, such as @samp{I am a
+string}.  Constant strings are written with double-quotes in the
+@code{awk} language, and may contain escape sequences.
+@xref{Escape Sequences}.
+
+@item Tab
+The character generated by hitting the @kbd{TAB} key on the keyboard.
+It usually expands to up to eight spaces upon output.
+
+@item Unix
+A computer operating system originally developed in the early 1970's at
+AT&T Bell Laboratories.  It initially became popular in universities around
+the world, and later moved into commercial environments as a software
+development system and network server system. There are many commercial
+versions of Unix, as well as several work-alike systems whose source code
+is freely available (such as Linux, NetBSD, and FreeBSD).
+
+@item Whitespace
+A sequence of space or tab characters occurring inside an input record or a
+string.
+@end table
+
+@node Copying, Index, Glossary, Top
+@unnumbered GNU GENERAL PUBLIC LICENSE
+@center Version 2, June 1991
+
+@display
+Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc.
+59 Temple Place --- Suite 330, Boston, MA 02111-1307, USA
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+@end display
+
+@c fakenode --- for prepinfo
+@unnumberedsec Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software---to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+@iftex
+@c fakenode --- for prepinfo
+@unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+@end iftex
+@ifinfo
+@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+@end ifinfo
+
+@enumerate 0
+@item
+This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The ``Program'', below,
+refers to any such program or work, and a ``work based on the Program''
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term ``modification''.)  Each licensee is addressed as ``you''.
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+@item
+You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+@item
+You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+@enumerate a
+@item
+You must cause the modified files to carry prominent notices
+stating that you changed the files and the date of any change.
+
+@item
+You must cause any work that you distribute or publish, that in
+whole or in part contains or is derived from the Program or any
+part thereof, to be licensed as a whole at no charge to all third
+parties under the terms of this License.
+
+@item
+If the modified program normally reads commands interactively
+when run, you must cause it, when started running for such
+interactive use in the most ordinary way, to print or display an
+announcement including an appropriate copyright notice and a
+notice that there is no warranty (or else, saying that you provide
+a warranty) and that users may redistribute the program under
+these conditions, and telling the user how to view a copy of this
+License.  (Exception: if the Program itself is interactive but
+does not normally print such an announcement, your work based on
+the Program is not required to print an announcement.)
+@end enumerate
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+@item
+You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+@enumerate a
+@item
+Accompany it with the complete corresponding machine-readable
+source code, which must be distributed under the terms of Sections
+1 and 2 above on a medium customarily used for software interchange; or,
+
+@item
+Accompany it with a written offer, valid for at least three
+years, to give any third party, for a charge no more than your
+cost of physically performing source distribution, a complete
+machine-readable copy of the corresponding source code, to be
+distributed under the terms of Sections 1 and 2 above on a medium
+customarily used for software interchange; or,
+
+@item
+Accompany it with the information you received as to the offer
+to distribute corresponding source code.  (This alternative is
+allowed only for non-commercial distribution and only if you
+received the program in object code or executable form with such
+an offer, in accord with Subsection b above.)
+@end enumerate
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+@item
+You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+@item
+You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+@item
+Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+@item
+If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+@item
+If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+@item
+The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and ``any
+later version'', you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+@item
+If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+@iftex
+@c fakenode --- for prepinfo
+@heading NO WARRANTY
+@end iftex
+@ifinfo
+@center NO WARRANTY
+@end ifinfo
+
+@item
+BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW@.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE@.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU@.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+@item
+IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+@end enumerate
+
+@iftex
+@c fakenode --- for prepinfo
+@heading END OF TERMS AND CONDITIONS
+@end iftex
+@ifinfo
+@center END OF TERMS AND CONDITIONS
+@end ifinfo
+
+@page
+@c fakenode --- for prepinfo
+@unnumberedsec How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the ``copyright'' line and a pointer to where the full notice is found.
+
+@smallexample
+@var{one line to give the program's name and an idea of what it does.}
+Copyright (C) 19@var{yy}  @var{name of author}
+
+This program is free software; you can redistribute it and/or
+modify it under the terms of the GNU General Public License
+as published by the Free Software Foundation; either version 2
+of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE@.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place --- Suite 330, Boston, MA 02111-1307, USA.
+@end smallexample
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+@smallexample
+Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
+Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
+type `show w'.  This is free software, and you are welcome
+to redistribute it under certain conditions; type `show c' 
+for details.
+@end smallexample
+
+The hypothetical commands @samp{show w} and @samp{show c} should show
+the appropriate parts of the General Public License.  Of course, the
+commands you use may be called something other than @samp{show w} and
+@samp{show c}; they could even be mouse-clicks or menu items---whatever
+suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a ``copyright disclaimer'' for the program, if
+necessary.  Here is a sample; alter the names:
+
+@smallexample
+@group
+Yoyodyne, Inc., hereby disclaims all copyright
+interest in the program `Gnomovision'
+(which makes passes at compilers) written 
+by James Hacker.
+
+@var{signature of Ty Coon}, 1 April 1989
+Ty Coon, President of Vice
+@end group
+@end smallexample
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Library General
+Public License instead of this License.
+
+@node Index, , Copying, Top
+@unnumbered Index
+@printindex cp
+
+@summarycontents
+@contents
+@bye
+
+Unresolved Issues:
+------------------
+1. From ADR.
+
+   Robert J. Chassell points out that awk programs should have some indication
+   of how to use them.  It would be useful to perhaps have a "programming
+   style" section of the manual that would include this and other tips.
+
+2. The default AWKPATH search path should be configurable via `configure'.
+   The default and how this changes needs to be documented.
+
+Consistency issues:
+	/.../ regexps are in @code, not @samp
+	".." strings are in @code, not @samp
+	no @print before @dots
+	values of expressions in the text (@code{x} has the value 15),
+		should be in roman, not @code
+	Use   tab   and not   TAB
+	Use   ESC   and not   ESCAPE
+	Use   space and not   blank	to describe the space bar's character
+	The term "blank" is thus basically reserved for "blank lines" etc.
+	The `(d.c.)' should appear inside the closing `.' of a sentence
+		It should come before (pxref{...})
+	" " should have an @w{} around it
+	Use "non-" everywhere
+	Use @code{ftp} when talking about anonymous ftp
+	Use upper-case and lower-case, not "upper case" and "lower case"
+	Use alphanumeric, not alpha-numeric
+	Use --foo, not -Wfoo when describing long options
+	Use findex for all programs and functions in the example chapters
+	Use "Bell Labs" or "AT&T Bell Laboratories", but not
+		"AT&T Bell Labs".
+	Use "behavior" instead of "behaviour".
+	Use "zeros" instead of "zeroes".
+	Use "Input/Output", not "input/output". Also "I/O", not "i/o".
+	Use @code{do}, and not @code{do}-@code{while}, except where
+		actually discussing the do-while.
+	The words "a", "and", "as", "between", "for", "from", "in", "of",
+		"on", "that", "the", "to", "with", and "without",
+		should not be capitalized in @chapter, @section etc.
+		"Into" and "How" should.
+	Search for @dfn; make sure important items are also indexed.
+	"e.g." should always be followed by a comma.
+	"i.e." should never be followed by a comma, and should be followed
+		by `@:'.
+	The numbers zero through ten should be spelled out, except when
+		talking about file descriptor numbers. > 10 and < 0, it's
+		ok to use numbers.
+	In tables, put command line options in @code, while in the text,
+		put them in @samp.
+	When using @strong, use "Note:" or "Caution:" with colons and
+		not exclamation points.  Do not surround the paragraphs
+		with @quotation ... @end quotation.
+
+Date: Wed, 13 Apr 94 15:20:52 -0400
+From: rsm@gnu.ai.mit.edu (Richard Stallman)
+To: gnu-prog@gnu.ai.mit.edu
+Subject: A reminder: no pathnames in GNU
+
+It's a GNU convention to use the term "file name" for the name of a
+file, never "pathname".  We use the term "path" for search paths,
+which are lists of file names.  Using it for a single file name as
+well is potentially confusing to users.
+
+So please check any documentation you maintain, if you think you might
+have used "pathname".
+
+Note that "file name" should be two words when it appears as ordinary
+text.  It's ok as one word when it's a metasyntactic variable, though.
+
+Suggestions:
+------------
+Enhance FIELDWIDTHS with some way to indicate "the rest of the record".
+E.g., a length of 0 or -1 or something.  Maybe "n"?
+
+Make FIELDWIDTHS be an array?
+
+What if FIELDWIDTHS has invalid values in it?