Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- doc/gawk.texi | 423
1 files changed, 296 insertions, 127 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 30ba377b..aa9e3ee4 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -2070,11 +2070,11 @@ $ @kbd{awk "BEGIN @{ print \"Don't Panic!\" @}"}
@print{} Don't Panic!
@end example
-@cindex quoting
-@cindex double quote (@code{"})
-@cindex @code{"} (double quote)
-@cindex @code{\} (backslash)
-@cindex backslash (@code{\})
+@cindex shell quoting, double quote
+@cindex double quote (@code{"}) in shell commands
+@cindex @code{"} (double quote) in shell commands
+@cindex @code{\} (backslash) in shell commands
+@cindex backslash (@code{\}) in shell commands
This program does not read any input. The @samp{\} before each of the
inner double quotes is necessary because of the shell's quoting
rules---in particular because it mixes both single quotes and
@@ -2114,8 +2114,7 @@ awk -f @var{source-file} @var{input-file1} @var{input-file2} @dots{}
@end example
@cindex @option{-f} option
-@cindex command line, options
-@cindex options, command-line
+@cindex command line, option @option{-f}
The @option{-f} instructs the @command{awk} utility to get the @command{awk} program
from the file @var{source-file}. Any file name can be used for
@var{source-file}. For example, you could put the program:
@@ -2138,7 +2137,7 @@ does the same thing as this one:
awk "BEGIN @{ print \"Don't Panic!\" @}"
@end example
-@cindex quoting
+@cindex quoting in @command{gawk} command lines
@noindent
This was explained earlier
(@pxref{Read Terminal}).
@@ -2149,9 +2148,9 @@ program did not have single quotes around it. The quotes are only needed
for programs that are provided on the @command{awk} command line.
@c STARTOFRANGE sq1x
-@cindex single quote (@code{'})
+@cindex single quote (@code{'}) in @command{gawk} command lines
@c STARTOFRANGE qs2x
-@cindex @code{'} (single quote)
+@cindex @code{'} (single quote) in @command{gawk} command lines
If you want to clearly identify your @command{awk} program files as such,
you can add the extension @file{.awk} to the file name. This doesn't
affect the execution of the @command{awk} program but it does make
@@ -2300,7 +2299,7 @@ programs, but this usually isn't very useful; the purpose of a
comment is to help you or another person understand the program
when reading it at a later time.
-@cindex quoting
+@cindex quoting, for small awk programs
@cindex single quote (@code{'}), vs.@: apostrophe
@cindex @code{'} (single quote), vs.@: apostrophe
@quotation CAUTION
@@ -2341,7 +2340,7 @@ The next @value{SUBSECTION} describes the shell's quoting rules.
@node Quoting
@subsection Shell-Quoting Issues
-@cindex quoting, rules for
+@cindex shell quoting, rules for
@menu
* DOS Quoting:: Quoting in Windows Batch Files.
@@ -2376,10 +2375,10 @@ that character. The shell removes the backslash and passes the quoted
character on to the command.
@item
-@cindex @code{\} (backslash)
-@cindex backslash (@code{\})
-@cindex single quote (@code{'})
-@cindex @code{'} (single quote)
+@cindex @code{\} (backslash), in shell commands
+@cindex backslash (@code{\}), in shell commands
+@cindex single quote (@code{'}), in shell commands
+@cindex @code{'} (single quote), in shell commands
Single quotes protect everything between the opening and closing quotes.
The shell does no interpretation of the quoted text, passing it on verbatim
to the command.
@@ -2389,8 +2388,8 @@ Refer back to
for an example of what happens if you try.
@item
-@cindex double quote (@code{"})
-@cindex @code{"} (double quote)
+@cindex double quote (@code{"}), in shell commands
+@cindex @code{"} (double quote), in shell commands
Double quotes protect most things between the opening and closing quotes.
The shell does at least variable and command substitution on the quoted text.
Different shells may do additional kinds of processing on double-quoted text.
@@ -2427,7 +2426,7 @@ awk -F "" '@var{program}' @var{files} # correct
@end example
@noindent
-@cindex null strings, quoting and
+@cindex null strings in @command{gawk} arguments, quoting and
Don't use this:
@example
@@ -2440,7 +2439,7 @@ as the value of @code{FS}, and the first file name as the text of the program!
This results in syntax errors at best, and confusing behavior at worst.
@end itemize
-@cindex quoting, tricks for
+@cindex quoting in @command{gawk} command lines, tricks for
Mixing single and double quotes is difficult. You have to resort
to shell quoting tricks, like this:
@@ -3322,6 +3321,7 @@ Print the short version of the General Public License and then exit.
@itemx --dump-variables@r{[}=@var{file}@r{]}
@cindex @option{-d} option
@cindex @option{--dump-variables} option
+@cindex dump all variables of a program
@cindex @file{awkvars.out} file
@cindex files, @file{awkvars.out}
@cindex variables, global, printing list of
@@ -3475,7 +3475,7 @@ care to search for all occurrences of each inappropriate construct. As
@cindex @option{--bignum} option
Force arbitrary precision arithmetic on numbers. This option has no effect
if @command{gawk} is not compiled to use the GNU MPFR and MP libraries
-(@pxref{Arbitrary Precision Arithmetic}).
+(@pxref{Gawk and MPFR}).
@item -n
@itemx --non-decimal-data
@@ -3728,6 +3728,7 @@ file at all.
@cindex @command{gawk}, @code{ARGIND} variable in
@cindex @code{ARGIND} variable, command-line arguments
+@cindex @code{ARGV} array, indexing into
@cindex @code{ARGC}/@code{ARGV} variables, command-line arguments
All these arguments are made available to your @command{awk} program in the
@code{ARGV} array (@pxref{Built-in Variables}). Command-line options
@@ -3738,6 +3739,7 @@ sets the variable @code{ARGIND} to the index in @code{ARGV} of the
current element.
@cindex input files, variable assignments and
+@cindex variable assignments and input files
The distinction between file name arguments and variable-assignment
arguments is made when @command{awk} is about to open the next input file.
At that point in execution, it checks the file name to see whether
@@ -3815,6 +3817,7 @@ this file name itself.)
@node Environment Variables
@section The Environment Variables @command{gawk} Uses
+@cindex environment variables used by @command{gawk}
A number of environment variables influence how @command{gawk}
behaves.
@@ -3830,8 +3833,7 @@ behaves.
@node AWKPATH Variable
@subsection The @env{AWKPATH} Environment Variable
@cindex @env{AWKPATH} environment variable
-@cindex directories, searching
-@cindex search paths
+@cindex directories, searching for source files
@cindex search paths, for source files
@cindex differences in @command{awk} and @command{gawk}, @code{AWKPATH} environment variable
@ifinfo
@@ -3843,12 +3845,12 @@ implementations, you must supply a precise path name for each program
file, unless the file is in the current directory.
But in @command{gawk}, if the file name supplied to the @option{-f}
or @option{-i} options
-does not contain a @samp{/}, then @command{gawk} searches a list of
+does not contain a directory separator @samp{/}, then @command{gawk} searches a list of
directories (called the @dfn{search path}), one by one, looking for a
file with the specified name.
The search path is a string consisting of directory names
-separated by colons. @command{gawk} gets its search path from the
+separated by colons@footnote{Semicolons on MS-Windows and MS-DOS.}. @command{gawk} gets its search path from the
@env{AWKPATH} environment variable. If that variable does not exist,
@command{gawk} uses a default path,
@samp{.:/usr/local/share/awk}.@footnote{Your version of @command{gawk}
@@ -3906,8 +3908,7 @@ found, and @command{gawk} no longer needs to use @env{AWKPATH}.
@node AWKLIBPATH Variable
@subsection The @env{AWKLIBPATH} Environment Variable
@cindex @env{AWKLIBPATH} environment variable
-@cindex directories, searching
-@cindex search paths
+@cindex directories, searching for shared libraries
@cindex search paths, for shared libraries
@cindex differences in @command{awk} and @command{gawk}, @code{AWKLIBPATH} environment variable
@@ -4192,7 +4193,6 @@ they will @emph{not} be in the next release).
@c update this section for each release!
-@cindex @code{PROCINFO} array
The process-related special files @file{/dev/pid}, @file{/dev/ppid},
@file{/dev/pgrpid}, and @file{/dev/user} were deprecated in @command{gawk}
3.1, but still worked. As of version 4.0, they are no longer
@@ -4277,7 +4277,7 @@ long-undocumented ``feature'' of Unix @code{awk}.
@node Regexp
@chapter Regular Expressions
-@cindex regexp, See regular expressions
+@cindex regexp
@c STARTOFRANGE regexp
@cindex regular expressions
@@ -4286,8 +4286,8 @@ set of strings.
Because regular expressions are such a fundamental part of @command{awk}
programming, their format and use deserve a separate @value{CHAPTER}.
-@cindex forward slash (@code{/})
-@cindex @code{/} (forward slash)
+@cindex forward slash (@code{/}) to enclose regular expressions
+@cindex @code{/} (forward slash) to enclose regular expressions
A regular expression enclosed in slashes (@samp{/})
is an @command{awk} pattern that matches every input record whose text
belongs to that set.
@@ -4343,9 +4343,9 @@ $ @kbd{awk '/li/ @{ print $2 @}' mail-list}
@cindex @code{!} (exclamation point), @code{!~} operator
@cindex exclamation point (@code{!}), @code{!~} operator
@c @cindex operators, @code{!~}
-@cindex @code{if} statement
-@cindex @code{while} statement
-@cindex @code{do}-@code{while} statement
+@cindex @code{if} statement, use of regexps in
+@cindex @code{while} statement, use of regexps in
+@cindex @code{do}-@code{while} statement, use of regexps in
@c @cindex statements, @code{if}
@c @cindex statements, @code{while}
@c @cindex statements, @code{do}
@@ -4404,6 +4404,7 @@ $ @kbd{awk '$1 !~ /J/' inventory-shipped}
@end example
@cindex regexp constants
+@cindex constant regexps
@cindex regular expressions, constants, See regexp constants
When a regexp is enclosed in slashes, such as @code{/foo/}, we call it
a @dfn{regexp constant}, much like @code{5.27} is a numeric constant and
@@ -4412,7 +4413,7 @@ a @dfn{regexp constant}, much like @code{5.27} is a numeric constant and
@node Escape Sequences
@section Escape Sequences
-@cindex escape sequences
+@cindex escape sequences, in strings
@cindex backslash (@code{\}), in escape sequences
@cindex @code{\} (backslash), in escape sequences
Some characters cannot be included literally in string constants
@@ -4706,6 +4707,7 @@ escape sequences literally when used in regexp constants. Thus,
@section Regular Expression Operators
@c STARTOFRANGE regexpo
@cindex regular expressions, operators
+@cindex metacharacters in regular expressions
You can combine regular expressions with special characters,
called @dfn{regular expression operators} or @dfn{metacharacters}, to
@@ -4724,8 +4726,8 @@ Here is a list of metacharacters. All characters that are not escape
sequences and that are not listed in the table stand for themselves:
@table @code
-@cindex backslash (@code{\})
-@cindex @code{\} (backslash)
+@cindex backslash (@code{\}), regexp operator
+@cindex @code{\} (backslash), regexp operator
@item \
This is used to suppress the special meaning of a character when
matching. For example, @samp{\$}
@@ -4763,8 +4765,8 @@ The condition in the following example is not true:
if ("line1\nLINE 2" ~ /1$/) @dots{}
@end example
-@cindex @code{.} (period)
-@cindex period (@code{.})
+@cindex @code{.} (period), regexp operator
+@cindex period (@code{.}), regexp operator
@item . @r{(period)}
This matches any single character,
@emph{including} the newline character. For example, @samp{.P}
@@ -4780,8 +4782,8 @@ character, which is a character with all bits equal to zero.
Otherwise, @sc{nul} is just another character. Other versions of @command{awk}
may not be able to match the @sc{nul} character.
-@cindex @code{[]} (square brackets)
-@cindex square brackets (@code{[]})
+@cindex @code{[]} (square brackets), regexp operator
+@cindex square brackets (@code{[]}), regexp operator
@cindex bracket expressions
@cindex character sets, See Also bracket expressions
@cindex character lists, See bracket expressions
@@ -4868,7 +4870,7 @@ This symbol is similar to @samp{*}, except that the preceding expression can be
matched either once or not at all. For example, @samp{fe?d}
matches @samp{fed} and @samp{fd}, but nothing else.
-@cindex interval expressions
+@cindex interval expressions, regexp operator
@item @{@var{n}@}
@itemx @{@var{n},@}
@itemx @{@var{n},@var{m}@}
@@ -4945,6 +4947,7 @@ expressions are not available in regular expressions.
@cindex bracket expressions
@cindex bracket expressions, range expressions
@cindex range expressions (regexps)
+@cindex character lists in regular expression
As mentioned earlier, a bracket expression matches any character amongst
those listed between the opening and closing square brackets.
@@ -5208,7 +5211,7 @@ lesser of two evils.
@c
@c Should really do this with file inclusion.
@cindex regular expressions, @command{gawk}, command-line options
-@cindex @command{gawk}, command-line options
+@cindex @command{gawk}, command-line options, and regular expressions
The various command-line options
(@pxref{Options})
control how @command{gawk} interprets characters in regexps:
@@ -5287,7 +5290,7 @@ This works in any POSIX-compliant @command{awk}.
@cindex tilde (@code{~}), @code{~} operator
@cindex @code{!} (exclamation point), @code{!~} operator
@cindex exclamation point (@code{!}), @code{!~} operator
-@cindex @code{IGNORECASE} variable
+@cindex @code{IGNORECASE} variable, @code{~} and @code{!~} operators
@cindex @command{gawk}, @code{IGNORECASE} variable in
@c @cindex variables, @code{IGNORECASE}
Another method, specific to @command{gawk}, is to set the variable
@@ -5552,6 +5555,7 @@ occur often in practice, but it's worth noting for future reference.
@chapter Reading Input Files
@c STARTOFRANGE infir
+@cindex reading input files
@cindex input files, reading
@cindex input files
@cindex @code{FILENAME} variable
@@ -5638,7 +5642,6 @@ To do this, use the special @code{BEGIN} pattern
(@pxref{BEGIN/END}).
For example:
-@cindex @code{BEGIN} pattern
@example
awk 'BEGIN @{ RS = "u" @}
@{ print $0 @}' mail-list
@@ -5754,6 +5757,7 @@ Reaching the end of an input file terminates the current input record,
even if the last character in the file is not the character in @code{RS}.
@value{DARKCORNER}
+@cindex empty strings
@cindex null strings
@cindex strings, empty, See null strings
The empty string @code{""} (a string without any characters)
@@ -5890,7 +5894,7 @@ character as a record separator. However, this is a special case:
@command{mawk} does not allow embedded @sc{nul} characters in strings.
@cindex records, treating files as
-@cindex files, as single records
+@cindex treating files, as single records
The best way to treat a whole file as a single record is to
simply read the file in, one record at a time, concatenating each
record onto the end of the previous ones.
@@ -5941,7 +5945,7 @@ character as a record separator. However, this is a special case:
@command{mawk} does not allow embedded @sc{nul} characters in strings.
@cindex records, treating files as
-@cindex files, as single records
+@cindex treating files, as single records
The best way to treat a whole file as a single record is to
simply read the file in, one record at a time, concatenating each
record onto the end of the previous ones.
@@ -6588,10 +6592,8 @@ behaves this way.
@node Command Line Field Separator
@subsection Setting @code{FS} from the Command Line
-@cindex @option{-F} option
-@cindex options, command-line
-@cindex command line, options
-@cindex field separators, on command line
+@cindex @option{-F} option, command line
+@cindex field separator, on command line
@cindex command line, @code{FS} on@comma{} setting
@cindex @code{FS} variable, setting from command line
@@ -6756,7 +6758,7 @@ POSIX standard.)
@cindex POSIX @command{awk}, field separators and
-@cindex field separators, POSIX and
+@cindex field separator, POSIX and
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
@@ -6809,7 +6811,7 @@ root:nSijPlPhZZwgE:0:0:Root:/:
@cindex POSIX @command{awk}, field separators and
-@cindex field separators, POSIX and
+@cindex field separator, POSIX and
According to the POSIX standard, @command{awk} is supposed to behave
as if each record is split into fields at the time it is read.
In particular, this means that if you change the value of @code{FS}
@@ -7169,6 +7171,7 @@ available for splitting regular strings (@pxref{String Functions}).
@node Multiple Line
@section Multiple-Line Records
+@cindex multiple-line records
@c STARTOFRANGE recm
@cindex records, multiline
@c STARTOFRANGE imr
@@ -7220,7 +7223,8 @@ after the last record, the final newline is removed from the record.
In the second case, this special processing is not done.
@value{DARKCORNER}
-@cindex field separators, in multiline records
+@cindex field separator, in multiline records
+@cindex @code{FS}, in multiline records
Now that the input is separated into records, the second step is to
separate the fields in the record. One way to do this is to divide each
of the lines into fields in the normal manner. This happens by default
@@ -7368,7 +7372,7 @@ and study the @code{getline} command @emph{after} you have reviewed the
rest of this @value{DOCUMENT} and have a good knowledge of how @command{awk} works.
@cindex @command{gawk}, @code{ERRNO} variable in
-@cindex @code{ERRNO} variable
+@cindex @code{ERRNO} variable, with @command{getline} command
@cindex differences in @command{awk} and @command{gawk}, @code{getline} command
@cindex @code{getline} command, return values
@cindex @option{--sandbox} option, input redirection with @code{getline}
@@ -7464,6 +7468,7 @@ rule in the program. @xref{Next Statement}.
@node Getline/Variable
@subsection Using @code{getline} into a Variable
+@cindex @code{getline} into a variable
@cindex variables, @code{getline} command into@comma{} using
You can use @samp{getline @var{var}} to read the next record from
@@ -7515,6 +7520,7 @@ the value of @code{NF} do not change.
@node Getline/File
@subsection Using @code{getline} from a File
+@cindex @code{getline} from a file
@cindex input redirection
@cindex redirection of input
@cindex @code{<} (left angle bracket), @code{<} operator (I/O)
@@ -7563,8 +7569,6 @@ from the file
@var{file}, and put it in the variable @var{var}. As above, @var{file}
is a string-valued expression that specifies the file from which to read.
-@cindex @command{gawk}, @code{RT} variable in
-@cindex @code{RT} variable
In this version of @code{getline}, none of the built-in variables are
changed and the record is not split into fields. The only variable
changed is @var{var}.@footnote{This is not quite true. @code{RT} could
@@ -7589,7 +7593,6 @@ Note here how the name of the extra input file is not built into
the program; it is taken directly from the data, specifically from the second field on
the @samp{@@include} line.
-@cindex @code{close()} function
The @code{close()} function is called to ensure that if two identical
@samp{@@include} lines appear in the input, the entire specified file is
included twice.
@@ -7616,7 +7619,7 @@ Failing that, attention to details would be useful.}
@cindex @code{|} (vertical bar), @code{|} operator (I/O)
@cindex vertical bar (@code{|}), @code{|} operator (I/O)
@cindex input pipeline
-@cindex pipes, input
+@cindex pipe, input
@cindex operators, input/output
The output of a command can also be piped into @code{getline}, using
@samp{@var{command} | getline}. In
@@ -7640,7 +7643,6 @@ produced by running the rest of the line as a shell command:
@end example
@noindent
-@cindex @code{close()} function
The @code{close()} function is called to ensure that if two identical
@samp{@@execute} lines appear in the input, the command is run for
each one.
@@ -8917,7 +8919,7 @@ appended to the file.
If @var{output-file} does not exist, then it is created.
@cindex @code{|} (vertical bar), @code{|} operator (I/O)
-@cindex pipes, output
+@cindex pipe, output
@cindex output, pipes
@item print @var{items} | @var{command}
It is possible to send output to another program through a pipe
@@ -9292,7 +9294,7 @@ Doing so results in unpredictable behavior.
@c STARTOFRANGE ofc
@cindex output, files@comma{} closing
@c STARTOFRANGE pc
-@cindex pipes, closing
+@cindex pipe, closing
@c STARTOFRANGE cc
@cindex coprocesses, closing
@cindex @code{getline} command, coprocesses@comma{} using from
@@ -9395,6 +9397,7 @@ a separate message.
@cindex differences in @command{awk} and @command{gawk}, @code{close()} function
@cindex portability, @code{close()} function and
+@cindex @code{close()} function, portability
If you use more files than the system allows you to have open,
@command{gawk} attempts to multiplex the available open files among
your data files. @command{gawk}'s ability to do this depends upon the
@@ -9479,7 +9482,7 @@ retval = close(command) # syntax error in many Unix awks
@end example
@cindex @command{gawk}, @code{ERRNO} variable in
-@cindex @code{ERRNO} variable
+@cindex @code{ERRNO} variable, with @command{close()} function
@command{gawk} treats @code{close()} as a function.
The return value is @minus{}1 if the argument names something
that was never opened with a redirection, or if there is
@@ -9535,7 +9538,7 @@ retval = close(command) # syntax error in many Unix awks
@end example
@cindex @command{gawk}, @code{ERRNO} variable in
-@cindex @code{ERRNO} variable
+@cindex @code{ERRNO} variable, with @command{close()} function
@command{gawk} treats @code{close()} as a function.
The return value is @minus{}1 if the argument names something
that was never opened with a redirection, or if there is
@@ -9635,7 +9638,8 @@ have different forms, but are stored identically internally.
@node Scalar Constants
@subsubsection Numeric and String Constants
-@cindex numeric, constants
+@cindex constants, numeric
+@cindex numeric constants
A @dfn{numeric constant} stands for a number. This number can be an
integer, a decimal fraction, or a number in scientific (exponential)
notation.@footnote{The internal representation of all numbers,
@@ -9661,7 +9665,7 @@ double-quotation marks. For example:
@noindent
@cindex differences in @command{awk} and @command{gawk}, strings
-@cindex strings, length of
+@cindex strings, length limitations
represents the string whose contents are @samp{parrot}. Strings in
@command{gawk} can be of any length, and they can contain any of the possible
eight-bit ASCII characters including ASCII @sc{nul} (character code zero).
@@ -12249,6 +12253,8 @@ programmers.
@node Using BEGIN/END
@subsubsection Startup and Cleanup Actions
+@cindex @code{BEGIN} pattern
+@cindex @code{END} pattern
A @code{BEGIN} rule is executed once only, before the first input record
is read. Likewise, an @code{END} rule is executed once only, after all the
input is read. For example:
@@ -12392,7 +12398,7 @@ you can bypass the fatal error and move on to the next file on the
command line.
@cindex @command{gawk}, @code{ERRNO} variable in
-@cindex @code{ERRNO} variable
+@cindex @code{ERRNO} variable, with @code{BEGINFILE} pattern
@cindex @code{nextfile} statement, @code{BEGINFILE}/@code{ENDFILE} patterns and
You do this by checking if the @code{ERRNO} variable is not the empty
string; if so, then @command{gawk} was not able to open the file. In
@@ -13501,8 +13507,8 @@ is to simply say @samp{FS = FS}, perhaps with an explanatory comment.
@cindex @command{gawk}, @code{IGNORECASE} variable in
@cindex @code{IGNORECASE} variable
@cindex differences in @command{awk} and @command{gawk}, @code{IGNORECASE} variable
-@cindex case sensitivity, string comparisons and
-@cindex case sensitivity, regexps and
+@cindex case sensitivity, and string comparisons
+@cindex case sensitivity, and regexps
@cindex regular expressions, case sensitivity
@item IGNORECASE #
If @code{IGNORECASE} is nonzero or non-null, then all string comparisons
@@ -13719,7 +13725,7 @@ or if @command{gawk} is in compatibility mode
it is not special.
@cindex @code{ENVIRON} array
-@cindex environment variables
+@cindex environment variables, in @code{ENVIRON} array
@item ENVIRON
An associative array containing the values of the environment. The array
indices are the environment variable names; the elements are the values of
@@ -13834,10 +13840,12 @@ The following elements (listed alphabetically)
are guaranteed to be available:
@table @code
+@cindex effective group id of @command{gawk} user
@item PROCINFO["egid"]
The value of the @code{getegid()} system call.
@item PROCINFO["euid"]
+@cindex effective user id of @command{gawk} user
The value of the @code{geteuid()} system call.
@item PROCINFO["FS"]
@@ -13847,6 +13855,7 @@ This is
or @code{"FPAT"} if field matching with @code{FPAT} is in effect.
@item PROCINFO["identifiers"]
+@cindex program identifiers
A subarray, indexed by the names of all identifiers used in the
text of the AWK program. For each identifier, the value of the element is one of the following:
@@ -13875,15 +13884,19 @@ after it has finished parsing the program; they are @emph{not} updated
while the program runs.
@item PROCINFO["gid"]
+@cindex group id of @command{gawk} user
The value of the @code{getgid()} system call.
@item PROCINFO["pgrpid"]
+@cindex process group id of @command{gawk} process
The process group ID of the current process.
@item PROCINFO["pid"]
+@cindex process id of @command{gawk} process
The process ID of the current process.
@item PROCINFO["ppid"]
+@cindex parent process id of @command{gawk} process
The parent process ID of the current process.
@item PROCINFO["sorted_in"]
@@ -13903,25 +13916,31 @@ Assigning a new value to this element changes the default.
The value of the @code{getuid()} system call.
@item PROCINFO["version"]
+@cindex version of @command{gawk}
+@cindex @command{gawk} version
The version of @command{gawk}.
@end table
The following additional elements in the array
are available to provide information about the MPFR and GMP libraries
if your version of @command{gawk} supports arbitrary precision numbers
-(@pxref{Arbitrary Precision Arithmetic}):
+(@pxref{Gawk and MPFR}):
@table @code
+@cindex version of GNU MPFR library
@item PROCINFO["mpfr_version"]
The version of the GNU MPFR library.
@item PROCINFO["gmp_version"]
+@cindex version of GNU MP library
The version of the GNU MP library.
@item PROCINFO["prec_max"]
+@cindex maximum precision supported by MPFR library
The maximum precision supported by MPFR.
@item PROCINFO["prec_min"]
+@cindex minimum precision supported by MPFR library
The minimum precision required by MPFR.
@end table
@@ -13932,12 +13951,15 @@ of @command{gawk} supports dynamic loading of extension functions
@table @code
@item PROCINFO["api_major"]
+@cindex version of @command{gawk} extension API
+@cindex extension API, version number
The major version of the extension API.
@item PROCINFO["api_minor"]
The minor version of the extension API.
@end table
+@cindex supplementary groups of @command{gawk} process
On some systems, there may be elements in the array, @code{"group1"}
through @code{"group@var{N}"} for some @var{N}. @var{N} is the number of
supplementary groups that the process has. Use the @code{in} operator
@@ -13945,7 +13967,7 @@ to test for these elements
(@pxref{Reference to Elements}).
@cindex @command{gawk}, @code{PROCINFO} array in
-@cindex @code{PROCINFO} array
+@cindex @code{PROCINFO} array, uses
The @code{PROCINFO} array has the following additional uses:
@itemize @bullet
@@ -14131,7 +14153,7 @@ changed.
@node ARGC and ARGV
@subsection Using @code{ARGC} and @code{ARGV}
-@cindex @code{ARGC}/@code{ARGV} variables
+@cindex @code{ARGC}/@code{ARGV} variables, how to use
@cindex arguments, command-line
@cindex command line, arguments
@@ -14276,7 +14298,7 @@ ability to support true multidimensional arrays.
@cindex variables, names of
@cindex functions, names of
-@cindex arrays, names of
+@cindex arrays, names of, and names of functions/variables
@cindex names, arrays/variables
@cindex namespace issues
@command{awk} maintains a single set
@@ -14452,10 +14474,9 @@ Here, the number @code{1} isn't double-quoted, since @command{awk}
automatically converts it to a string.
@cindex @command{gawk}, @code{IGNORECASE} variable in
-@cindex @code{IGNORECASE} variable
@cindex case sensitivity, array indices and
-@cindex arrays, @code{IGNORECASE} variable and
-@cindex @code{IGNORECASE} variable, array subscripts and
+@cindex arrays, and @code{IGNORECASE} variable
+@cindex @code{IGNORECASE} variable, and array indices
The value of @code{IGNORECASE} has no effect upon array subscripting.
The identical string value used to store an array element must be used
to retrieve it.
@@ -14471,8 +14492,9 @@ is independent of the number of elements in the array.
@node Reference to Elements
@subsection Referring to an Array Element
-@cindex arrays, elements, referencing
-@cindex elements in arrays
+@cindex arrays, referencing elements
+@cindex array members
+@cindex elements of arrays
The principal way to use an array is to refer to one of its elements.
An array reference is an expression as follows:
@@ -14489,11 +14511,16 @@ The value of the array reference is the current value of that array
element. For example, @code{foo[4.3]} is an expression for the element
of array @code{foo} at index @samp{4.3}.
+@cindex arrays, unassigned elements
+@cindex unassigned array elements
+@cindex empty array elements
A reference to an array element that has no recorded value yields a value of
@code{""}, the null string. This includes elements
that have not been assigned any value as well as elements that have been
deleted (@pxref{Delete}).
+@cindex non-existent array elements
+@cindex arrays, elements that don't exist
@quotation NOTE
A reference to an element that does not exist @emph{automatically} creates
that array element, with the null string as its value. (In some cases,
@@ -14513,7 +14540,7 @@ if it didn't exist before!
@end quotation
@c @cindex arrays, @code{in} operator and
-@cindex @code{in} operator
+@cindex @code{in} operator, testing if array element exists
To determine whether an element exists in an array at a certain index, use
the following expression:
@@ -14548,8 +14575,8 @@ if (frequencies[2] != "")
@node Assigning Elements
@subsection Assigning Array Elements
-@cindex arrays, elements, assigning
-@cindex elements in arrays, assigning
+@cindex arrays, elements, assigning values
+@cindex elements in arrays, assigning values
Array elements can be assigned values just like
@command{awk} variables:
@@ -14566,6 +14593,7 @@ assign to that element of the array.
@node Array Example
@subsection Basic Array Example
+@cindex arrays, an example of using
The following program takes a list of lines, each beginning with a line
number, and prints them out in order of line number. The line numbers
@@ -14635,6 +14663,7 @@ END @{
@node Scanning an Array
@subsection Scanning All Elements of an Array
@cindex elements in arrays, scanning
+@cindex scanning arrays
@cindex arrays, scanning
@cindex loops, @code{for}, array scanning
@@ -14653,7 +14682,7 @@ for (@var{var} in @var{array})
@end example
@noindent
-@cindex @code{in} operator
+@cindex @code{in} operator, use in loops
This loop executes @var{body} once for each index in @var{array} that the
program has previously used, with the variable @var{var} set to that index.
@@ -14692,8 +14721,9 @@ END @{
@xref{Word Sorting},
for a more detailed example of this type.
-@cindex arrays, elements, order of
-@cindex elements in arrays, order of
+@cindex arrays, elements, order of access by @code{in} operator
+@cindex elements in arrays, order of access by @code{in} operator
+@cindex @code{in} operator, order of array access
The order in which elements of the array are accessed by this statement
is determined by the internal arrangement of the array elements within
@command{awk} and normally cannot be controlled or changed. This can lead to
@@ -14711,6 +14741,8 @@ determines the order in which the array is traversed.
This order is usually based on the internal implementation of arrays
and will vary from one version of @command{awk} to the next.
+@cindex array scanning order, controlling
+@cindex controlling array scanning order
Often, though, you may wish to do something simple, such as
``traverse the array by comparing the indices in ascending order,''
or ``traverse the array by comparing the values in descending order.''
@@ -14727,6 +14759,7 @@ to use for comparison of array elements. This advanced feature
is described later, in @ref{Array Sorting}.
@end itemize
+@cindex @code{PROCINFO}, values of @code{sorted_in}
The following special values for @code{PROCINFO["sorted_in"]} are available:
@table @code
@@ -14887,7 +14920,7 @@ if (4 in foo)
print "This will never be printed"
@end example
-@cindex null strings, array elements and
+@cindex null strings, and deleting array elements
It is important to note that deleting an element is @emph{not} the
same as assigning it a null value (the empty string, @code{""}).
For example:
@@ -14909,6 +14942,7 @@ is not in the array is deleted.
@cindex extensions, common@comma{} @code{delete} to delete entire arrays
@cindex arrays, deleting entire contents
@cindex deleting entire arrays
+@cindex @code{delete} @var{array}
@cindex differences in @command{awk} and @command{gawk}, array elements, deleting
All the elements of an array may be deleted with a single statement
by leaving off the subscript in the @code{delete} statement,
@@ -14966,9 +15000,9 @@ a = 3
@section Using Numbers to Subscript Arrays
@cindex numbers, as array subscripts
-@cindex arrays, subscripts
+@cindex arrays, numeric subscripts
@cindex subscripts in arrays, numbers as
-@cindex @code{CONVFMT} variable, array subscripts and
+@cindex @code{CONVFMT} variable, and array subscripts
An important aspect to remember about arrays is that @emph{array subscripts
are always strings}. When a numeric value is used as a subscript,
it is converted to a string value before being used for subscripting
@@ -14998,7 +15032,8 @@ string value from @code{xyz}---this time @code{"12.15"}---because the value of
@code{CONVFMT} only allows two significant digits. This test fails,
since @code{"12.15"} is different from @code{"12.153"}.
-@cindex converting, during subscripting
+@cindex converting integer array subscripts
+@cindex integer array indices
According to the rules for conversions
(@pxref{Conversion}), integer
values are always converted to strings as integers, no matter what the
@@ -15104,7 +15139,7 @@ languages, including @command{awk}) to refer to an element of a
two-dimensional array named @code{grid} is with
@code{grid[@var{x},@var{y}]}.
-@cindex @code{SUBSEP} variable, multidimensional arrays
+@cindex @code{SUBSEP} variable, and multidimensional arrays
Multidimensional arrays are supported in @command{awk} through
concatenation of indices into one string.
@command{awk} converts the indices into strings
@@ -15136,6 +15171,7 @@ combined strings that are ambiguous. Suppose that @code{SUBSEP} is
"b@@c"]}} are indistinguishable because both are actually
stored as @samp{foo["a@@b@@c"]}.
+@cindex @code{in} operator, index existence in multidimensional arrays
To test whether a particular index sequence exists in a
multidimensional array, use the same operator (@code{in}) that is
used for single dimensional arrays. Write the whole sequence of indices
@@ -15201,6 +15237,7 @@ multidimensional @emph{way of accessing} an array.
@cindex subscripts in arrays, multidimensional, scanning
@cindex arrays, multidimensional, scanning
+@cindex scanning multidimensional arrays
However, if your program has an array that is always accessed as
multidimensional, you can get the effect of scanning it by combining
the scanning @code{for} statement
@@ -15242,6 +15279,7 @@ separate indices is recovered.
@node Arrays of Arrays
@section Arrays of Arrays
+@cindex arrays of arrays
@command{gawk} goes beyond standard @command{awk}'s multidimensional
array access and provides true arrays of
@@ -15501,6 +15539,7 @@ two arguments 11 and 10.
@node Numeric Functions
@subsection Numeric Functions
+@cindex numeric functions
The following list describes all of
the built-in functions that work with numbers.
@@ -15509,21 +15548,25 @@ Optional parameters are enclosed in square brackets@w{ ([ ]):}
@table @code
@item atan2(@var{y}, @var{x})
@cindex @code{atan2()} function
+@cindex arctangent
Return the arctangent of @code{@var{y} / @var{x}} in radians.
You can use @samp{pi = atan2(0, -1)} to retrieve the value of @value{PI}.
@item cos(@var{x})
@cindex @code{cos()} function
+@cindex cosine
Return the cosine of @var{x}, with @var{x} in radians.
@item exp(@var{x})
@cindex @code{exp()} function
+@cindex exponent
Return the exponential of @var{x} (@code{e ^ @var{x}}) or report
an error if @var{x} is out of range. The range of values @var{x} can have
depends on your machine's floating-point representation.
@item int(@var{x})
@cindex @code{int()} function
+@cindex round to nearest integer
Return the nearest integer to @var{x}, located between @var{x} and zero and
truncated toward zero.
@@ -15532,6 +15575,7 @@ is @minus{}3, and @code{int(-3)} is @minus{}3 as well.
@item log(@var{x})
@cindex @code{log()} function
+@cindex logarithm
Return the natural logarithm of @var{x}, if @var{x} is positive;
otherwise, report an error.
@@ -15578,7 +15622,7 @@ function roll(n) @{ return 1 + int(rand() * n) @}
@}
@end example
-@cindex numbers, random
+@cindex seeding random number generator
@cindex random numbers, seed of
@quotation CAUTION
In most @command{awk} implementations, including @command{gawk},
@@ -15595,10 +15639,12 @@ use @code{srand()}.
@item sin(@var{x})
@cindex @code{sin()} function
+@cindex sine
Return the sine of @var{x}, with @var{x} in radians.
@item sqrt(@var{x})
@cindex @code{sqrt()} function
+@cindex square root
Return the positive square root of @var{x}.
@command{gawk} prints a warning message
if @var{x} is negative. Thus, @code{sqrt(4)} is 2.
@@ -15634,6 +15680,7 @@ sequences of random numbers.
@node String Functions
@subsection String-Manipulation Functions
+@cindex string-manipulation functions
The functions in this @value{SECTION} look at or change the text of one
or more strings.
@@ -15663,10 +15710,10 @@ pound sign@w{ (@samp{#}):}
@item asort(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) #
@itemx asorti(@var{source} @r{[}, @var{dest} @r{[}, @var{how} @r{]} @r{]}) #
@cindex @code{asorti()} function (@command{gawk})
+@cindex sort array
@cindex arrays, elements, retrieving number of
@cindex @code{asort()} function (@command{gawk})
-@cindex @command{gawk}, @code{IGNORECASE} variable in
-@cindex @code{IGNORECASE} variable
+@cindex sort array indices
These two functions are similar in behavior, so they are described
together.
@@ -15684,7 +15731,9 @@ sequential integers starting with one. If the optional array @var{dest}
is specified, then @var{source} is duplicated into @var{dest}. @var{dest}
is then sorted, leaving the indices of @var{source} unchanged.
-When comparing strings, @code{IGNORECASE} affects the sorting. If the
+@cindex @command{gawk}, @code{IGNORECASE} variable in
+When comparing strings, @code{IGNORECASE} affects the sorting
+(@pxref{Array Sorting Functions}). If the
@var{source} array contains subarrays as values (@pxref{Arrays of
Arrays}), they will come last, after all scalar values.
@@ -15728,6 +15777,8 @@ are not available in compatibility mode (@pxref{Options}).
@item gensub(@var{regexp}, @var{replacement}, @var{how} @r{[}, @var{target}@r{]}) #
@cindex @code{gensub()} function (@command{gawk})
+@cindex search and replace in strings
+@cindex substitute in string
Search the target string @var{target} for matches of the regular
expression @var{regexp}. If @var{how} is a string beginning with
@samp{g} or @samp{G} (short for ``global''), then replace all matches of @var{regexp} with
@@ -15736,7 +15787,7 @@ which match of @var{regexp} to replace. If no @var{target} is supplied,
use @code{$0}. It returns the modified string as the result
of the function and the original target string is @emph{not} changed.
-@code{gensub()} is a general substitution function. It's purpose is
+@code{gensub()} is a general substitution function. Its purpose is
to provide more features than the standard @code{sub()} and @code{gsub()}
functions.
@@ -15813,7 +15864,8 @@ and the third argument must be assignable.
@item index(@var{in}, @var{find})
@cindex @code{index()} function
-@cindex searching
+@cindex search in string
+@cindex find substring in string
Search the string @var{in} for the first occurrence of the string
@var{find}, and return the position in characters where that occurrence
begins in the string @var{in}. Consider the following example:
@@ -15831,6 +15883,8 @@ It is a fatal error to use a regexp constant for @var{find}.
@item length(@r{[}@var{string}@r{]})
@cindex @code{length()} function
+@cindex string length
+@cindex length of string
Return the number of characters in @var{string}. If
@var{string} is a number, the length of the digit string representing
that number is returned. For example, @code{length("abcde")} is five. By
@@ -15838,6 +15892,8 @@ contrast, @code{length(15 * 35)} works out to three. In this example, 15 * 35 =
525, and 525 is then converted to the string @code{"525"}, which has
three characters.
+@cindex length of input record
+@cindex input record, length of
If no argument is supplied, @code{length()} returns the length of @code{$0}.
@c @cindex historical features
@@ -15876,6 +15932,8 @@ warning about this.
@cindex common extensions, @code{length()} applied to an array
@cindex extensions, common@comma{} @code{length()} applied to an array
@cindex differences between @command{gawk} and @command{awk}
+@cindex number of array elements
+@cindex array, number of elements
With @command{gawk} and several other @command{awk} implementations, when given an
array argument, the @code{length()} function returns the number of elements
in the array. @value{COMMONEXT}
@@ -15890,6 +15948,8 @@ If @option{--posix} is supplied, using an array argument is a fatal error
@item match(@var{string}, @var{regexp} @r{[}, @var{array}@r{]})
@cindex @code{match()} function
+@cindex string, regular expression match
+@cindex match regexp in string
Search @var{string} for the
longest, leftmost substring matched by the regular expression,
@var{regexp} and return the character position, or @dfn{index},
@@ -16005,6 +16065,7 @@ using a third argument is a fatal error.
@item patsplit(@var{string}, @var{array} @r{[}, @var{fieldpat} @r{[}, @var{seps} @r{]} @r{]}) #
@cindex @code{patsplit()} function (@command{gawk})
+@cindex split string into array
Divide
@var{string} into pieces defined by @var{fieldpat}
and store the pieces in @var{array} and the separator strings in the
@@ -16064,7 +16125,7 @@ split("cul-de-sac", a, "-", seps)
@end example
@noindent
-@cindex strings, splitting
+@cindex strings splitting, example
splits the string @samp{cul-de-sac} into three fields using @samp{-} as the
separator. It sets the contents of the array @code{a} as follows:
@@ -16121,6 +16182,7 @@ If @var{string} does not match @var{fieldsep} at all (but is not null),
@item sprintf(@var{format}, @var{expression1}, @dots{})
@cindex @code{sprintf()} function
+@cindex formatting strings
Return (without printing) the string that @code{printf} would
have printed out with the same arguments
(@pxref{Printf}).
@@ -16134,6 +16196,7 @@ pival = sprintf("pi = %.2f (approx.)", 22/7)
assigns the string @w{@samp{pi = 3.14 (approx.)}} to the variable @code{pival}.
@cindex @code{strtonum()} function (@command{gawk})
+@cindex convert string to number
@item strtonum(@var{str}) #
Examine @var{str} and return its numeric value. If @var{str}
begins with a leading @samp{0}, @code{strtonum()} assumes that @var{str}
@@ -16161,6 +16224,7 @@ in compatibility mode (@pxref{Options}).
@item sub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]})
@cindex @code{sub()} function
+@cindex replace in string
Search @var{target}, which is treated as a string, for the
leftmost, longest substring matched by the regular expression @var{regexp}.
Modify the entire string
@@ -16261,6 +16325,7 @@ string, and then the value of that string is treated as the regexp to match.
@item substr(@var{string}, @var{start} @r{[}, @var{length}@r{]})
@cindex @code{substr()} function
+@cindex substring
Return a @var{length}-character-long substring of @var{string},
starting at character number @var{start}. The first character of a
string is character number one.@footnote{This is different from
@@ -16317,9 +16382,10 @@ string = substr(string, 1, 2) "CDE" substr(string, 6)
@end example
@cindex case sensitivity, converting case
-@cindex converting, case
+@cindex strings, converting letter case
@item tolower(@var{string})
@cindex @code{tolower()} function
+@cindex convert string to lower case
Return a copy of @var{string}, with each uppercase character
in the string replaced with its corresponding lowercase character.
Nonalphabetic characters are left unchanged. For example,
@@ -16327,6 +16393,7 @@ Nonalphabetic characters are left unchanged. For example,
@item toupper(@var{string})
@cindex @code{toupper()} function
+@cindex convert string to upper case
Return a copy of @var{string}, with each lowercase character
in the string replaced with its corresponding uppercase character.
Nonalphabetic characters are left unchanged. For example,
@@ -16752,6 +16819,7 @@ Although this makes a certain amount of sense, it can be surprising.
@node I/O Functions
@subsection Input/Output Functions
+@cindex input/output functions
The following functions relate to input/output (I/O).
Optional parameters are enclosed in square brackets ([ ]):
@@ -16760,6 +16828,7 @@ Optional parameters are enclosed in square brackets ([ ]):
@item close(@var{filename} @r{[}, @var{how}@r{]})
@cindex @code{close()} function
@cindex files, closing
+@cindex close file or coprocess
Close the file @var{filename} for input or output. Alternatively, the
argument may be a shell command that was used for creating a coprocess, or
for redirecting to or from a pipe; then the coprocess or pipe is closed.
@@ -16777,6 +16846,7 @@ which discusses this feature in more detail and gives an example.
@item fflush(@r{[}@var{filename}@r{]})
@cindex @code{fflush()} function
+@cindex flush buffered output
Flush any buffered output associated with @var{filename}, which is either a
file opened for writing or a shell command for redirecting output to
a pipe or coprocess.
@@ -16836,6 +16906,7 @@ In such a case, @code{fflush()} returns @minus{}1, as well.
@item system(@var{command})
@cindex @code{system()} function
+@cindex invoke shell command
@cindex interacting with other programs
Execute the operating-system
command @var{command} and then return to the @command{awk} program.
@@ -17110,6 +17181,7 @@ you would see the latter (undesirable) output.
@node Time Functions
@subsection Time Functions
+@cindex time functions
@c STARTOFRANGE tst
@cindex timestamps
@@ -17149,6 +17221,7 @@ Optional parameters are enclosed in square brackets ([ ]):
@table @code
@item mktime(@var{datespec})
@cindex @code{mktime()} function (@command{gawk})
+@cindex generate time values
Turn @var{datespec} into a timestamp in the same form
as is returned by @code{systime()}. It is similar to the function of the
same name in ISO C. The argument, @var{datespec}, is a string of the form
@@ -17179,6 +17252,7 @@ is out of range, @code{mktime()} returns @minus{}1.
@item strftime(@r{[}@var{format} @r{[}, @var{timestamp} @r{[}, @var{utc-flag}@r{]]]})
@c STARTOFRANGE strf
@cindex @code{strftime()} function (@command{gawk})
+@cindex format time string
Format the time specified by @var{timestamp}
based on the contents of the @var{format} string and return the result.
It is similar to the function of the same name in ISO C.
@@ -17200,6 +17274,7 @@ change the default format.
@item systime()
@cindex @code{systime()} function (@command{gawk})
@cindex timestamps
+@cindex current system time
Return the current time as the number of seconds since
the system epoch. On POSIX systems, this is the number of seconds
since 1970-01-01 00:00:00 UTC, not counting leap seconds.
@@ -17493,6 +17568,7 @@ gawk 'BEGIN @{
@node Bitwise Functions
@subsection Bit-Manipulation Functions
+@cindex bit-manipulation functions
@c STARTOFRANGE bit
@cindex bitwise, operations
@c STARTOFRANGE and
@@ -17656,26 +17732,32 @@ bitwise operations just described. They are:
@cindex @command{gawk}, bitwise operations in
@table @code
@cindex @code{and()} function (@command{gawk})
+@cindex bitwise AND
@item and(@var{v1}, @var{v2} @r{[}, @r{@dots{}]})
Return the bitwise AND of the arguments. There must be at least two.
@cindex @code{compl()} function (@command{gawk})
+@cindex bitwise complement
@item compl(@var{val})
Return the bitwise complement of @var{val}.
@cindex @code{lshift()} function (@command{gawk})
+@cindex left shift
@item lshift(@var{val}, @var{count})
Return the value of @var{val}, shifted left by @var{count} bits.
@cindex @code{or()} function (@command{gawk})
+@cindex bitwise OR
@item or(@var{v1}, @var{v2} @r{[}, @r{@dots{}]})
Return the bitwise OR of the arguments. There must be at least two.
@cindex @code{rshift()} function (@command{gawk})
+@cindex right shift
@item rshift(@var{val}, @var{count})
Return the value of @var{val}, shifted right by @var{count} bits.
@cindex @code{xor()} function (@command{gawk})
+@cindex bitwise XOR
@item xor(@var{v1}, @var{v2} @r{[}, @r{@dots{}]})
Return the bitwise XOR of the arguments. There must be at least two.
@end table
@@ -17767,6 +17849,7 @@ $ @kbd{gawk -f testbits.awk}
@cindex strings, converting
@cindex numbers, converting
@cindex converting, numbers to strings
+@cindex number as string of bits
The @code{bits2str()} function turns a binary number into a string.
The number @code{1} represents a binary value where the rightmost bit
is set to 1. Using this mask,
@@ -17803,6 +17886,7 @@ that traverses every element of a true multidimensional array
@table @code
@cindex @code{isarray()} function (@command{gawk})
+@cindex scalar or array
@item isarray(@var{x})
Return a true value if @var{x} is an array. Otherwise return false.
@end table
@@ -17824,6 +17908,7 @@ will end up turning it into a scalar.
@subsection String-Translation Functions
@cindex @command{gawk}, string-translation functions
@cindex functions, string-translation
+@cindex string-translation functions
@cindex internationalization
@cindex @command{awk} programs, internationalizing
@@ -17836,6 +17921,7 @@ Optional parameters are enclosed in square brackets ([ ]):
@table @code
@cindex @code{bindtextdomain()} function (@command{gawk})
+@cindex set directory of message catalogs
@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]})
Set the directory in which
@command{gawk} will look for message translation files, in case they
@@ -17849,6 +17935,7 @@ If @var{directory} is the null string (@code{""}), then
given @var{domain}.
@cindex @code{dcgettext()} function (@command{gawk})
+@cindex translate string
@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
Return the translation of @var{string} in
text domain @var{domain} for locale category @var{category}.
@@ -17872,7 +17959,7 @@ The default value for @var{category} is @code{"LC_MESSAGES"}.
@section User-Defined Functions
@c STARTOFRANGE udfunc
-@cindex user-defined, functions
+@cindex user-defined functions
@c STARTOFRANGE funcud
@cindex functions, user-defined
Complicated @command{awk} programs can often be simplified by defining
@@ -17958,6 +18045,7 @@ conventional to place some extra space between the arguments and
the local variables, in order to document how your function is supposed to be used.
@cindex variables, shadowing
+@cindex shadowing of variable values
During execution of the function body, the arguments and local variable
values hide, or @dfn{shadow}, any variables of the same names used in the
rest of the program. The shadowed variables are not accessible in the
@@ -18016,6 +18104,7 @@ keyword @code{function} when defining a function.
@node Function Example
@subsection Function Definition Examples
+@cindex function definition example
Here is an example of a user-defined function, called @code{myprint()}, that
takes a number and prints it in a specific format:
@@ -18164,8 +18253,8 @@ an error.
@node Variable Scope
@subsubsection Controlling Variable Scope
-@cindex local variables
-@cindex variables, local
+@cindex local variables, in a function
+@cindex variables, local to a function
There is no way to make a variable local to a @code{@{ @dots{} @}} block in
@command{awk}, but you can make a variable local to a function. It is
good practice to do so whenever a variable is needed only in that
@@ -19094,7 +19183,7 @@ The leading capital letter indicates that it is global, while the fact that
the variable name is not all capital letters indicates that the variable is
not one of @command{awk}'s built-in variables, such as @code{FS}.
-@cindex @option{--dump-variables} option
+@cindex @option{--dump-variables} option, using for library functions
It is also important that @emph{all} variables in library
functions that do not need to save state are, in fact, declared
local.@footnote{@command{gawk}'s @option{--dump-variables} command-line
@@ -20911,7 +21000,7 @@ from anywhere within a user's program, and the user may have his
or her
own way of splitting records and fields.
-@cindex @code{PROCINFO} array
+@cindex @code{PROCINFO} array, testing the field splitting
The @code{using_fw} variable checks @code{PROCINFO["FS"]}, which
is @code{"FIELDWIDTHS"} if field splitting is being done with
@code{FIELDWIDTHS}. This makes it possible to restore the correct
@@ -21034,7 +21123,7 @@ uses these functions.
@cindex group database, reading
@c STARTOFRANGE datagr
@cindex database, group, reading
-@cindex @code{PROCINFO} array
+@cindex @code{PROCINFO} array, and group membership
@cindex @code{getgrent()} function (C library)
@cindex @code{getgrent()} user-defined function
@cindex groups@comma{} information about
@@ -22213,7 +22302,7 @@ $ @kbd{id}
@print{} uid=500(arnold) gid=500(arnold) groups=6(disk),7(lp),19(floppy)
@end example
-@cindex @code{PROCINFO} array
+@cindex @code{PROCINFO} array, and user and group ID numbers
This information is part of what is provided by @command{gawk}'s
@code{PROCINFO} array (@pxref{Built-in Variables}).
However, the @command{id} utility provides a more palatable output than just
@@ -22314,7 +22403,6 @@ BEGIN \
@c endfile
@end example
-@cindex @code{in} operator
The test in the @code{for} loop is worth noting.
Any supplementary groups in the @code{PROCINFO} array have the
indices @code{"group1"} through @code{"group@var{N}"} for some
@@ -22324,7 +22412,7 @@ there are.
This loop works by starting at one, concatenating the value with
@code{"group"}, and then using @code{in} to see if that value is
-in the array. Eventually, @code{i} is incremented past
+in the array (@pxref{Reference to Elements}). Eventually, @code{i} is incremented past
the last group in the array and the loop exits.
The loop is also correct if there are @emph{no} supplementary
@@ -25533,9 +25621,8 @@ both arrays use the values.
@c Document It And Call It A Feature. Sigh.
@cindex @command{gawk}, @code{IGNORECASE} variable in
-@cindex @code{IGNORECASE} variable
-@cindex arrays, sorting, @code{IGNORECASE} variable and
-@cindex @code{IGNORECASE} variable, array sorting and
+@cindex arrays, sorting, and @code{IGNORECASE} variable
+@cindex @code{IGNORECASE} variable, and array sorting functions
Because @code{IGNORECASE} affects string comparisons, the value
of @code{IGNORECASE} also affects sorting for both @code{asort()} and @code{asorti()}.
Note also that the locale's sorting order does @emph{not}
@@ -25704,7 +25791,7 @@ As a side note, the assignment @samp{LC_ALL=C} in the @command{sort}
command ensures traditional Unix (ASCII) sorting from @command{sort}.
@cindex @command{gawk}, @code{PROCINFO} array in
-@cindex @code{PROCINFO} array
+@cindex @code{PROCINFO} array, and communications via ptys
You may also use pseudo-ttys (ptys) for
two-way communication instead of pipes, if your system supports them.
This is done on a per-command basis, by setting a special element
@@ -25907,8 +25994,8 @@ Here is the @file{awkprof.out} that results from running the
illustrates that @command{awk} programmers sometimes get up very early
in the morning to work.)
-@cindex @code{BEGIN} pattern
-@cindex @code{END} pattern
+@cindex @code{BEGIN} pattern, and profiling
+@cindex @code{END} pattern, and profiling
@example
# gawk profile, created Thu Feb 27 05:16:21 2014
@@ -25972,7 +26059,7 @@ Multiple @code{BEGIN} and @code{END} rules retain their
separate identities, as do
multiple @code{BEGINFILE} and @code{ENDFILE} rules.
-@cindex patterns, counts
+@cindex patterns, counts, in a profile
@item
Pattern-action rules have two counts.
The first count, to the left of the rule, shows how many times
@@ -25992,7 +26079,7 @@ is a count showing how many times the condition was true.
The count for the @code{else}
indicates how many times the test failed.
-@cindex loops, count for header
+@cindex loops, count for header, in a profile
@item
The count for a loop header (such as @code{for}
or @code{while}) shows how many times the loop test was executed.
@@ -26000,8 +26087,8 @@ or @code{while}) shows how many times the loop test was executed.
statement in a rule to determine how many times the rule was executed.
If the first statement is a loop, the count is misleading.)
-@cindex functions, user-defined, counts
-@cindex user-defined, functions, counts
+@cindex functions, user-defined, counts, in a profile
+@cindex user-defined, functions, counts, in a profile
@item
For user-defined functions, the count next to the @code{function}
keyword indicates how many times the function was called.
@@ -26015,8 +26102,8 @@ The layout uses ``K&R'' style with TABs.
Braces are used everywhere, even when
the body of an @code{if}, @code{else}, or loop is only a single statement.
-@cindex @code{()} (parentheses)
-@cindex parentheses @code{()}
+@cindex @code{()} (parentheses), in a profile
+@cindex parentheses @code{()}, in a profile
@item
Parentheses are used only where needed, as indicated by the structure
of the program and the precedence rules.
@@ -26072,6 +26159,7 @@ which is correct, but possibly surprising.
@cindex profiling @command{awk} programs, dynamically
@cindex @command{gawk} program, dynamic profiling
+@cindex dynamic profiling
Besides creating profiles when a program has completed,
@command{gawk} can produce a profile while it is running.
This is useful if your @command{awk} program goes into an
@@ -26085,9 +26173,9 @@ $ @kbd{gawk --profile -f myprog &}
@end example
@cindex @command{kill} command@comma{} dynamic profiling
-@cindex @code{USR1} signal
-@cindex @code{SIGUSR1} signal
-@cindex signals, @code{USR1}/@code{SIGUSR1}
+@cindex @code{USR1} signal, for dynamic profiling
+@cindex @code{SIGUSR1} signal, for dynamic profiling
+@cindex signals, @code{USR1}/@code{SIGUSR1}, for profiling
@noindent
The shell prints a job number and process ID number; in this case, 13992.
Use the @command{kill} command to send the @code{USR1} signal
@@ -26118,9 +26206,9 @@ You may send @command{gawk} the @code{USR1} signal as many times as you like.
Each time, the profile and function call trace are appended to the output
profile file.
-@cindex @code{HUP} signal
-@cindex @code{SIGHUP} signal
-@cindex signals, @code{HUP}/@code{SIGHUP}
+@cindex @code{HUP} signal, for dynamic profiling
+@cindex @code{SIGHUP} signal, for dynamic profiling
+@cindex signals, @code{HUP}/@code{SIGHUP}, for profiling
If you use the @code{HUP} signal instead of the @code{USR1} signal,
@command{gawk} produces the profile and the function call trace and then exits.
@@ -27034,6 +27122,7 @@ The following list defines terms used throughout the rest of
this @value{CHAPTER}.
@table @dfn
+@cindex stack frame
@item Stack Frame
Programs generally call functions during the course of their execution.
One function can call another, or a function can call itself (recursion).
@@ -27055,6 +27144,7 @@ invoked. Commands that print the call stack print information about
each stack frame (as detailed later on).
@item Breakpoint
+@cindex breakpoint
During debugging, you often wish to let the program run until it
reaches a certain point, and then continue execution from there one
statement (or instruction) at a time. The way to do this is to set
@@ -27064,6 +27154,7 @@ take over control of the program's execution. You can add and remove
as many breakpoints as you like.
@item Watchpoint
+@cindex watchpoint
A watchpoint is similar to a breakpoint. The difference is that
breakpoints are oriented around the code: stop when a certain point in the
code is reached. A watchpoint, however, specifies that program execution
@@ -27095,6 +27186,7 @@ by the higher-level @command{awk} commands.
@node Sample Debugging Session
@section Sample Debugging Session
+@cindex sample debugging session
In order to illustrate the use of @command{gawk} as a debugger, let's look at a sample
debugging session. We will use the @command{awk} implementation of the
@@ -27108,6 +27200,8 @@ as our example.
@node Debugger Invocation
@subsection How to Start the Debugger
+@cindex starting the debugger
+@cindex debugger, how to start
Starting the debugger is almost exactly like running @command{gawk},
except you have to pass an additional option @option{--debug} or the
@@ -27448,6 +27542,8 @@ controlling breakpoints are:
@cindex debugger commands, @code{break}
@cindex @code{break} debugger command
@cindex @code{b} debugger command (alias for @code{break})
+@cindex set breakpoint
+@cindex breakpoint, setting
@item @code{break} [[@var{filename}@code{:}]@var{n} | @var{function}] [@code{"@var{expression}"}]
@itemx @code{b} [[@var{filename}@code{:}]@var{n} | @var{function}] [@code{"@var{expression}"}]
Without any argument, set a breakpoint at the next instruction
@@ -27478,6 +27574,8 @@ it continues executing the program.
@cindex debugger commands, @code{clear}
@cindex @code{clear} debugger command
+@cindex delete breakpoint at location
+@cindex breakpoint at location, how to delete
@item @code{clear} [[@var{filename}@code{:}]@var{n} | @var{function}]
Without any argument, delete any breakpoint at the next instruction
to be executed in the selected stack frame. If the program stops at
@@ -27498,6 +27596,7 @@ Delete breakpoint(s) set at entry to function @var{function}.
@cindex debugger commands, @code{condition}
@cindex @code{condition} debugger command
+@cindex breakpoint condition
@item @code{condition} @var{n} @code{"@var{expression}"}
Add a condition to existing breakpoint or watchpoint @var{n}. The
condition is an @command{awk} expression that the debugger evaluates
@@ -27511,6 +27610,8 @@ watchpoint is made unconditional.
@cindex debugger commands, @code{delete}
@cindex @code{delete} debugger command
@cindex @code{d} debugger command (alias for @code{delete})
+@cindex delete breakpoint by number
+@cindex breakpoint, delete by number
@item @code{delete} [@var{n1 n2} @dots{}] [@var{n}--@var{m}]
@itemx @code{d} [@var{n1 n2} @dots{}] [@var{n}--@var{m}]
Delete specified breakpoints or a range of breakpoints. Deletes
@@ -27518,6 +27619,8 @@ all defined breakpoints if no argument is supplied.
@cindex debugger commands, @code{disable}
@cindex @code{disable} debugger command
+@cindex disable breakpoint
+@cindex breakpoint, how to disable or enable
@item @code{disable} [@var{n1 n2} @dots{} | @var{n}--@var{m}]
Disable specified breakpoints or a range of breakpoints. Without
any argument, disables all breakpoints.
@@ -27526,6 +27629,7 @@ any argument, disables all breakpoints.
@cindex debugger commands, @code{enable}
@cindex @code{enable} debugger command
@cindex @code{e} debugger command (alias for @code{enable})
+@cindex enable breakpoint
@item @code{enable} [@code{del} | @code{once}] [@var{n1 n2} @dots{}] [@var{n}--@var{m}]
@itemx @code{e} [@code{del} | @code{once}] [@var{n1 n2} @dots{}] [@var{n}--@var{m}]
Enable specified breakpoints or a range of breakpoints. Without
@@ -27545,6 +27649,7 @@ the program stops at the breakpoint.
@cindex debugger commands, @code{ignore}
@cindex @code{ignore} debugger command
+@cindex ignore breakpoint
@item @code{ignore} @var{n} @var{count}
Ignore breakpoint number @var{n} the next @var{count} times it is
hit.
@@ -27553,6 +27658,7 @@ hit.
@cindex debugger commands, @code{tbreak}
@cindex @code{tbreak} debugger command
@cindex @code{t} debugger command (alias for @code{tbreak})
+@cindex temporary breakpoint
@item @code{tbreak} [[@var{filename}@code{:}]@var{n} | @var{function}]
@itemx @code{t} [[@var{filename}@code{:}]@var{n} | @var{function}]
Set a temporary breakpoint (enabled for only one stop).
@@ -27573,6 +27679,8 @@ execution of the program than we saw in our earlier example:
@cindex @code{silent} debugger command
@cindex debugger commands, @code{end}
@cindex @code{end} debugger command
+@cindex breakpoint commands
+@cindex commands to execute at breakpoint
@item @code{commands} [@var{n}]
@itemx @code{silent}
@itemx @dots{}
@@ -27600,6 +27708,7 @@ gawk>
@cindex debugger commands, @code{c} (@code{continue})
@cindex debugger commands, @code{continue}
+@cindex continue program, in debugger
@item @code{continue} [@var{count}]
@itemx @code{c} [@var{count}]
Resume program execution. If continued from a breakpoint and @var{count} is
@@ -27616,6 +27725,7 @@ Print the returned value.
@cindex debugger commands, @code{next}
@cindex @code{next} debugger command
@cindex @code{n} debugger command (alias for @code{next})
+@cindex single-step execution, in the debugger
@item @code{next} [@var{count}]
@itemx @code{n} [@var{count}]
Continue execution to the next source line, stepping over function calls.
@@ -27710,6 +27820,7 @@ items on the list.
@cindex debugger commands, @code{eval}
@cindex @code{eval} debugger command
+@cindex evaluate expressions, in debugger
@item @code{eval "@var{awk statements}"}
Evaluate @var{awk statements} in the context of the running program.
You can do anything that an @command{awk} program would do: assign
@@ -27727,6 +27838,7 @@ parameters defined by the program.
@cindex debugger commands, @code{print}
@cindex @code{print} debugger command
@cindex @code{p} debugger command (alias for @code{print})
+@cindex print variables, in debugger
@item @code{print} @var{var1}[@code{,} @var{var2} @dots{}]
@itemx @code{p} @var{var1}[@code{,} @var{var2} @dots{}]
Print the value of a @command{gawk} variable or field.
@@ -27760,6 +27872,7 @@ No newline is printed unless one is specified.
@cindex debugger commands, @code{set}
@cindex @code{set} debugger command
+@cindex assign values to variables, in debugger
@item @code{set} @var{var}@code{=}@var{value}
Assign a constant (number or string) value to an @command{awk} variable
or field.
@@ -27772,6 +27885,7 @@ You can also set special @command{awk} variables, such as @code{FS},
@cindex debugger commands, @code{watch}
@cindex @code{watch} debugger command
@cindex @code{w} debugger command (alias for @code{watch})
+@cindex set watchpoint
@item @code{watch} @var{var} | @code{$}@var{n} [@code{"@var{expression}"}]
@itemx @code{w} @var{var} | @code{$}@var{n} [@code{"@var{expression}"}]
Add variable @var{var} (or field @code{$@var{n}}) to the watch list.
@@ -27788,12 +27902,14 @@ then the debugger stops execution and prompts for a command. Otherwise,
@cindex debugger commands, @code{undisplay}
@cindex @code{undisplay} debugger command
+@cindex stop automatic display, in debugger
@item @code{undisplay} [@var{n}]
Remove item number @var{n} (or all items, if no argument) from the
automatic display list.
@cindex debugger commands, @code{unwatch}
@cindex @code{unwatch} debugger command
+@cindex delete watchpoint
@item @code{unwatch} [@var{n}]
Remove item number @var{n} (or all items, if no argument) from the
watch list.
@@ -27814,6 +27930,8 @@ functions which called the one you are in. The commands for doing this are:
@cindex debugger commands, @code{backtrace}
@cindex @code{backtrace} debugger command
@cindex @code{bt} debugger command (alias for @code{backtrace})
+@cindex call stack, display in debugger
+@cindex traceback, display in debugger
@item @code{backtrace} [@var{count}]
@itemx @code{bt} [@var{count}]
Print a backtrace of all function calls (stack frames), or innermost @var{count}
@@ -27867,25 +27985,32 @@ The value for @var{what} should be one of the following:
@c nested table
@table @code
@item args
+@cindex show function arguments, in debugger
Arguments of the selected frame.
@item break
+@cindex show breakpoints
List all currently set breakpoints.
@item display
+@cindex automatic displays, in debugger
List all items in the automatic display list.
@item frame
+@cindex describe call stack frame, in debugger
Description of the selected stack frame.
@item functions
+@cindex list function definitions, in debugger
List all function definitions including source file names and
line numbers.
@item locals
+@cindex show local variables, in debugger
Local variables of the selected frame.
@item source
+@cindex show name of current source file, in debugger
The name of the current source file. Each time the program stops, the
current source file is the file containing the current instruction.
When the debugger first starts, the current source file is the first file
@@ -27894,12 +28019,15 @@ included via the @option{-f} option. The
be used at any time to change the current source.
@item sources
+@cindex show all source files, in debugger
List all program sources.
@item variables
+@cindex list all global variables, in debugger
List all global variables.
@item watch
+@cindex show watchpoints
List all items in the watch list.
@end table
@end table
@@ -27913,6 +28041,8 @@ from a file. The commands are:
@cindex debugger commands, @code{option}
@cindex @code{option} debugger command
@cindex @code{o} debugger command (alias for @code{option})
+@cindex display debugger options
+@cindex debugger options
@item @code{option} [@var{name}[@code{=}@var{value}]]
@itemx @code{o} [@var{name}[@code{=}@var{value}]]
Without an argument, display the available debugger options
@@ -27924,6 +28054,7 @@ The available options are:
@c nested table
@table @code
@item history_size
+@cindex debugger history size
The maximum number of lines to keep in the history file @file{./.gawk_history}.
The default is 100.
@@ -27931,23 +28062,28 @@ The default is 100.
The number of lines that @code{list} prints. The default is 15.
@item outfile
+@cindex redirect @command{gawk} output, in debugger
Send @command{gawk} output to a file; debugger output still goes
to standard output. An empty string (@code{""}) resets output to
standard output.
@item prompt
+@cindex debugger prompt
The debugger prompt. The default is @samp{@w{gawk> }}.
@item save_history @r{[}on @r{|} off@r{]}
+@cindex debugger history file
Save command history to file @file{./.gawk_history}.
The default is @code{on}.
@item save_options @r{[}on @r{|} off@r{]}
+@cindex save debugger options
Save current options to file @file{./.gawkrc} upon exit.
The default is @code{on}.
Options are read back in to the next session upon startup.
@item trace @r{[}on @r{|} off@r{]}
+@cindex instruction tracing, in debugger
Turn instruction tracing on or off. The default is @code{off}.
@end table
@@ -27956,6 +28092,7 @@ Save the commands from the current session to the given file name,
so that they can be replayed using the @command{source} command.
@item @code{source} @var{filename}
+@cindex debugger, read commands from a file
Run command(s) from a file; an error in any command does not
terminate execution of subsequent commands. Comments (lines starting
with @samp{#}) are allowed in a command file.
@@ -28088,6 +28225,7 @@ function @var{function}. This command may change the current source file.
@cindex debugger commands, @code{quit}
@cindex @code{quit} debugger command
@cindex @code{q} debugger command (alias for @code{quit})
+@cindex exit the debugger
@item @code{quit}
@itemx @code{q}
Exit the debugger. Debugging is great fun, but sometimes we all have
@@ -28111,6 +28249,8 @@ fairly self-explanatory, and using @code{stepi} and @code{nexti} while
@node Readline Support
@section Readline Support
+@cindex command completion, in debugger
+@cindex history expansion, in debugger
If @command{gawk} is compiled with the @code{readline} library, you
can take advantage of that library's command completion and history expansion
@@ -28199,8 +28339,6 @@ be added, and of course feel free to try to add them yourself!
@cindex multiple precision
@cindex infinite precision
@cindex floating-point, numbers@comma{} arbitrary precision
-@cindex MPFR
-@cindex GMP
@cindex Knuth, Donald
@quotation
@@ -28964,6 +29102,8 @@ when you change the rounding mode.
@node Gawk and MPFR
@section @command{gawk} + MPFR = Powerful Arithmetic
+@cindex MPFR
+@cindex GMP
The rest of this @value{CHAPTER} describes how to use the arbitrary precision
(also known as @dfn{multiple precision} or @dfn{infinite precision}) numeric
@@ -29068,6 +29208,7 @@ your program.
@node Setting Precision
@subsection Setting the Working Precision
@cindex @code{PREC} variable
+@cindex setting working precision
@command{gawk} uses a global working precision; it does not keep track of
the precision or accuracy of individual numbers. Performing an arithmetic
@@ -29143,6 +29284,7 @@ issues that occur because numbers are stored internally in binary.
@node Setting Rounding Mode
@subsection Setting the Rounding Mode
@cindex @code{ROUNDMODE} variable
+@cindex setting rounding mode
The @code{ROUNDMODE} variable provides
program level control over the rounding mode.
@@ -29210,6 +29352,7 @@ In the first case, the number is stored with the default precision of 53 bits.
@node Changing Precision
@subsection Changing the Precision of a Number
+@cindex changing precision of a number
@cindex Laurie, Dirk
@quotation
@@ -29328,6 +29471,7 @@ the problem at hand is often the correct approach in such situations.
@node Arbitrary Precision Integers
@section Arbitrary Precision Integer Arithmetic with @command{gawk}
@cindex integers, arbitrary precision
+@cindex arbitrary precision integers
If one of the options @option{--bignum} or @option{-M} is specified,
@command{gawk} performs all
@@ -29424,6 +29568,7 @@ gawk -M 'BEGIN @{ n = 13; print n % 2 @}'
@node Dynamic Extensions
@chapter Writing Extensions for @command{gawk}
+@cindex dynamically loaded extensions
It is possible to add new functions written in C or C++ to @command{gawk} using
dynamically loaded libraries. This facility is available on systems
@@ -29458,6 +29603,7 @@ When @option{--sandbox} is specified, extensions are disabled
@node Extension Intro
@section Introduction
+@cindex plug-in
An @dfn{extension} (sometimes called a @dfn{plug-in}) is a piece of
external compiled code that @command{gawk} can load at runtime to
provide additional functionality, over and above the built-in capabilities
@@ -29586,6 +29732,7 @@ happen, but we all know how @emph{that} goes.)
@node Extension API Description
@section API Description
+@cindex extension API
This (rather large) @value{SECTION} describes the API in detail.
@@ -29970,6 +30117,8 @@ value type, as appropriate. This behavior is summarized in
@node Memory Allocation Functions
@subsection Memory Allocation Functions and Convenience Macros
+@cindex allocating memory for extensions
+@cindex extensions, allocating memory
The API provides a number of @dfn{memory allocation} functions for
allocating memory that can be passed to @command{gawk}, as well as a number of
@@ -30084,6 +30233,8 @@ pointed to by @code{result}.
@node Registration Functions
@subsection Registration Functions
+@cindex register extension
+@cindex extension registration
This @value{SECTION} describes the API functions for
registering parts of your extension with @command{gawk}.
@@ -30205,6 +30356,7 @@ is invoked with the @option{--version} option.
@node Input Parsers
@subsubsection Customized Input Parsers
+@cindex customized input parser
By default, @command{gawk} reads text files as its input. It uses the value
of @code{RS} to find the end of the record, and then uses @code{FS}
@@ -30452,7 +30604,9 @@ Register the input parser pointed to by @code{input_parser} with
@node Output Wrappers
@subsubsection Customized Output Wrappers
+@cindex customized output wrapper
+@cindex output wrapper
An @dfn{output wrapper} is the mirror image of an input parser.
It allows an extension to take over the output to a file opened
with the @samp{>} or @samp{>>} I/O redirection operators (@pxref{Redirection}).
@@ -30566,6 +30720,7 @@ Register the output wrapper pointed to by @code{output_wrapper} with
@node Two-way processors
@subsubsection Customized Two-way Processors
+@cindex customized two-way processor
A @dfn{two-way processor} combines an input parser and an output wrapper for
two-way I/O with the @samp{|&} operator (@pxref{Redirection}). It makes identical
@@ -30623,6 +30778,8 @@ Register the two-way processor pointed to by @code{two_way_processor} with
@node Printing Messages
@subsection Printing Messages
+@cindex printing messages from extensions
+@cindex messages from extensions
You can print different kinds of warning messages from your
extension, as described below. Note that for these functions,
@@ -30696,6 +30853,7 @@ for more information on creating arrays.
@node Symbol Table Access
@subsection Symbol Table Access
+@cindex accessing global variables from extensions
Two sets of routines provide access to global variables, and one set
allows you to create and release cached values.
@@ -30953,6 +31111,7 @@ you should release any cached values that you created, using
@node Array Manipulation
@subsection Array Manipulation
+@cindex array manipulation in extensions
The primary data structure@footnote{Okay, the only data structure.} in @command{awk}
is the associative array (@pxref{Arrays}).
@@ -31532,6 +31691,8 @@ information about how @command{gawk} was invoked.
@node Extension Versioning
@subsubsection API Version Constants and Variables
+@cindex API version
+@cindex extension API version
The API provides both a ``major'' and a ``minor'' version number.
The API versions are available at compile time as constants:
@@ -31585,6 +31746,8 @@ provided in @file{gawkapi.h} (discussed later, in
@node Extension API Informational Variables
@subsubsection Informational Variables
+@cindex API informational variables
+@cindex extension API informational variables
The API provides access to several variables that describe
whether the corresponding command-line options were enabled when
@@ -31730,6 +31893,8 @@ the version string with @command{gawk}.
@node Finding Extensions
@section How @command{gawk} Finds Extensions
+@cindex extension search path
+@cindex finding extensions
Compiled extensions have to be installed in a directory where
@command{gawk} can find them. If @command{gawk} is configured and
@@ -31740,6 +31905,7 @@ path with a list of directories to search for compiled extensions.
@node Extension Example
@section Example: Some File Functions
+@cindex extension example
@quotation
@i{No matter where you go, there you are.}
@@ -32384,6 +32550,7 @@ $ @kbd{AWKLIBPATH=$PWD gawk -f testff.awk}
@node Extension Samples
@section The Sample Extensions In The @command{gawk} Distribution
+@cindex extensions distributed with @command{gawk}
This @value{SECTION} provides brief overviews of the sample extensions
that come in the @command{gawk} distribution. Some of them are intended
@@ -33044,6 +33211,8 @@ tries to use @code{nanosleep()} or @code{select()} to implement the delay.
@node gawkextlib
@section The @code{gawkextlib} Project
+@cindex @code{gawkextlib}
+@cindex extensions, where to find
@cindex @code{gawkextlib} project
The @uref{http://sourceforge.net/projects/gawkextlib/, @code{gawkextlib}}