Diffstat (limited to 'doc/gawk.info')
-rw-r--r-- | doc/gawk.info | 4247 |
1 files changed, 2267 insertions, 1980 deletions
diff --git a/doc/gawk.info b/doc/gawk.info index fe51de53..90ae5848 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -246,6 +246,7 @@ entitled "GNU Free Documentation License". * Special Caveats:: Things to watch out for. * Close Files And Pipes:: Closing Input and Output Files and Pipes. +* Nonfatal:: Enabling Nonfatal Output. * Output Summary:: Output summary. * Output Exercises:: Exercises. * Values:: Constants, Variables, and Regular @@ -954,7 +955,7 @@ provided in *note Language History::. The language described in this Info file is often referred to as "new `awk'." By analogy, the original version of `awk' is referred to as "old `awk'." - Today, on most systems, when you run the `awk' utility you get some + On most current systems, when you run the `awk' utility you get some version of new `awk'.(1) If your system's standard `awk' is the old one, you will see something like this if you try the test program: @@ -1375,6 +1376,12 @@ also must acknowledge my gratitude to G-d, for the many opportunities He has sent my way, as well as for the gifts He has given me with which to take advantage of those opportunities. + +Arnold Robbins +Nof Ayalon +Israel +February 2015 + File: gawk.info, Node: Getting Started, Next: Invoking Gawk, Prev: Preface, Up: Top @@ -1868,7 +1875,7 @@ file surrounded by double quotes: File: gawk.info, Node: Sample Data Files, Next: Very Simple, Prev: Running gawk, Up: Getting Started -1.2 Data Files for the Examples +1.2 Data files for the Examples =============================== Many of the examples in this Info file take their input from two sample @@ -2278,9 +2285,10 @@ built-in functions for working with timestamps, performing bit manipulation, for runtime string translation (internationalization), determining the type of a variable, and array sorting. - As we develop our presentation of the `awk' language, we introduce -most of the variables and many of the functions. 
They are described -systematically in *note Built-in Variables::, and in *note Built-in::. + As we develop our presentation of the `awk' language, we will +introduce most of the variables and many of the functions. They are +described systematically in *note Built-in Variables::, and in *note +Built-in::. File: gawk.info, Node: When, Next: Intro Summary, Prev: Other Features, Up: Getting Started @@ -2345,7 +2353,7 @@ File: gawk.info, Node: Intro Summary, Prev: When, Up: Getting Started * You may use backslash continuation to continue a source line. Lines are automatically continued after a comma, open brace, - question mark, colon, `||', `&&', `do' and `else'. + question mark, colon, `||', `&&', `do', and `else'. File: gawk.info, Node: Invoking Gawk, Next: Regexp, Prev: Getting Started, Up: Top @@ -2412,8 +2420,8 @@ File: gawk.info, Node: Options, Next: Other Arguments, Prev: Command Line, U Options begin with a dash and consist of a single character. GNU-style long options consist of two dashes and a keyword. The keyword can be abbreviated, as long as the abbreviation allows the option to be -uniquely identified. If the option takes an argument, then the keyword -is either immediately followed by an equals sign (`=') and the +uniquely identified. If the option takes an argument, either the +keyword is immediately followed by an equals sign (`=') and the argument's value, or the keyword and the argument's value are separated by whitespace. If a particular option with a value is given more than once, it is the last value that counts. @@ -2428,10 +2436,10 @@ The following list describes options mandated by the POSIX standard: `-f SOURCE-FILE' `--file SOURCE-FILE' - Read `awk' program source from SOURCE-FILE instead of in the first - nonoption argument. This option may be given multiple times; the - `awk' program consists of the concatenation of the contents of - each specified SOURCE-FILE. 
+ Read the `awk' program source from SOURCE-FILE instead of in the + first nonoption argument. This option may be given multiple + times; the `awk' program consists of the concatenation of the + contents of each specified SOURCE-FILE. `-v VAR=VAL' `--assign VAR=VAL' @@ -2472,7 +2480,7 @@ The following list describes options mandated by the POSIX standard: `-b' `--characters-as-bytes' Cause `gawk' to treat all input data as single-byte characters. - In addition, all output written with `print' or `printf' are + In addition, all output written with `print' or `printf' is treated as single-byte characters. Normally, `gawk' follows the POSIX standard and attempts to process @@ -2480,7 +2488,7 @@ The following list describes options mandated by the POSIX standard: This can often involve converting multibyte characters into wide characters (internally), and can lead to problems or confusion if the input data does not contain valid multibyte characters. This - option is an easy way to tell `gawk': "hands off my data!". + option is an easy way to tell `gawk', "Hands off my data!" `-c' `--traditional' @@ -2515,7 +2523,7 @@ The following list describes options mandated by the POSIX standard: default, the debugger reads commands interactively from the keyboard (standard input). The optional FILE argument allows you to specify a file with a list of commands for the debugger to - execute non-interactively. No space is allowed between the `-D' + execute noninteractively. No space is allowed between the `-D' and FILE, if FILE is supplied. `-e' PROGRAM-TEXT @@ -2550,23 +2558,23 @@ The following list describes options mandated by the POSIX standard: `-g' `--gen-pot' - Analyze the source program and generate a GNU `gettext' Portable - Object Template file on standard output for all string constants + Analyze the source program and generate a GNU `gettext' portable + object template file on standard output for all string constants that have been marked for translation. 
*Note Internationalization::, for information about this option. `-h' `--help' - Print a "usage" message summarizing the short and long style + Print a "usage" message summarizing the short- and long-style options that `gawk' accepts and then exit. `-i' SOURCE-FILE `--include' SOURCE-FILE Read an `awk' source library from SOURCE-FILE. This option is completely equivalent to using the `@include' directive inside - your program. This option is very similar to the `-f' option, but - there are two important differences. First, when `-i' is used, - the program source is not loaded if it has been previously loaded, + your program. It is very similar to the `-f' option, but there + are two important differences. First, when `-i' is used, the + program source is not loaded if it has been previously loaded, whereas with `-f', `gawk' always loads the file. Second, because this option is intended to be used with code libraries, `gawk' does not recognize such files as constituting main program input. @@ -2628,7 +2636,7 @@ The following list describes options mandated by the POSIX standard: `-o'[FILE] `--pretty-print'[`='FILE] - Enable pretty-printing of `awk' programs. By default, output + Enable pretty-printing of `awk' programs. By default, the output program is created in a file named `awkprof.out' (*note Profiling::). The optional FILE argument allows you to specify a different file name for the output. No space is allowed between @@ -2734,7 +2742,7 @@ input as a source of data.) Because it is clumsy using the standard `awk' mechanisms to mix source file and command-line `awk' programs, `gawk' provides the `-e' -option. This does not require you to pre-empt the standard input for +option. This does not require you to preempt the standard input for your source code; it allows you to easily mix command-line and library source code (*note AWKPATH Variable::). As with `-f', the `-e' and `-i' options may also be used multiple times on the command line. 
@@ -2893,7 +2901,7 @@ implementations, you must supply a precise pathname for each program file, unless the file is in the current directory. But with `gawk', if the file name supplied to the `-f' or `-i' options does not contain a directory separator `/', then `gawk' searches a list of directories -(called the "search path"), one by one, looking for a file with the +(called the "search path") one by one, looking for a file with the specified name. The search path is a string consisting of directory names separated by @@ -2926,9 +2934,9 @@ or by placing two colons next to each other [`::'].) Different past versions of `gawk' would also look explicitly in the current directory, either before or after the path search. As - of version 4.1.2, this no longer happens, and if you wish to look - in the current directory, you must include `.' either as a separate - entry, or as a null entry in the search path. + of version 4.1.2, this no longer happens; if you wish to look in + the current directory, you must include `.' either as a separate + entry or as a null entry in the search path. The default value for `AWKPATH' is `.:/usr/local/share/awk'.(2) Since `.' is included at the beginning, `gawk' searches first in the @@ -3006,7 +3014,8 @@ used by regular users: `GAWK_SOCK_RETRIES' Controls the number of times `gawk' attempts to retry a two-way TCP/IP (socket) connection before giving up. *Note TCP/IP - Networking::. + Networking::. Note that when nonfatal I/O is enabled (*note + Nonfatal::), `gawk' only tries to open a TCP/IP socket once. `POSIXLY_CORRECT' Causes `gawk' to switch to POSIX-compatibility mode, disabling all @@ -3040,7 +3049,7 @@ change. The variables are: If this variable exists, `gawk' includes the file name and line number within the `gawk' source code from which warning and/or fatal messages are generated. 
Its purpose is to help isolate the - source of a message, as there are multiple places which produce the + source of a message, as there are multiple places that produce the same warning or error message. `GAWK_NO_DFA' @@ -3056,16 +3065,16 @@ change. The variables are: evaluation stack, when needed. `INT_CHAIN_MAX' - The intended maximum number of items `gawk' will maintain on a - hash chain for managing arrays indexed by integers. + This specifies intended maximum number of items `gawk' will + maintain on a hash chain for managing arrays indexed by integers. `STR_CHAIN_MAX' - The intended maximum number of items `gawk' will maintain on a - hash chain for managing arrays indexed by strings. + This specifies intended maximum number of items `gawk' will + maintain on a hash chain for managing arrays indexed by strings. `TIDYMEM' If this variable exists, `gawk' uses the `mtrace()' library calls - from GNU LIBC to help track down possible memory leaks. + from the GNU C library to help track down possible memory leaks. File: gawk.info, Node: Exit Status, Next: Include Files, Prev: Environment Variables, Up: Invoking Gawk @@ -3097,11 +3106,11 @@ This minor node describes a feature that is specific to `gawk'. files. This gives you the ability to split large `awk' source files into smaller, more manageable pieces, and also lets you reuse common `awk' code from various `awk' scripts. In other words, you can group -together `awk' functions, used to carry out specific tasks, into -external files. These files can be used just like function libraries, -using the `@include' keyword in conjunction with the `AWKPATH' -environment variable. Note that source files may also be included -using the `-i' option. +together `awk' functions used to carry out specific tasks into external +files. These files can be used just like function libraries, using the +`@include' keyword in conjunction with the `AWKPATH' environment +variable. 
Note that source files may also be included using the `-i' +option. Let's see an example. We'll start with two (trivial) `awk' scripts, namely `test1' and `test2'. Here is the `test1' script: @@ -3163,11 +3172,11 @@ Variable::) apply to `@include' also. This is very helpful in constructing `gawk' function libraries. If you have a large script with useful, general-purpose `awk' functions, you can break it down into library files and put those files in a -special directory. You can then include those "libraries," using -either the full pathnames of the files, or by setting the `AWKPATH' +special directory. You can then include those "libraries," either by +using the full pathnames of the files, or by setting the `AWKPATH' environment variable accordingly and then using `@include' with just -the file part of the full pathname. Of course, you can have more than -one directory to keep library files; the more complex the working +the file part of the full pathname. Of course, you can keep library +files in more than one directory; the more complex the working environment is, the more directories you may need to organize the files to be included. @@ -3179,8 +3188,8 @@ particular, `@include' is very useful for writing CGI scripts to be run from web pages. As mentioned in *note AWKPATH Variable::, the current directory is -always searched first for source files, before searching in `AWKPATH', -and this also applies to files named with `@include'. +always searched first for source files, before searching in `AWKPATH'; +this also applies to files named with `@include'. 
File: gawk.info, Node: Loading Shared Libraries, Next: Obsolete, Prev: Include Files, Up: Invoking Gawk @@ -3225,8 +3234,8 @@ File: gawk.info, Node: Obsolete, Next: Undocumented, Prev: Loading Shared Lib ==================================== This minor node describes features and/or command-line options from -previous releases of `gawk' that are either not available in the -current version or that are still supported but deprecated (meaning that +previous releases of `gawk' that either are not available in the +current version or are still supported but deprecated (meaning that they will _not_ be in the next release). The process-related special files `/dev/pid', `/dev/ppid', @@ -3254,7 +3263,7 @@ File: gawk.info, Node: Invoking Summary, Prev: Undocumented, Up: Invoking Gaw run `awk'. * The three standard options for all versions of `awk' are `-f', - `-F' and `-v'. `gawk' supplies these and many others, as well as + `-F', and `-v'. `gawk' supplies these and many others, as well as corresponding GNU-style long options. * Nonoption command-line arguments are usually treated as file names, @@ -3284,7 +3293,7 @@ File: gawk.info, Node: Invoking Summary, Prev: Undocumented, Up: Invoking Gaw * `gawk' allows you to load additional functions written in C or C++ using the `@load' statement and/or the `-l' option. (This - advanced feature is described later on in *note Dynamic + advanced feature is described later, in *note Dynamic Extensions::.) @@ -3433,7 +3442,7 @@ sequences apply to both string constants and regexp constants: Horizontal TAB, `Ctrl-i', ASCII code 9 (HT). `\v' - Vertical tab, `Ctrl-k', ASCII code 11 (VT). + Vertical TAB, `Ctrl-k', ASCII code 11 (VT). `\NNN' The octal value NNN, where NNN stands for 1 to 3 digits between @@ -3452,8 +3461,8 @@ sequences apply to both string constants and regexp constants: would continue incorporating hexadecimal digits into the value until a non-hexadecimal digit or the end of the string was encountered. 
However, using more than two hexadecimal - digits produced undefined results. As of version *FIXME:* - 4.3.0, only two digits are processed. + digits produced undefined results. As of version 4.2, only + two digits are processed. `\/' A literal slash (necessary for regexp constants only). This @@ -3483,7 +3492,7 @@ normally be a regexp operator. For example, `/a\+b/' matches the three characters `a+b'. For complete portability, do not use a backslash before any -character not shown in the previous list and that is not an operator. +character not shown in the previous list or that is not an operator. Backslash Before Regular Characters @@ -3545,7 +3554,7 @@ and converted into corresponding real characters as the very first step in processing regexps. Here is a list of metacharacters. All characters that are not escape -sequences and that are not listed in the following stand for themselves: +sequences and that are not listed here stand for themselves: `\' This suppresses the special meaning of a character when matching. @@ -3628,7 +3637,7 @@ sequences and that are not listed in the following stand for themselves: There are two subtle points to understand about how `*' works. First, the `*' applies only to the single preceding regular expression component (e.g., in `ph*', it applies just to the `h'). - To cause `*' to apply to a larger sub-expression, use parentheses: + To cause `*' to apply to a larger subexpression, use parentheses: `(ph)*' matches `ph', `phph', `phphph', and so on. Second, `*' finds as many repetitions as possible. If the text to @@ -3659,10 +3668,10 @@ sequences and that are not listed in the following stand for themselves: Matches `whhhy', but not `why' or `whhhhy'. `wh{3,5}y' - Matches `whhhy', `whhhhy', or `whhhhhy', only. + Matches `whhhy', `whhhhy', or `whhhhhy' only. `wh{2,}y' - Matches `whhy' or `whhhy', and so on. + Matches `whhy', `whhhy', and so on. Interval expressions were not traditionally available in `awk'. 
They were added as part of the POSIX standard to make `awk' and @@ -3764,7 +3773,7 @@ Class Meaning `[:print:]' Printable characters (characters that are not control characters) `[:punct:]' Punctuation characters (characters that are not letters, - digits control characters, or space characters) + digits, control characters, or space characters) `[:space:]' Space characters (such as space, TAB, and formfeed, to name a few) `[:upper:]' Uppercase alphabetic characters @@ -3802,8 +3811,9 @@ Collating symbols Equivalence classes Locale-specific names for a list of characters that are equal. The name is enclosed between `[=' and `=]'. For example, the name `e' - might be used to represent all of "e," "e`," and "e'." In this - case, `[[=e=]]' is a regexp that matches any of `e', `e'', or `e`'. + might be used to represent all of "e," "e^," "e`," and "e'." In + this case, `[[=e=]]' is a regexp that matches any of `e', `e^', + `e'', or `e`'. These features are very valuable in non-English-speaking locales. @@ -3825,7 +3835,7 @@ Consider the following: This example uses the `sub()' function to make a change to the input record. (`sub()' replaces the first instance of any text matched by the first argument with the string provided as the second argument; -*note String Functions::). Here, the regexp `/a+/' indicates "one or +*note String Functions::.) Here, the regexp `/a+/' indicates "one or more `a' characters," and the replacement text is `<A>'. The input contains four `a' characters. `awk' (and POSIX) regular @@ -3862,15 +3872,16 @@ regexp": This sets `digits_regexp' to a regexp that describes one or more digits, and tests whether the input record matches this regexp. - NOTE: When using the `~' and `!~' operators, there is a difference - between a regexp constant enclosed in slashes and a string - constant enclosed in double quotes. 
If you are going to use a - string constant, you have to understand that the string is, in - essence, scanned _twice_: the first time when `awk' reads your + NOTE: When using the `~' and `!~' operators, be aware that there + is a difference between a regexp constant enclosed in slashes and + a string constant enclosed in double quotes. If you are going to + use a string constant, you have to understand that the string is, + in essence, scanned _twice_: the first time when `awk' reads your program, and the second time when it goes to match the string on the lefthand side of the operator with the pattern on the right. This is true of any string-valued expression (such as - `digits_regexp', shown previously), not just string constants. + `digits_regexp', shown in the previous example), not just string + constants. What difference does it make if the string is scanned twice? The answer has to do with escape sequences, and particularly with @@ -3967,7 +3978,7 @@ letters, digits, or underscores (`_'): `\B' Matches the empty string that occurs between two word-constituent - characters. For example, `/\Brat\B/' matches `crate' but it does + characters. For example, `/\Brat\B/' matches `crate', but it does not match `dirty rat'. `\B' is essentially the opposite of `\y'. There are two other operators that work on buffers. In Emacs, a @@ -3976,10 +3987,10 @@ letters, digits, or underscores (`_'): operators are: `\`' - Matches the empty string at the beginning of a buffer (string). + Matches the empty string at the beginning of a buffer (string) `\'' - Matches the empty string at the end of a buffer (string). + Matches the empty string at the end of a buffer (string) Because `^' and `$' always work in terms of the beginning and end of strings, these operators don't add any new capabilities for `awk'. @@ -4150,7 +4161,7 @@ one line. Each record is automatically split into chunks called parts of a record. On rare occasions, you may need to use the `getline' command. 
The -`getline' command is valuable, both because it can do explicit input +`getline' command is valuable both because it can do explicit input from any number of files, and because the files used with it do not have to be named on the `awk' command line (*note Getline::). @@ -4199,8 +4210,8 @@ File: gawk.info, Node: awk split records, Next: gawk split records, Up: Recor Records are separated by a character called the "record separator". By default, the record separator is the newline character. This is why -records are, by default, single lines. A different character can be -used for the record separator by assigning the character to the +records are, by default, single lines. To use a different character +for the record separator, simply assign that character to the predefined variable `RS'. Like any other variable, the value of `RS' can be changed in the @@ -4215,14 +4226,14 @@ BEGIN/END::). For example: awk 'BEGIN { RS = "u" } { print $0 }' mail-list -changes the value of `RS' to `u', before reading any input. This is a -string whose first character is the letter "u"; as a result, records -are separated by the letter "u." Then the input file is read, and the -second rule in the `awk' program (the action with no pattern) prints -each record. Because each `print' statement adds a newline at the end -of its output, this `awk' program copies the input with each `u' -changed to a newline. Here are the results of running the program on -`mail-list': +changes the value of `RS' to `u', before reading any input. The new +value is a string whose first character is the letter "u"; as a result, +records are separated by the letter "u". Then the input file is read, +and the second rule in the `awk' program (the action with no pattern) +prints each record. Because each `print' statement adds a newline at +the end of its output, this `awk' program copies the input with each +`u' changed to a newline. 
Here are the results of running the program +on `mail-list': $ awk 'BEGIN { RS = "u" } > { print $0 }' mail-list @@ -4270,11 +4281,11 @@ data file (*note Sample Data Files::), the line looks like this: Bill 555-1675 bill.drowning@hotmail.com A -It contains no `u' so there is no reason to split the record, unlike -the others which have one or more occurrences of the `u'. In fact, -this record is treated as part of the previous record; the newline -separating them in the output is the original newline in the data file, -not the one added by `awk' when it printed the record! +It contains no `u', so there is no reason to split the record, unlike +the others, which each have one or more occurrences of the `u'. In +fact, this record is treated as part of the previous record; the +newline separating them in the output is the original newline in the +data file, not the one added by `awk' when it printed the record! Another way to change the record separator is on the command line, using the variable-assignment feature (*note Other Arguments::): @@ -4340,8 +4351,8 @@ part of either record. character. However, when `RS' is a regular expression, `RT' contains the actual input text that matched the regular expression. - If the input file ended without any text that matches `RS', `gawk' -sets `RT' to the null string. + If the input file ends without any text matching `RS', `gawk' sets +`RT' to the null string. The following example illustrates both of these features. It sets `RS' equal to a regular expression that matches either a newline or a @@ -4439,12 +4450,12 @@ to these pieces of the record. You don't have to use them--you can operate on the whole record if you want--but fields are what make simple `awk' programs so powerful. - You use a dollar-sign (`$') to refer to a field in an `awk' program, + You use a dollar sign (`$') to refer to a field in an `awk' program, followed by the number of the field you want. 
Thus, `$1' refers to the -first field, `$2' to the second, and so on. (Unlike the Unix shells, -the field numbers are not limited to single digits. `$127' is the -127th field in the record.) For example, suppose the following is a -line of input: +first field, `$2' to the second, and so on. (Unlike in the Unix +shells, the field numbers are not limited to single digits. `$127' is +the 127th field in the record.) For example, suppose the following is +a line of input: This seems like a pretty nice example. @@ -4461,10 +4472,9 @@ as `$7', which is `example.'. If you try to reference a field beyond the last one (such as `$8' when the record has only seven fields), you get the empty string. (If used in a numeric operation, you get zero.) - The use of `$0', which looks like a reference to the "zero-th" -field, is a special case: it represents the whole input record. Use it -when you are not interested in specific fields. Here are some more -examples: + The use of `$0', which looks like a reference to the "zeroth" field, +is a special case: it represents the whole input record. Use it when +you are not interested in specific fields. Here are some more examples: $ awk '$1 ~ /li/ { print $0 }' mail-list -| Amelia 555-5553 amelia.zodiacusque@gmail.com F @@ -4512,8 +4522,8 @@ is another example of using expressions as field numbers: awk '{ print $(2*2) }' mail-list `awk' evaluates the expression `(2*2)' and uses its value as the -number of the field to print. The `*' sign represents multiplication, -so the expression `2*2' evaluates to four. The parentheses are used so +number of the field to print. The `*' represents multiplication, so +the expression `2*2' evaluates to four. The parentheses are used so that the multiplication is done before the `$' operation; they are necessary whenever there is a binary operator(1) in the field-number expression. This example, then, prints the type of relationship (the @@ -4537,7 +4547,7 @@ field number. 
---------- Footnotes ---------- (1) A "binary operator", such as `*' for multiplication, is one that -takes two operands. The distinction is required, because `awk' also has +takes two operands. The distinction is required because `awk' also has unary (one-operand) and ternary (three-operand) operators. @@ -4659,7 +4669,7 @@ value of `NF' and recomputes `$0'. (d.c.) Here is an example: decremented. Finally, there are times when it is convenient to force `awk' to -rebuild the entire record, using the current value of the fields and +rebuild the entire record, using the current values of the fields and `OFS'. To do this, use the seemingly innocuous assignment: $1 = $1 # force record to be reconstituted @@ -4679,7 +4689,7 @@ built-in function that updates `$0', such as `sub()' and `gsub()' It is important to remember that `$0' is the _full_ record, exactly as it was read from the input. This includes any leading or trailing whitespace, and the exact whitespace (or other characters) that -separate the fields. +separates the fields. It is a common error to try to change the field separators in a record simply by setting `FS' and `OFS', and then expecting a plain @@ -4747,7 +4757,7 @@ attached, such as: John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139 -The same program would extract `*LXIX', instead of `*29*Oak*St.'. If +The same program would extract `*LXIX' instead of `*29*Oak*St.'. If you were expecting the program to print the address, you would be surprised. The moral is to choose your data layout and separator characters carefully to prevent such problems. (If the data is not in @@ -4946,11 +4956,11 @@ your field and record separators. Perhaps the most common use of a single character as the field separator occurs when processing the Unix system password file. On many Unix systems, each user has a separate entry in the system -password file, one line per user. The information in these lines is -separated by colons. 
The first field is the user's login name and the -second is the user's encrypted or shadow password. (A shadow password -is indicated by the presence of a single `x' in the second field.) A -password file entry might look like this: +password file, with one line per user. The information in these lines +is separated by colons. The first field is the user's login name and +the second is the user's encrypted or shadow password. (A shadow +password is indicated by the presence of a single `x' in the second +field.) A password file entry might look like this: arnold:x:2076:10:Arnold Robbins:/home/arnold:/bin/bash @@ -4978,15 +4988,14 @@ When you do this, `$1' is the same as `$0'. According to the POSIX standard, `awk' is supposed to behave as if each record is split into fields at the time it is read. In particular, this means that if you change the value of `FS' after a -record is read, the value of the fields (i.e., how they were split) +record is read, the values of the fields (i.e., how they were split) should reflect the old value of `FS', not the new one. However, many older implementations of `awk' do not work this way. Instead, they defer splitting the fields until a field is actually referenced. The fields are split using the _current_ value of `FS'! (d.c.) This behavior can be difficult to diagnose. The following -example illustrates the difference between the two methods. (The -`sed'(2) command prints just the first line of `/etc/passwd'.) +example illustrates the difference between the two methods: sed 1q /etc/passwd | awk '{ FS = ":" ; print $1 }' @@ -4999,6 +5008,8 @@ first line of the file, something like: root:x:0:0:Root:/: + (The `sed'(2) command prints just the first line of `/etc/passwd'.) + ---------- Footnotes ---------- (1) Thanks to Andrew Schorr for this tip. @@ -5089,9 +5100,9 @@ the built-in variable `FIELDWIDTHS'. Each number specifies the width of the field, _including_ columns between fields. 
If you want to ignore the columns between fields, you can specify the width as a separate field that is subsequently ignored. It is a fatal error to -supply a field width that is not a positive number. The following data -is the output of the Unix `w' utility. It is useful to illustrate the -use of `FIELDWIDTHS': +supply a field width that has a negative value. The following data is +the output of the Unix `w' utility. It is useful to illustrate the use +of `FIELDWIDTHS': 10:06pm up 21 days, 14:04, 23 users User tty login idle JCPU PCPU what @@ -5152,7 +5163,7 @@ run on a system with card readers is another story!) splitting again. Use `FS = FS' to make this happen, without having to know the current value of `FS'. In order to tell which kind of field splitting is in effect, use `PROCINFO["FS"]' (*note Auto-set::). The -value is `"FS"' if regular field splitting is being used, or it is +value is `"FS"' if regular field splitting is being used, or `"FIELDWIDTHS"' if fixed-width field splitting is being used: if (PROCINFO["FS"] == "FS") @@ -5185,10 +5196,10 @@ what they are, and not by what they are not. The most notorious such case is so-called "comma-separated values" (CSV) data. Many spreadsheet programs, for example, can export their data into text files, where each record is terminated with a newline, -and fields are separated by commas. If only commas separated the data, +and fields are separated by commas. If commas only separated the data, there wouldn't be an issue. The problem comes when one of the fields contains an _embedded_ comma. In such cases, most programs embed the -field in double quotes.(1) So we might have data like this: +field in double quotes.(1) So, we might have data like this: Robbins,Arnold,"1234 A Pretty Street, NE",MyTown,MyState,12345-6789,USA @@ -5255,9 +5266,9 @@ being used. provides an elegant solution for the majority of cases, and the `gawk' developers are satisfied with that. 
- As written, the regexp used for `FPAT' requires that each field have -a least one character. A straightforward modification (changing -changed the first `+' to `*') allows fields to be empty: + As written, the regexp used for `FPAT' requires that each field +contain at least one character. A straightforward modification +(changing the first `+' to `*') allows fields to be empty: FPAT = "([^,]*)|(\"[^\"]+\")" @@ -5265,9 +5276,8 @@ changed the first `+' to `*') allows fields to be empty: available for splitting regular strings (*note String Functions::). To recap, `gawk' provides three independent methods to split input -records into fields. `gawk' uses whichever mechanism was last chosen -based on which of the three variables--`FS', `FIELDWIDTHS', and -`FPAT'--was last assigned to. +records into fields. The mechanism used is based on which of the three +variables--`FS', `FIELDWIDTHS', or `FPAT'--was last assigned to. ---------- Footnotes ---------- @@ -5305,7 +5315,7 @@ empty; lines that contain only whitespace do not count.) `"\n\n+"' to `RS'. This regexp matches the newline at the end of the record and one or more blank lines after the record. In addition, a regular expression always matches the longest possible sequence when -there is a choice (*note Leftmost Longest::). So the next record +there is a choice (*note Leftmost Longest::). So, the next record doesn't start until the first nonblank line that follows--no matter how many blank lines appear in a row, they are considered one record separator. @@ -5317,12 +5327,12 @@ last record, the final newline is removed from the record. In the second case, this special processing is not done. (d.c.) Now that the input is separated into records, the second step is to -separate the fields in the record. One way to do this is to divide each -of the lines into fields in the normal manner. This happens by default -as the result of a special feature. 
When `RS' is set to the empty -string, _and_ `FS' is set to a single character, the newline character -_always_ acts as a field separator. This is in addition to whatever -field separations result from `FS'.(1) +separate the fields in the records. One way to do this is to divide +each of the lines into fields in the normal manner. This happens by +default as the result of a special feature. When `RS' is set to the +empty string _and_ `FS' is set to a single character, the newline +character _always_ acts as a field separator. This is in addition to +whatever field separations result from `FS'.(1) The original motivation for this special exception was probably to provide useful behavior in the default case (i.e., `FS' is equal to @@ -5330,17 +5340,17 @@ provide useful behavior in the default case (i.e., `FS' is equal to newline character to separate fields, because there is no way to prevent it. However, you can work around this by using the `split()' function to break up the record manually (*note String Functions::). -If you have a single character field separator, you can work around the +If you have a single-character field separator, you can work around the special feature in a different way, by making `FS' into a regexp for that single character. For example, if the field separator is a percent character, instead of `FS = "%"', use `FS = "[%]"'. Another way to separate fields is to put each field on a separate line: to do this, just set the variable `FS' to the string `"\n"'. -(This single character separator matches a single newline.) A +(This single-character separator matches a single newline.) A practical example of a data file organized this way might be a mailing -list, where each entry is separated by blank lines. Consider a mailing -list in a file named `addresses', which looks like this: +list, where blank lines separate the entries. 
Consider a mailing list +in a file named `addresses', which looks like this: Jane Doe 123 Main Street @@ -5423,7 +5433,7 @@ File: gawk.info, Node: Getline, Next: Read Timeout, Prev: Multiple Line, Up: So far we have been getting our input data from `awk''s main input stream--either the standard input (usually your keyboard, sometimes the -output from another program) or from the files specified on the command +output from another program) or the files specified on the command line. The `awk' language has a special built-in command called `getline' that can be used to read input under your explicit control. @@ -5561,7 +5571,7 @@ and produces these results: free The `getline' command used in this way sets only the variables `NR', -`FNR', and `RT' (and of course, VAR). The record is not split into +`FNR', and `RT' (and, of course, VAR). The record is not split into fields, so the values of the fields (including `$0') and the value of `NF' do not change. @@ -5571,8 +5581,8 @@ File: gawk.info, Node: Getline/File, Next: Getline/Variable/File, Prev: Getli 4.9.3 Using `getline' from a File --------------------------------- -Use `getline < FILE' to read the next record from FILE. Here FILE is a -string-valued expression that specifies the file name. `< FILE' is +Use `getline < FILE' to read the next record from FILE. Here, FILE is +a string-valued expression that specifies the file name. `< FILE' is called a "redirection" because it directs input to come from a different place. For example, the following program reads its input record from the file `secondary.input' when it encounters a first field @@ -5708,8 +5718,8 @@ all `awk' implementations. treatment of a construct like `"echo " "date" | getline'. Most versions, including the current version, treat it at as `("echo " "date") | getline'. (This is also how BWK `awk' behaves.) Some - versions changed and treated it as `"echo " ("date" | getline)'. - (This is how `mawk' behaves.) 
In short, _always_ use explicit + versions instead treat it as `"echo " ("date" | getline)'. (This + is how `mawk' behaves.) In short, _always_ use explicit parentheses, and then you won't have to worry. @@ -5745,15 +5755,16 @@ File: gawk.info, Node: Getline/Coprocess, Next: Getline/Variable/Coprocess, P 4.9.7 Using `getline' from a Coprocess -------------------------------------- -Input into `getline' from a pipe is a one-way operation. The command -that is started with `COMMAND | getline' only sends data _to_ your -`awk' program. +Reading input into `getline' from a pipe is a one-way operation. The +command that is started with `COMMAND | getline' only sends data _to_ +your `awk' program. On occasion, you might want to send data to another program for processing and then read the results back. `gawk' allows you to start a "coprocess", with which two-way communications are possible. This is done with the `|&' operator. Typically, you write data to the -coprocess first and then read results back, as shown in the following: +coprocess first and then read the results back, as shown in the +following: print "SOME QUERY" |& "db_server" "db_server" |& getline @@ -5815,7 +5826,7 @@ in mind: files. (d.c.) (See *note BEGIN/END::; also *note Auto-set::.) * Using `FILENAME' with `getline' (`getline < FILENAME') is likely - to be a source for confusion. `awk' opens a separate input stream + to be a source of confusion. `awk' opens a separate input stream from the current input file. However, by not using a variable, `$0' and `NF' are still updated. If you're doing this, it's probably by accident, and you should reconsider what it is you're @@ -5823,15 +5834,15 @@ in mind: * *note Getline Summary::, presents a table summarizing the `getline' variants and which variables they can affect. 
It is - worth noting that those variants which do not use redirection can + worth noting that those variants that do not use redirection can cause `FILENAME' to be updated if they cause `awk' to start reading a new input file. * If the variable being assigned is an expression with side effects, different versions of `awk' behave differently upon encountering end-of-file. Some versions don't evaluate the expression; many - versions (including `gawk') do. Here is an example, due to Duncan - Moore: + versions (including `gawk') do. Here is an example, courtesy of + Duncan Moore: BEGIN { system("echo 1 > f") @@ -5839,8 +5850,8 @@ in mind: print c } - Here, the side effect is the `++c'. Is `c' incremented if end of - file is encountered, before the element in `a' is assigned? + Here, the side effect is the `++c'. Is `c' incremented if + end-of-file is encountered before the element in `a' is assigned? `gawk' treats `getline' like a function call, and evaluates the expression `a[++c]' before attempting to read from `f'. However, @@ -5884,8 +5895,8 @@ This minor node describes a feature that is specific to `gawk'. You may specify a timeout in milliseconds for reading input from the keyboard, a pipe, or two-way communication, including TCP/IP sockets. -This can be done on a per input, command, or connection basis, by -setting a special element in the `PROCINFO' array (*note Auto-set::): +This can be done on a per-input, per-command, or per-connection basis, +by setting a special element in the `PROCINFO' array (*note Auto-set::): PROCINFO["input_name", "READ_TIMEOUT"] = TIMEOUT IN MILLISECONDS @@ -5909,7 +5920,7 @@ for more than five seconds: print $0 `gawk' terminates the read operation if input does not arrive after -waiting for the timeout period, returns failure and sets `ERRNO' to an +waiting for the timeout period, returns failure, and sets `ERRNO' to an appropriate string value. A negative or zero value for the timeout is the same as specifying no timeout at all. 
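As a rough illustration of the read timeout described in the hunk above, the following shell sketch wraps a single read in a small helper. The helper name `read_with_timeout', the `ms' variable, and the fallback to plain `awk' are assumptions made for this example and do not come from the manual. The timeout only takes effect when the command resolves to `gawk'; other implementations treat `PROCINFO' as an ordinary array and simply wait for input.

```shell
# Hypothetical helper: read one line from standard input, giving up
# after the given number of milliseconds when run under gawk.
read_with_timeout() {
    # Prefer gawk; fall back to awk, where the timeout has no effect.
    awk_cmd=$(command -v gawk || command -v awk)
    "$awk_cmd" -v ms="$1" 'BEGIN {
        # gawk-specific: per-input timeout, in milliseconds
        PROCINFO["/dev/stdin", "READ_TIMEOUT"] = ms + 0
        if ((getline line < "/dev/stdin") > 0)
            print "got: " line
        else
            print "no input: " ERRNO   # gawk sets ERRNO when the read fails
    }'
}

echo hello | read_with_timeout 2000
```

With input that arrives late, for example `(sleep 5; echo late) | read_with_timeout 200', `gawk' should report a failed read instead of waiting for the data; the exact `ERRNO' text depends on the system.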
@@ -5949,7 +5960,7 @@ input to arrive: environment variable exists, `gawk' uses its value to initialize the timeout value. The exclusive use of the environment variable to specify timeout has the disadvantage of not being able to control it on -a per command or connection basis. +a per-command or per-connection basis. `gawk' considers a timeout event to be an error even though the attempt to read from the underlying device may succeed in a later @@ -6017,7 +6028,7 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li * `gawk' sets `RT' to the text matched by `RS'. * After splitting the input into records, `awk' further splits the - record into individual fields, named `$1', `$2', and so on. `$0' + records into individual fields, named `$1', `$2', and so on. `$0' is the whole record, and `NF' indicates how many fields there are. The default way to split fields is between whitespace characters. @@ -6031,19 +6042,21 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li * Field splitting is more complicated than record splitting: - Field separator value Fields are split ... `awk' / - `gawk' + Field separator value Fields are split ... 
`awk' / + `gawk' ---------------------------------------------------------------------- - `FS == " "' On runs of whitespace `awk' - `FS == ANY SINGLE On that character `awk' - CHARACTER' - `FS == REGEXP' On text matching the regexp `awk' - `FS == ""' Each individual character is `gawk' - a separate field - `FIELDWIDTHS == LIST OF Based on character position `gawk' - COLUMNS' - `FPAT == REGEXP' On the text surrounding text `gawk' - matching the regexp + `FS == " "' On runs of whitespace `awk' + `FS == ANY SINGLE On that character `awk' + CHARACTER' + `FS == REGEXP' On text matching the `awk' + regexp + `FS == ""' Such that each individual `gawk' + character is a separate + field + `FIELDWIDTHS == LIST OF Based on character `gawk' + COLUMNS' position + `FPAT == REGEXP' On the text surrounding `gawk' + text matching the regexp * Using `FS = "\n"' causes the entire record to be a single field (assuming that newlines separate records). @@ -6053,12 +6066,11 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li * Use `PROCINFO["FS"]' to see how fields are being split. - * Use `getline' in its various forms to read additional records, - from the default input stream, from a file, or from a pipe or - coprocess. + * Use `getline' in its various forms to read additional records from + the default input stream, from a file, or from a pipe or coprocess. - * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to timeout for - FILE. + * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out + for FILE. * Directories on the command line are fatal for standard `awk'; `gawk' ignores them if not in POSIX mode. @@ -6118,6 +6130,7 @@ function. `gawk' allows access to inherited file descriptors. * Close Files And Pipes:: Closing Input and Output Files and Pipes. +* Nonfatal:: Enabling Nonfatal Output. * Output Summary:: Output summary. * Output Exercises:: Exercises. @@ -6152,7 +6165,7 @@ you will probably get an error. 
Keep in mind that a space is printed between any two items. Note that the `print' statement is a statement and not an -expression--you can't use it in the pattern part of a PATTERN-ACTION +expression--you can't use it in the pattern part of a pattern-action statement, for example. @@ -6300,7 +6313,7 @@ File: gawk.info, Node: OFMT, Next: Printf, Prev: Output Separators, Up: Prin =========================================== When printing numeric values with the `print' statement, `awk' -internally converts the number to a string of characters and prints +internally converts each number to a string of characters and prints that string. `awk' uses the `sprintf()' function to do this conversion (*note String Functions::). For now, it suffices to say that the `sprintf()' function accepts a "format specification" that tells it how @@ -6355,7 +6368,7 @@ A simple `printf' statement looks like this: As for `print', the entire list of arguments may optionally be enclosed in parentheses. Here too, the parentheses are necessary if any of the -item expressions use the `>' relational operator; otherwise, it can be +item expressions uses the `>' relational operator; otherwise, it can be confused with an output redirection (*note Redirection::). The difference between `printf' and `print' is the FORMAT argument. @@ -6382,7 +6395,7 @@ statements. For example: > }' -| Don't Panic! -Here, neither the `+' nor the `OUCH!' appear in the output message. +Here, neither the `+' nor the `OUCH!' appears in the output message. File: gawk.info, Node: Control Letters, Next: Format Modifiers, Prev: Basic Printf, Up: Printf @@ -6421,7 +6434,7 @@ width. Here is a list of the format-control letters: (The `%i' specification is for compatibility with ISO C.) `%e', `%E' - Print a number in scientific (exponential) notation; for example: + Print a number in scientific (exponential) notation. For example: printf "%4.3e\n", 1950 @@ -6446,7 +6459,7 @@ width. 
Here is a list of the format-control letters: Math Definitions::). `%F' - Like `%f' but the infinity and "not a number" values are spelled + Like `%f', but the infinity and "not a number" values are spelled using uppercase letters. The `%F' format is a POSIX extension to ISO C; not all systems @@ -6515,7 +6528,7 @@ which they may appear: messages at runtime. *Note Printf Ordering::, which describes how and why to use positional specifiers. For now, we ignore them. -`- (Minus)' +`-' (Minus) The minus sign, used before the width modifier (see later on in this list), says to left-justify the argument within its specified width. Normally, the argument is printed right-justified in the @@ -6525,7 +6538,7 @@ which they may appear: prints `foo*'. -`SPACE' +SPACE For numeric conversions, prefix positive values with a space and negative values with a minus sign. @@ -6570,7 +6583,7 @@ which they may appear: programs. For information on appropriate quoting tricks, see *note Quoting::. -`WIDTH' +WIDTH This is a number specifying the desired minimum width of a field. Inserting any number between the `%' sign and the format-control character forces the field to expand to this width. The default @@ -6640,7 +6653,7 @@ string, like so: s = "abcdefg" printf "%" w "." p "s\n", s -This is not particularly easy to read but it does work. +This is not particularly easy to read, but it does work. C programmers may be used to supplying additional modifiers (`h', `j', `l', `L', `t', and `z') in `printf' format strings. These are not @@ -6679,7 +6692,7 @@ an aligned two-column table of names and phone numbers, as shown here: -| Jean-Paul 555-2127 In this case, the phone numbers had to be printed as strings because -the numbers are separated by a dash. Printing the phone numbers as +the numbers are separated by dashes. Printing the phone numbers as numbers would have produced just the first three digits: `555'. This would have been pretty confusing. 
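The remark above about phone numbers can be checked with a short sketch. It assumes only a POSIX `awk'; the helper name `phone_formats' and the `phone' variable are hypothetical names chosen for this example. The same value is formatted once with `%s' and once with `%d', and the numeric conversion stops at the first dash.

```shell
# Hypothetical helper: print a dash-separated phone number
# first as a string and then as a number.
phone_formats() {
    awk -v phone="$1" 'BEGIN {
        printf "as string: %s\n", phone   # keeps the whole text
        printf "as number: %d\n", phone   # only the leading digits convert
    }'
}

phone_formats 555-5553
```

Run on `555-5553', the second line of output shows just `555', which is why the manual's example prints the phone numbers as strings.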
@@ -6727,7 +6740,7 @@ output, usually the screen. Both `print' and `printf' can also send their output to other places. This is called "redirection". NOTE: When `--sandbox' is specified (*note Options::), redirecting - output to files, pipes and coprocesses is disabled. + output to files, pipes, and coprocesses is disabled. A redirection appears after the `print' or `printf' statement. Redirections in `awk' are written just like redirections in shell @@ -6767,7 +6780,7 @@ work identically for `printf': Each output file contains one name or number per line. `print ITEMS >> OUTPUT-FILE' - This redirection prints the items into the pre-existing output file + This redirection prints the items into the preexisting output file named OUTPUT-FILE. The difference between this and the single-`>' redirection is that the old contents (if any) of OUTPUT-FILE are not erased. Instead, the `awk' output is appended to the file. @@ -6815,8 +6828,8 @@ work identically for `printf': `print ITEMS |& COMMAND' This redirection prints the items to the input of COMMAND. The difference between this and the single-`|' redirection is that the - output from COMMAND can be read with `getline'. Thus COMMAND is a - "coprocess", which works together with, but subsidiary to, the + output from COMMAND can be read with `getline'. Thus, COMMAND is + a "coprocess", which works together with but is subsidiary to the `awk' program. This feature is a `gawk' extension, and is not available in POSIX @@ -6840,7 +6853,7 @@ a file, and then to use `>>' for subsequent output: This is indeed how redirections must be used from the shell. But in `awk', it isn't necessary. In this kind of case, a program should use `>' for all the `print' statements, because the output file is only -opened once. (It happens that if you mix `>' and `>>' that output is +opened once. (It happens that if you mix `>' and `>>' output is produced in the expected order. 
However, mixing the operators for the same file is definitely poor style, and is confusing to readers of your program.) @@ -6873,14 +6886,14 @@ command lines to be fed to the shell. File: gawk.info, Node: Special FD, Next: Special Files, Prev: Redirection, Up: Printing -5.7 Special Files for Standard Pre-Opened Data Streams -====================================================== +5.7 Special Files for Standard Preopened Data Streams +===================================================== Running programs conventionally have three input and output streams already available to them for reading and writing. These are known as the "standard input", "standard output", and "standard error output". -These open streams (and any other open file or pipe) are often referred -to by the technical term "file descriptors". +These open streams (and any other open files or pipes) are often +referred to by the technical term "file descriptors". These streams are, by default, connected to your keyboard and screen, but they are often redirected with the shell, via the `<', `<<', @@ -6905,7 +6918,7 @@ error messages to the screen, like this: (`/dev/tty' is a special file supplied by the operating system that is connected to your keyboard and screen. It represents the "terminal,"(1) which on modern systems is a keyboard and screen, not a serial console.) -This generally has the same effect but not always: although the +This generally has the same effect, but not always: although the standard error stream is usually the screen, it can be redirected; when that happens, writing to the screen is not correct. In fact, if `awk' is run from a background job, it may not have a terminal at all. Then @@ -6932,7 +6945,7 @@ becomes: print "Serious error detected!" > "/dev/stderr" - Note the use of quotes around the file name. Like any other + Note the use of quotes around the file name. Like with any other redirection, the value must be a string. 
It is a common error to omit the quotes, which leads to confusing results. @@ -6948,7 +6961,7 @@ option (*note Options::). File: gawk.info, Node: Special Files, Next: Close Files And Pipes, Prev: Special FD, Up: Printing -5.8 Special File Names in `gawk' +5.8 Special File names in `gawk' ================================ Besides access to standard input, standard output, and standard error, @@ -6965,7 +6978,7 @@ there are special file names reserved for TCP/IP networking. File: gawk.info, Node: Other Inherited Files, Next: Special Network, Up: Special Files -5.8.1 Accessing Other Open Files With `gawk' +5.8.1 Accessing Other Open Files with `gawk' -------------------------------------------- Besides the `/dev/stdin', `/dev/stdout', and `/dev/stderr' special file @@ -7009,13 +7022,13 @@ mentioned here only for completeness. Full discussion is delayed until File: gawk.info, Node: Special Caveats, Prev: Special Network, Up: Special Files -5.8.3 Special File Name Caveats +5.8.3 Special File name Caveats ------------------------------- Here are some things to bear in mind when using the special file names that `gawk' provides: - * Recognition of the file names for the three standard pre-opened + * Recognition of the file names for the three standard preopened files is disabled only in POSIX mode. * Recognition of the other special file names is disabled if `gawk' @@ -7024,14 +7037,14 @@ that `gawk' provides: * `gawk' _always_ interprets these special file names. For example, using `/dev/fd/4' for output actually writes on file descriptor 4, - and not on a new file descriptor that is `dup()''ed from file + and not on a new file descriptor that is `dup()'ed from file descriptor 4. Most of the time this does not matter; however, it is important to _not_ close any of the files related to file descriptors 0, 1, and 2. Doing so results in unpredictable behavior. 
-File: gawk.info, Node: Close Files And Pipes, Next: Output Summary, Prev: Special Files, Up: Printing +File: gawk.info, Node: Close Files And Pipes, Next: Nonfatal, Prev: Special Files, Up: Printing 5.9 Closing Input and Output Redirections ========================================= @@ -7184,8 +7197,8 @@ closing input or output files, respectively. This value is zero if the close succeeds, or -1 if it fails. The POSIX standard is very vague; it says that `close()' returns -zero on success and nonzero otherwise. In general, different -implementations vary in what they report when closing pipes; thus the +zero on success and a nonzero value otherwise. In general, different +implementations vary in what they report when closing pipes; thus, the return value cannot be used portably. (d.c.) In POSIX mode (*note Options::), `gawk' just returns zero when closing a pipe. @@ -7200,9 +7213,68 @@ call. See the system manual pages for information on how to decode this value. -File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Close Files And Pipes, Up: Printing +File: gawk.info, Node: Nonfatal, Next: Output Summary, Prev: Close Files And Pipes, Up: Printing + +5.10 Enabling Nonfatal Output +============================= + +This minor node describes a `gawk'-specific feature. + + In standard `awk', output with `print' or `printf' to a nonexistent +file, or some other I/O error (such as filling up the disk) is a fatal +error. + + $ gawk 'BEGIN { print "hi" > "/no/such/file" }' + error--> gawk: cmd. line:1: fatal: can't redirect to `/no/such/file' (No such file or directory) + + `gawk' makes it possible to detect that an error has occurred, +allowing you to possibly recover from the error, or at least print an +error message of your choosing before exiting. You can do this in one +of two ways: + + * For all output files, by assigning any value to + `PROCINFO["NONFATAL"]'. + + * On a per-file basis, by assigning any value to `PROCINFO[FILENAME, + "NONFATAL"]'. 
Here, FILENAME is the name of the file to which you + wish output to be nonfatal. + + Once you have enabled nonfatal output, you must check `ERRNO' after +every relevant `print' or `printf' statement to see if something went +wrong. It is also a good idea to initialize `ERRNO' to zero before +attempting the output. For example: + + $ gawk ' + > BEGIN { + > PROCINFO["NONFATAL"] = 1 + > ERRNO = 0 + > print "hi" > "/no/such/file" + > if (ERRNO) { + > print("Output failed:", ERRNO) > "/dev/stderr" + > exit 1 + > } + > }' + error--> Output failed: No such file or directory + + Here, `gawk' did not produce a fatal error; instead it let the `awk' +program code detect the problem and handle it. + + This mechanism works also for standard output and standard error. +For standard output, you may use `PROCINFO["-", "NONFATAL"]' or +`PROCINFO["/dev/stdout", "NONFATAL"]'. For standard error, use +`PROCINFO["/dev/stderr", "NONFATAL"]'. + + When attempting to open a TCP/IP socket (*note TCP/IP Networking::), +`gawk' tries multiple times. The `GAWK_SOCK_RETRIES' environment +variable (*note Other Environment Variables::) allows you to override +`gawk''s builtin default number of attempts. However, once nonfatal +I/O is enabled for a given socket, `gawk' only retries once, relying on +`awk'-level code to notice that there was a problem. + + +File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Nonfatal, Up: Printing -5.10 Summary +5.11 Summary ============ * The `print' statement prints comma-separated expressions. Each @@ -7211,8 +7283,8 @@ File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Close Fi numeric values for the `print' statement. * The `printf' statement provides finer-grained control over output, - with format control letters for different data types and various - flags that modify the behavior of the format control letters. 
+ with format-control letters for different data types and various + flags that modify the behavior of the format-control letters. * Output from both `print' and `printf' may be redirected to files, pipes, and coprocesses. @@ -7224,11 +7296,16 @@ File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Close Fi For coprocesses, it is possible to close only one direction of the communications. + * Normally errors with `print' or `printf' are fatal. `gawk' lets + you make output errors be nonfatal either for all files or on a + per-file basis. You must then check for errors after every + relevant output statement. + File: gawk.info, Node: Output Exercises, Prev: Output Summary, Up: Printing -5.11 Exercises +5.12 Exercises ============== 1. Rewrite the program: @@ -7263,9 +7340,9 @@ value to a variable or a field by using an assignment operator. An expression can serve as a pattern or action statement on its own. Most other kinds of statements contain one or more expressions that specify the data on which to operate. As in other languages, -expressions in `awk' include variables, array references, constants, -and function calls, as well as combinations of these with various -operators. +expressions in `awk' can include variables, array references, +constants, and function calls, as well as combinations of these with +various operators. * Menu: @@ -7284,8 +7361,8 @@ File: gawk.info, Node: Values, Next: All Operators, Up: Expressions ========================================= Expressions are built up from values and the operations performed upon -them. This minor node describes the elementary objects which provide -the values used in expressions. +them. This minor node describes the elementary objects that provide the +values used in expressions. 
* Menu: @@ -7330,14 +7407,14 @@ the same value: 1.05e+2 1050e-1 - A string constant consists of a sequence of characters enclosed in + A "string constant" consists of a sequence of characters enclosed in double quotation marks. For example: "parrot" represents the string whose contents are `parrot'. Strings in `gawk' can be of any length, and they can contain any of the possible -eight-bit ASCII characters including ASCII NUL (character code zero). +eight-bit ASCII characters, including ASCII NUL (character code zero). Other `awk' implementations may have difficulty with some character codes. @@ -7357,14 +7434,14 @@ File: gawk.info, Node: Nondecimal-numbers, Next: Regexp Constants, Prev: Scal In `awk', all numbers are in decimal (i.e., base 10). Many other programming languages allow you to specify numbers in other bases, often octal (base 8) and hexadecimal (base 16). In octal, the numbers go 0, -1, 2, 3, 4, 5, 6, 7, 10, 11, 12, and so on. Just as `11', in decimal, -is 1 times 10 plus 1, so `11', in octal, is 1 times 8, plus 1. This -equals 9 in decimal. In hexadecimal, there are 16 digits. Because the -everyday decimal number system only has ten digits (`0'-`9'), the -letters `a' through `f' are used to represent the rest. (Case in the -letters is usually irrelevant; hexadecimal `a' and `A' have the same -value.) Thus, `11', in hexadecimal, is 1 times 16 plus 1, which equals -17 in decimal. +1, 2, 3, 4, 5, 6, 7, 10, 11, 12, and so on. Just as `11' in decimal is +1 times 10 plus 1, so `11' in octal is 1 times 8 plus 1. This equals 9 +in decimal. In hexadecimal, there are 16 digits. Because the everyday +decimal number system only has ten digits (`0'-`9'), the letters `a' +through `f' are used to represent the rest. (Case in the letters is +usually irrelevant; hexadecimal `a' and `A' have the same value.) +Thus, `11' in hexadecimal is 1 times 16 plus 1, which equals 17 in +decimal. Just by looking at plain `11', you can't tell what base it's in. 
So, in C, C++, and other languages derived from C, there is a special @@ -7372,13 +7449,13 @@ notation to signify the base. Octal numbers start with a leading `0', and hexadecimal numbers start with a leading `0x' or `0X': `11' - Decimal value 11. + Decimal value 11 `011' - Octal 11, decimal value 9. + Octal 11, decimal value 9 `0x11' - Hexadecimal 11, decimal value 17. + Hexadecimal 11, decimal value 17 This example shows the difference: @@ -7397,11 +7474,11 @@ really need to do this, use the `--non-decimal-data' command-line option; *note Nondecimal Data::.) If you have octal or hexadecimal data, you can use the `strtonum()' function (*note String Functions::) to convert the data into a number. Most of the time, you will want to -use octal or hexadecimal constants when working with the built-in bit -manipulation functions; see *note Bitwise Functions::, for more +use octal or hexadecimal constants when working with the built-in +bit-manipulation functions; see *note Bitwise Functions::, for more information. - Unlike some early C implementations, `8' and `9' are not valid in + Unlike in some early C implementations, `8' and `9' are not valid in octal constants. For example, `gawk' treats `018' as decimal 18: $ gawk 'BEGIN { print "021 is", 021 ; print 018 }' @@ -7428,12 +7505,12 @@ File: gawk.info, Node: Regexp Constants, Prev: Nondecimal-numbers, Up: Consta 6.1.1.3 Regular Expression Constants .................................... -A regexp constant is a regular expression description enclosed in +A "regexp constant" is a regular expression description enclosed in slashes, such as `/^beginning and end$/'. Most regexps used in `awk' programs are constant, but the `~' and `!~' matching operators can also match computed or dynamic regexps (which are typically just ordinary -strings or variables that contain a regexp, but could be a more complex -expression). +strings or variables that contain a regexp, but could be more complex +expressions). 
File: gawk.info, Node: Using Constant Regexps, Next: Variables, Prev: Constants, Up: Values @@ -7485,7 +7562,7 @@ and `patsplit()' functions (*note String Functions::). Modern implementations of `awk', including `gawk', allow the third argument of `split()' to be a regexp constant, but some older implementations do not. (d.c.) Because some built-in functions accept regexp constants -as arguments, it can be confusing when attempting to use regexp +as arguments, confusion can arise when attempting to use regexp constants as arguments to user-defined functions (*note User-defined::). For example: @@ -7508,10 +7585,11 @@ User-defined::). For example: In this example, the programmer wants to pass a regexp constant to the user-defined function `mysub()', which in turn passes it on to either `sub()' or `gsub()'. However, what really happens is that the -`pat' parameter is either one or zero, depending upon whether or not -`$0' matches `/hi/'. `gawk' issues a warning when it sees a regexp -constant used as a parameter to a user-defined function, because -passing a truth value in this way is probably not what was intended. +`pat' parameter is assigned a value of either one or zero, depending +upon whether or not `$0' matches `/hi/'. `gawk' issues a warning when +it sees a regexp constant used as a parameter to a user-defined +function, because passing a truth value in this way is probably not +what was intended. File: gawk.info, Node: Variables, Next: Conversion, Prev: Using Constant Regexps, Up: Values @@ -7519,7 +7597,7 @@ File: gawk.info, Node: Variables, Next: Conversion, Prev: Using Constant Rege 6.1.3 Variables --------------- -Variables are ways of storing values at one point in your program for +"Variables" are ways of storing values at one point in your program for use later in another part of your program. They can be manipulated entirely within the program text, and they can also be assigned values on the `awk' command line. @@ -7548,14 +7626,14 @@ variables. 
A variable name is a valid expression by itself; it represents the variable's current value. Variables are given new values with -"assignment operators", "increment operators", and "decrement -operators". *Note Assignment Ops::. In addition, the `sub()' and -`gsub()' functions can change a variable's value, and the `match()', -`split()', and `patsplit()' functions can change the contents of their -array parameters. *Note String Functions::. +"assignment operators", "increment operators", and "decrement operators" +(*note Assignment Ops::). In addition, the `sub()' and `gsub()' +functions can change a variable's value, and the `match()', `split()', +and `patsplit()' functions can change the contents of their array +parameters (*note String Functions::). A few variables have special built-in meanings, such as `FS' (the -field separator), and `NF' (the number of fields in the current input +field separator) and `NF' (the number of fields in the current input record). *Note Built-in Variables::, for a list of the predefined variables. These predefined variables can be used and assigned just like all other variables, but their values are also used or changed @@ -7752,7 +7830,7 @@ point, so the default behavior was restored to use a period as the decimal point character. You can use the `--use-lc-numeric' option (*note Options::) to force `gawk' to use the locale's decimal point character. (`gawk' also uses the locale's decimal point character when -in POSIX mode, either via `--posix', or the `POSIXLY_CORRECT' +in POSIX mode, either via `--posix' or the `POSIXLY_CORRECT' environment variable, as shown previously.) *note table-locale-affects:: describes the cases in which the @@ -7768,10 +7846,10 @@ Input Use period Use locale Table 6.1: Locale decimal point versus a period - Finally, modern day formal standards and IEEE standard floating-point -representation can have an unusual but important effect on the way -`gawk' converts some special string values to numbers. 
The details are -presented in *note POSIX Floating Point Problems::. + Finally, modern-day formal standards and the IEEE standard +floating-point representation can have an unusual but important effect +on the way `gawk' converts some special string values to numbers. The +details are presented in *note POSIX Floating Point Problems::. File: gawk.info, Node: All Operators, Next: Truth Values and Conditions, Prev: Values, Up: Expressions @@ -7779,7 +7857,7 @@ File: gawk.info, Node: All Operators, Next: Truth Values and Conditions, Prev 6.2 Operators: Doing Something with Values ========================================== -This minor node introduces the "operators" which make use of the values +This minor node introduces the "operators" that make use of the values provided by constants and variables. * Menu: @@ -7960,7 +8038,7 @@ you'll get. ---------- Footnotes ---------- - (1) It happens that BWK `awk', `gawk' and `mawk' all "get it right," + (1) It happens that BWK `awk', `gawk', and `mawk' all "get it right," but you should not rely on this. @@ -8077,7 +8155,7 @@ righthand expression. For example: The indices of `bar' are practically guaranteed to be different, because `rand()' returns different values each time it is called. (Arrays and the `rand()' function haven't been covered yet. *Note Arrays::, and -*note Numeric Functions::, for more information). This example +*note Numeric Functions::, for more information.) This example illustrates an important fact about assignment operators: the lefthand expression is only evaluated _once_. @@ -8095,14 +8173,14 @@ converted to a number. Operator Effect -------------------------------------------------------------------------- -LVALUE `+=' INCREMENT Add INCREMENT to the value of LVALUE -LVALUE `-=' DECREMENT Subtract DECREMENT from the value of LVALUE -LVALUE `*=' Multiply the value of LVALUE by COEFFICIENT +LVALUE `+=' INCREMENT Add INCREMENT to the value of LVALUE. 
+LVALUE `-=' DECREMENT Subtract DECREMENT from the value of LVALUE. +LVALUE `*=' Multiply the value of LVALUE by COEFFICIENT. COEFFICIENT -LVALUE `/=' DIVISOR Divide the value of LVALUE by DIVISOR -LVALUE `%=' MODULUS Set LVALUE to its remainder by MODULUS -LVALUE `^=' POWER -LVALUE `**=' POWER Raise LVALUE to the power POWER (c.e.) +LVALUE `/=' DIVISOR Divide the value of LVALUE by DIVISOR. +LVALUE `%=' MODULUS Set LVALUE to its remainder by MODULUS. +LVALUE `^=' POWER Raise LVALUE to the power POWER. +LVALUE `**=' POWER Raise LVALUE to the power POWER. (c.e.) Table 6.2: Arithmetic assignment operators @@ -8187,8 +8265,8 @@ is a summary of increment and decrement expressions: Operator Evaluation Order - Doctor, doctor! It hurts when I do this! - So don't do that! -- Groucho Marx + Doctor, it hurts when I do this! + Then don't do that! -- Groucho Marx What happens for something like the following? @@ -8203,7 +8281,7 @@ Or something even stranger? In other words, when do the various side effects prescribed by the postfix operators (`b++') take effect? When side effects happen is -"implementation defined". In other words, it is up to the particular +"implementation-defined". In other words, it is up to the particular version of `awk'. The result for the first example may be 12 or 13, and for the second, it may be 22 or 23. @@ -8218,7 +8296,7 @@ File: gawk.info, Node: Truth Values and Conditions, Next: Function Calls, Pre =============================== In certain contexts, expression values also serve as "truth values"; -(i.e., they determine what should happen next as the program runs). This +i.e., they determine what should happen next as the program runs. This minor node describes how `awk' defines "true" and "false" and how values are compared. @@ -8272,10 +8350,10 @@ File: gawk.info, Node: Typing and Comparison, Next: Boolean Ops, Prev: Truth The Guide is definitive. Reality is frequently inaccurate. 
-- Douglas Adams, `The Hitchhiker's Guide to the Galaxy' - Unlike other programming languages, `awk' variables do not have a -fixed type. Instead, they can be either a number or a string, depending -upon the value that is assigned to them. We look now at how variables -are typed, and how `awk' compares variables. + Unlike in other programming languages, in `awk' variables do not +have a fixed type. Instead, they can be either a number or a string, +depending upon the value that is assigned to them. We look now at how +variables are typed, and how `awk' compares variables. * Menu: @@ -8296,16 +8374,16 @@ of the variable is important because the types of two variables determine how they are compared. Variable typing follows these rules: * A numeric constant or the result of a numeric operation has the - NUMERIC attribute. + "numeric" attribute. * A string constant or the result of a string operation has the - STRING attribute. + "string" attribute. * Fields, `getline' input, `FILENAME', `ARGV' elements, `ENVIRON' elements, and the elements of an array created by `match()', `split()', and `patsplit()' that are numeric strings have the - STRNUM attribute. Otherwise, they have the STRING attribute. - Uninitialized variables also have the STRNUM attribute. + "strnum" attribute. Otherwise, they have the "string" attribute. + Uninitialized variables also have the "strnum" attribute. * Attributes propagate across assignments but are not changed by any use. @@ -8347,12 +8425,13 @@ constant, then a string comparison is performed. Otherwise, a numeric comparison is performed. This point bears additional emphasis: All user input is made of -characters, and so is first and foremost of STRING type; input strings -that look numeric are additionally given the STRNUM attribute. Thus, -the six-character input string ` +3.14' receives the STRNUM attribute. 
+characters, and so is first and foremost of string type; input strings +that look numeric are additionally given the strnum attribute. Thus, +the six-character input string ` +3.14' receives the strnum attribute. In contrast, the eight characters `" +3.14"' appearing in program text comprise a string constant. The following examples print `1' when the -comparison between the two different constants is true, `0' otherwise: +comparison between the two different constants is true, and `0' +otherwise: $ echo ' +3.14' | awk '{ print($0 == " +3.14") }' True -| 1 @@ -8451,7 +8530,7 @@ comparison is: -| false the result is `false' because both `$1' and `$2' are user input. They -are numeric strings--therefore both have the STRNUM attribute, +are numeric strings--therefore both have the strnum attribute, dictating a numeric comparison. The purpose of the comparison rules and the use of numeric strings is to attempt to produce the behavior that is "least surprising," while still "doing the right thing." @@ -8510,7 +8589,7 @@ is an example to illustrate the difference, in an `en_US.UTF-8' locale: ---------- Footnotes ---------- (1) Technically, string comparison is supposed to behave the same -way as if the strings are compared with the C `strcoll()' function. +way as if the strings were compared with the C `strcoll()' function. File: gawk.info, Node: Boolean Ops, Next: Conditional Exp, Prev: Typing and Comparison, Up: Truth Values and Conditions @@ -8573,7 +8652,7 @@ Boolean operators are: The `&&' and `||' operators are called "short-circuit" operators because of the way they work. Evaluation of the full expression is -"short-circuited" if the result can be determined part way through its +"short-circuited" if the result can be determined partway through its evaluation. 
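   The short-circuit behavior described above can be sketched as follows. This is an illustrative example rather than text from the patch; the shell function name `safe_ratio' and the variables `n' and `d' are hypothetical. Because `&&' stops as soon as its left operand is false, the division is never attempted when the divisor is zero.

```shell
# Sketch only: `&&' skips its right-hand operand when the left one is
# false, so no division by zero occurs. `safe_ratio', `n', and `d' are
# hypothetical names.
safe_ratio() {
    awk -v n="$1" -v d="$2" 'BEGIN {
        if (d != 0 && n / d > 1)
            print "ratio above one"
        else
            print "not above one"
    }'
}

safe_ratio 10 4
safe_ratio 10 0
```

   Without short-circuit evaluation, the second call would attempt to divide by zero, which `awk' implementations typically treat as a fatal error.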
Statements that end with `&&' or `||' can be continued simply by @@ -8626,15 +8705,15 @@ File: gawk.info, Node: Conditional Exp, Prev: Boolean Ops, Up: Truth Values a A "conditional expression" is a special kind of expression that has three operands. It allows you to use one expression's value to select -one of two other expressions. The conditional expression is the same -as in the C language, as shown here: +one of two other expressions. The conditional expression in `awk' is +the same as in the C language, as shown here: SELECTOR ? IF-TRUE-EXP : IF-FALSE-EXP There are three subexpressions. The first, SELECTOR, is always computed first. If it is "true" (not zero or not null), then -IF-TRUE-EXP is computed next and its value becomes the value of the -whole expression. Otherwise, IF-FALSE-EXP is computed next and its +IF-TRUE-EXP is computed next, and its value becomes the value of the +whole expression. Otherwise, IF-FALSE-EXP is computed next, and its value becomes the value of the whole expression. For example, the following expression produces the absolute value of `x': @@ -8668,7 +8747,7 @@ A "function" is a name for a particular calculation. This enables you to ask for it by name at any point in the program. For example, the function `sqrt()' computes the square root of a number. - A fixed set of functions are "built-in", which means they are + A fixed set of functions are "built in", which means they are available in every `awk' program. The `sqrt()' function is one of these. *Note Built-in::, for a list of built-in functions and their descriptions. In addition, you can define functions for use in your @@ -8803,7 +8882,7 @@ precedence: Increment, decrement. `^ **' - Exponentiation. These operators group right-to-left. + Exponentiation. These operators group right to left. `+ - !' Unary plus, minus, logical "not." @@ -8830,7 +8909,7 @@ String concatenation operand of another operator. 
As a result, it does not make sense to use a redirection operator near another operator of lower precedence without parentheses. Such combinations (e.g., `print - foo > a ? b : c'), result in syntax errors. The correct way to + foo > a ? b : c') result in syntax errors. The correct way to write this statement is `print foo > (a ? b : c)'. `~ !~' @@ -8840,16 +8919,16 @@ String concatenation Array membership. `&&' - Logical "and". + Logical "and." `||' - Logical "or". + Logical "or." `?:' - Conditional. This operator groups right-to-left. + Conditional. This operator groups right to left. `= += -= *= /= %= ^= **=' - Assignment. These operators group right-to-left. + Assignment. These operators group right to left. NOTE: The `|&', `**', and `**=' operators are not specified by POSIX. For maximum portability, do not use them. @@ -8917,24 +8996,24 @@ File: gawk.info, Node: Expressions Summary, Prev: Locales, Up: Expressions * `awk' provides the usual arithmetic operators (addition, subtraction, multiplication, division, modulus), and unary plus - and minus. It also provides comparison operators, boolean - operators, array membership testing, and regexp matching - operators. String concatenation is accomplished by placing two - expressions next to each other; there is no explicit operator. - The three-operand `?:' operator provides an "if-else" test within - expressions. + and minus. It also provides comparison operators, Boolean + operators, an array membership testing operator, and regexp + matching operators. String concatenation is accomplished by + placing two expressions next to each other; there is no explicit + operator. The three-operand `?:' operator provides an "if-else" + test within expressions. * Assignment operators provide convenient shorthands for common arithmetic operations. - * In `awk', a value is considered to be true if it is non-zero _or_ + * In `awk', a value is considered to be true if it is nonzero _or_ non-null. Otherwise, the value is false. 
* A variable's type is set upon each assignment and may change over its lifetime. The type determines how it behaves in comparisons (string or numeric). - * Function calls return a value which may be used as part of a larger + * Function calls return a value that may be used as part of a larger expression. Expressions used to pass parameter values are fully evaluated before the function is called. `awk' provides built-in and user-defined functions; this is described in *note Functions::. @@ -9066,9 +9145,10 @@ accepts any record with a first field that contains `li': -| 555-5553 -| 555-6699 - pattern. The expression `/li/' has the value one if `li' appears in -the current input record. Thus, as a pattern, `/li/' matches any record -containing `li'. + A regexp constant as a pattern is also a special case of an +expression pattern. The expression `/li/' has the value one if `li' +appears in the current input record. Thus, as a pattern, `/li/' matches +any record containing `li'. Boolean expressions are also commonly used as patterns. Whether the pattern matches an input record depends on whether its subexpressions @@ -9108,7 +9188,7 @@ inside Boolean patterns. Likewise, the special patterns `BEGIN', `END', `BEGINFILE', and `ENDFILE', which never match any input record, are not expressions and cannot appear inside Boolean patterns. - The precedence of the different operators which can appear in + The precedence of the different operators that can appear in patterns is described in *note Precedence::. @@ -9128,8 +9208,8 @@ following: prints every record in `myfile' between `on'/`off' pairs, inclusive. A range pattern starts out by matching BEGPAT against every input -record. When a record matches BEGPAT, the range pattern is "turned on" -and the range pattern matches this record as well. As long as the +record. When a record matches BEGPAT, the range pattern is "turned +on", and the range pattern matches this record as well. 
As long as the range pattern stays turned on, it automatically matches every input record read. The range pattern also matches ENDPAT against every input record; when this succeeds, the range pattern is "turned off" again for @@ -9247,7 +9327,7 @@ for more information on using library functions. *Note Library Functions::, for a number of useful library functions. If an `awk' program has only `BEGIN' rules and no other rules, then -the program exits after the `BEGIN' rule is run.(1) However, if an +the program exits after the `BEGIN' rules are run.(1) However, if an `END' rule exists, then the input is read, even if there are no other rules in the program. This is necessary in case the `END' rule checks the `FNR' and `NR' variables. @@ -9273,7 +9353,7 @@ give `$0' a real value is to execute a `getline' command without a variable (*note Getline::). Another way is simply to assign a value to `$0'. - The second point is similar to the first but from the other + The second point is similar to the first, but from the other direction. Traditionally, due largely to implementation issues, `$0' and `NF' were _undefined_ inside an `END' rule. The POSIX standard specifies that `NF' is available in an `END' rule. It contains the @@ -9334,7 +9414,7 @@ tasks that would otherwise be difficult or impossible to perform: entirely. Otherwise, `gawk' exits with the usual fatal error. * If you have written extensions that modify the record handling (by - inserting an "input parser," *note Input Parsers::), you can invoke + inserting an "input parser"; *note Input Parsers::), you can invoke them at this point, before `gawk' has started processing the file. (This is a _very_ advanced feature, currently used only by the `gawkextlib' project (http://gawkextlib.sourceforge.net).) @@ -9344,16 +9424,15 @@ last record in an input file. For the last input file, it will be called before any `END' rules. The `ENDFILE' rule is executed even for empty input files. 
- Normally, when an error occurs when reading input in the normal input -processing loop, the error is fatal. However, if an `ENDFILE' rule is -present, the error becomes non-fatal, and instead `ERRNO' is set. This -makes it possible to catch and process I/O errors at the level of the -`awk' program. + Normally, when an error occurs when reading input in the normal +input-processing loop, the error is fatal. However, if an `ENDFILE' +rule is present, the error becomes non-fatal, and instead `ERRNO' is +set. This makes it possible to catch and process I/O errors at the +level of the `awk' program. The `next' statement (*note Next Statement::) is not allowed inside either a `BEGINFILE' or an `ENDFILE' rule. The `nextfile' statement is -allowed only inside a `BEGINFILE' rule, but not inside an `ENDFILE' -rule. +allowed only inside a `BEGINFILE' rule, not inside an `ENDFILE' rule. The `getline' statement (*note Getline::) is restricted inside both `BEGINFILE' and `ENDFILE': only redirected forms of `getline' are @@ -9398,11 +9477,11 @@ following program: END { print nmatches, "found" }' /path/to/data The `awk' program consists of two pieces of quoted text that are -concatenated together to form the program. The first part is double -quoted, which allows substitution of the `pattern' shell variable -inside the quotes. The second part is single quoted. +concatenated together to form the program. The first part is +double-quoted, which allows substitution of the `pattern' shell +variable inside the quotes. The second part is single-quoted. - Variable substitution via quoting works, but can be potentially + Variable substitution via quoting works, but can potentially be messy. It requires a good understanding of the shell's quoting rules (*note Quoting::), and it's often difficult to correctly match up the quotes when reading the program. @@ -9599,15 +9678,15 @@ The body of this loop is a compound statement enclosed in braces, containing two statements. 
The loop works in the following manner: first, the value of `i' is set to one. Then, the `while' statement tests whether `i' is less than or equal to three. This is true when -`i' equals one, so the `i'-th field is printed. Then the `i++' +`i' equals one, so the `i'th field is printed. Then the `i++' increments the value of `i' and the loop repeats. The loop terminates when `i' reaches four. A newline is not required between the condition and the body; however, using one makes the program clearer unless the body is a -compound statement or else is very simple. The newline after the -open-brace that begins the compound statement is not required either, -but the program is harder to read without it. +compound statement or else is very simple. The newline after the open +brace that begins the compound statement is not required either, but the +program is harder to read without it. File: gawk.info, Node: Do Statement, Next: For Statement, Prev: While Statement, Up: Statements @@ -9630,7 +9709,7 @@ Contrast this with the corresponding `while' statement: while (CONDITION) BODY -This statement does not execute BODY even once if the CONDITION is +This statement does not execute the BODY even once if the CONDITION is false to begin with. The following is an example of a `do' statement: { @@ -9686,7 +9765,7 @@ loop.) The same is true of the INCREMENT part. Incrementing additional variables requires separate statements at the end of the loop. The C compound expression, using C's comma operator, is useful in this -context but it is not supported in `awk'. +context, but it is not supported in `awk'. Most often, INCREMENT is an increment expression, as in the previous example. But this is not required; it can be any expression @@ -9762,7 +9841,7 @@ statement looks like this: Control flow in the `switch' statement works as it does in C. 
Once a match to a given case is made, the case statement bodies execute until -a `break', `continue', `next', `nextfile' or `exit' is encountered, or +a `break', `continue', `next', `nextfile', or `exit' is encountered, or the end of the `switch' statement itself. For example: while ((c = getopt(ARGC, ARGV, "aksx")) != -1) { @@ -9809,12 +9888,12 @@ divisor of any integer, and also identifies prime numbers: # find smallest divisor of num { num = $1 - for (div = 2; div * div <= num; div++) { - if (num % div == 0) + for (divisor = 2; divisor * divisor <= num; divisor++) { + if (num % divisor == 0) break } - if (num % div == 0) - printf "Smallest divisor of %d is %d\n", num, div + if (num % divisor == 0) + printf "Smallest divisor of %d is %d\n", num, divisor else printf "%d is prime\n", num } @@ -9832,12 +9911,12 @@ Statement::.) # find smallest divisor of num { num = $1 - for (div = 2; ; div++) { - if (num % div == 0) { - printf "Smallest divisor of %d is %d\n", num, div + for (divisor = 2; ; divisor++) { + if (num % divisor == 0) { + printf "Smallest divisor of %d is %d\n", num, divisor break } - if (div * div > num) { + if (divisor * divisor > num) { printf "%d is prime\n", num break } @@ -10005,12 +10084,11 @@ listed in `ARGV'. standard. See the Austin Group website (http://austingroupbugs.net/view.php?id=607). - The current version of BWK `awk', and `mawk' also support -`nextfile'. However, they don't allow the `nextfile' statement inside -function bodies (*note User-defined::). `gawk' does; a `nextfile' -inside a function body reads the next record and starts processing it -with the first rule in the program, just as any other `nextfile' -statement. + The current version of BWK `awk' and `mawk' also support `nextfile'. +However, they don't allow the `nextfile' statement inside function +bodies (*note User-defined::). 
`gawk' does; a `nextfile' inside a +function body reads the next record and starts processing it with the +first rule in the program, just as any other `nextfile' statement. File: gawk.info, Node: Exit Statement, Prev: Nextfile Statement, Up: Statements @@ -10038,9 +10116,9 @@ record, skips reading any remaining input records, and executes the they do not execute. In such a case, if you don't want the `END' rule to do its job, set -a variable to nonzero before the `exit' statement and check that -variable in the `END' rule. *Note Assert Function::, for an example -that does this. +a variable to a nonzero value before the `exit' statement and check +that variable in the `END' rule. *Note Assert Function::, for an +example that does this. If an argument is supplied to `exit', its value is used as the exit status code for the `awk' process. If no argument is supplied, `exit' @@ -10098,7 +10176,7 @@ of activity. File: gawk.info, Node: User-modified, Next: Auto-set, Up: Built-in Variables -7.5.1 Built-In Variables That Control `awk' +7.5.1 Built-in Variables That Control `awk' ------------------------------------------- The following is an alphabetical list of variables that you can change @@ -10122,11 +10200,11 @@ description of each variable.) use binary I/O. Any other string value is treated the same as `"rw"', but causes `gawk' to generate a warning message. `BINMODE' is described in more detail in *note PC Using::. `mawk' - (*note Other Versions::), also supports this variable, but only + (*note Other Versions::) also supports this variable, but only using numeric values. ``CONVFMT'' - This string controls conversion of numbers to strings (*note + A string that controls the conversion of numbers to strings (*note Conversion::). It works by being passed, in effect, as the first argument to the `sprintf()' function (*note String Functions::). Its default value is `"%.6g"'. `CONVFMT' was introduced by the @@ -10173,15 +10251,14 @@ description of each variable.) 
`IGNORECASE #' If `IGNORECASE' is nonzero or non-null, then all string comparisons - and all regular expression matching are case independent. Thus, - regexp matching with `~' and `!~', as well as the `gensub()', + and all regular expression matching are case-independent. This + applies to regexp matching with `~' and `!~', the `gensub()', `gsub()', `index()', `match()', `patsplit()', `split()', and `sub()' functions, record termination with `RS', and field - splitting with `FS' and `FPAT', all ignore case when doing their - particular regexp operations. However, the value of `IGNORECASE' - does _not_ affect array subscripting and it does not affect field - splitting when using a single-character field separator. *Note - Case-sensitivity::. + splitting with `FS' and `FPAT'. However, the value of + `IGNORECASE' does _not_ affect array subscripting and it does not + affect field splitting when using a single-character field + separator. *Note Case-sensitivity::. `LINT #' When this variable is true (nonzero or non-null), `gawk' behaves @@ -10193,7 +10270,7 @@ description of each variable.) Assigning a false value to `LINT' turns off the lint warnings. This variable is a `gawk' extension. It is not special in other - `awk' implementations. Unlike the other special variables, + `awk' implementations. Unlike with the other special variables, changing `LINT' does affect the production of lint warnings, even if `gawk' is in compatibility mode. Much as the `--lint' and `--traditional' options independently control different aspects of @@ -10201,17 +10278,18 @@ description of each variable.) execution is independent of the flavor of `awk' being executed. `OFMT' - Controls conversion of numbers to strings (*note Conversion::) for - printing with the `print' statement. It works by being passed as - the first argument to the `sprintf()' function (*note String - Functions::). Its default value is `"%.6g"'. 
Earlier versions of - `awk' used `OFMT' to specify the format for converting numbers to - strings in general expressions; this is now done by `CONVFMT'. + A string that controls conversion of numbers to strings (*note + Conversion::) for printing with the `print' statement. It works + by being passed as the first argument to the `sprintf()' function + (*note String Functions::). Its default value is `"%.6g"'. + Earlier versions of `awk' used `OFMT' to specify the format for + converting numbers to strings in general expressions; this is now + done by `CONVFMT'. `OFS' - This is the output field separator (*note Output Separators::). - It is output between the fields printed by a `print' statement. - Its default value is `" "', a string consisting of a single space. + The output field separator (*note Output Separators::). It is + output between the fields printed by a `print' statement. Its + default value is `" "', a string consisting of a single space. `ORS' The output record separator. It is output at the end of every @@ -10261,7 +10339,7 @@ description of each variable.) File: gawk.info, Node: Auto-set, Next: ARGC and ARGV, Prev: User-modified, Up: Built-in Variables -7.5.2 Built-In Variables That Convey Information +7.5.2 Built-in Variables That Convey Information ------------------------------------------------ The following is an alphabetical list of variables that `awk' sets @@ -10379,14 +10457,14 @@ Options::), they are not special: `NF' The number of fields in the current input record. `NF' is set - each time a new record is read, when a new field is created or + each time a new record is read, when a new field is created, or when `$0' changes (*note Fields::). Unlike most of the variables described in this node, assigning a value to `NF' has the potential to affect `awk''s internal workings. In particular, assignments to `NF' can be used to - create or remove fields from the current record. *Note Changing - Fields::. 
+ create fields in or remove fields from the current record. *Note + Changing Fields::. `FUNCTAB #' An array whose indices and corresponding values are the names of @@ -10421,7 +10499,7 @@ Options::), they are not special: `PROCINFO["identifiers"]' A subarray, indexed by the names of all identifiers used in - the text of the AWK program. An "identifier" is simply the + the text of the `awk' program. An "identifier" is simply the name of a variable (be it scalar or array), built-in function, user-defined function, or extension function. For each identifier, the value of the element is one of the @@ -10442,7 +10520,7 @@ Options::), they are not special: `"untyped"' The identifier is untyped (could be used as a scalar or - array, `gawk' doesn't know yet). + an array; `gawk' doesn't know yet). `"user"' The identifier is a user-defined function. @@ -10531,7 +10609,7 @@ Options::), they are not special: string, or -1 if no match is found. `RSTART' - The start-index in characters of the substring that is matched by + The start index in characters of the substring that is matched by the `match()' function (*note String Functions::). `RSTART' is set by invoking the `match()' function. Its value is the position of the string where the matched substring starts, or zero if no @@ -10581,7 +10659,7 @@ Options::), they are not special: } NOTE: In order to avoid severe time-travel paradoxes,(2) - neither `FUNCTAB' nor `SYMTAB' are available as elements + neither `FUNCTAB' nor `SYMTAB' is available as an element within the `SYMTAB' array. Changing `NR' and `FNR' @@ -10720,7 +10798,7 @@ are passed on to the `awk' program. (*Note Getopt Function::, for an When designing your program, you should choose options that don't conflict with `gawk''s, because it will process any options that it accepts before passing the rest of the command line on to your program. -Using `#!' with the `-E' option may help (*Note Executable Scripts::, +Using `#!' 
with the `-E' option may help (*note Executable Scripts::, and *note Options::,). @@ -10731,14 +10809,14 @@ File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: * Pattern-action pairs make up the basic elements of an `awk' program. Patterns are either normal expressions, range - expressions, regexp constants, one of the special keywords - `BEGIN', `END', `BEGINFILE', `ENDFILE', or empty. The action + expressions, or regexp constants; one of the special keywords + `BEGIN', `END', `BEGINFILE', or `ENDFILE'; or empty. The action executes if the current record matches the pattern. Empty (missing) patterns match all records. - * I/O from `BEGIN' and `END' rules have certain constraints. This - is also true, only more so, for `BEGINFILE' and `ENDFILE' rules. - The latter two give you "hooks" into `gawk''s file processing, + * I/O from `BEGIN' and `END' rules has certain constraints. This is + also true, only more so, for `BEGINFILE' and `ENDFILE' rules. The + latter two give you "hooks" into `gawk''s file processing, allowing you to recover from a file that otherwise would cause a fatal error (such as a file that cannot be opened). @@ -10759,11 +10837,11 @@ File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: iteration of a loop (or get out of a `switch'). * `next' and `nextfile' let you read the next record and start over - at the top of your program, or skip to the next input file and + at the top of your program or skip to the next input file and start over, respectively. * The `exit' statement terminates your program. When executed from - an action (or function body) it transfers control to the `END' + an action (or function body), it transfers control to the `END' statements. From an `END' statement body, it exits immediately. You may pass an optional numeric value to be used as `awk''s exit status. @@ -10855,7 +10933,7 @@ be used as an array index. 
including a specification of how many elements or components they contain. In such languages, the declaration causes a contiguous block of memory to be allocated for that many elements. Usually, an index in -the array must be a positive integer. For example, the index zero +the array must be a nonnegative integer. For example, the index zero specifies the first element in the array, which is actually stored at the beginning of the block of memory. Index one specifies the second element, which is stored in memory right after the first element, and @@ -10865,9 +10943,9 @@ languages allow arbitrary starting and ending indices--e.g., `15 .. 27'--but the size of the array is still fixed when the array is declared.) - A contiguous array of four elements might look like the following -example, conceptually, if the element values are 8, `"foo"', `""', and -30 as shown in *note figure-array-elements::: + A contiguous array of four elements might look like *note +figure-array-elements::, conceptually, if the element values are eight, +`"foo"', `""', and 30. +---------+---------+--------+---------+ | 8 | "foo" | "" | 30 | @r{Value} @@ -10876,17 +10954,19 @@ example, conceptually, if the element values are 8, `"foo"', `""', and Figure 8.1: A contiguous array Only the values are stored; the indices are implicit from the order of -the values. Here, 8 is the value at index zero, because 8 appears in the -position with zero elements before it. +the values. Here, eight is the value at index zero, because eight +appears in the position with zero elements before it. Arrays in `awk' are different--they are "associative". 
This means that each array is a collection of pairs--an index and its corresponding array element value: - Index 3 Value 30 - Index 1 Value "foo" - Index 0 Value 8 - Index 2 Value "" + Index Value +------------------------ + `3' `30' + `1' `"foo"' + `0' `8' + `2' `""' The pairs are shown in jumbled order because their order is irrelevant.(1) @@ -10895,32 +10975,36 @@ irrelevant.(1) at any time. For example, suppose a tenth element is added to the array whose value is `"number ten"'. The result is: - Index 10 Value "number ten" - Index 3 Value 30 - Index 1 Value "foo" - Index 0 Value 8 - Index 2 Value "" + Index Value +------------------------------- + `10' `"number ten"' + `3' `30' + `1' `"foo"' + `0' `8' + `2' `""' Now the array is "sparse", which just means some indices are missing. It has elements 0-3 and 10, but doesn't have elements 4, 5, 6, 7, 8, or 9. Another consequence of associative arrays is that the indices don't -have to be positive integers. Any number, or even a string, can be an -index. For example, the following is an array that translates words +have to be nonnegative integers. Any number, or even a string, can be +an index. For example, the following is an array that translates words from English to French: - Index "dog" Value "chien" - Index "cat" Value "chat" - Index "one" Value "un" - Index 1 Value "un" + Index Value +------------------------ + `"dog"' `"chien"' + `"cat"' `"chat"' + `"one"' `"un"' + `1' `"un"' Here we decided to translate the number one in both spelled-out and numeric form--thus illustrating that a single array can have both numbers and strings as indices. (In fact, array subscripts are always strings. There are some subtleties to how numbers work when used as array subscripts; this is discussed in more detail in *note Numeric -Array Subscripts::.) Here, the number `1' isn't double quoted, because +Array Subscripts::.) Here, the number `1' isn't double-quoted, because `awk' automatically converts it to a string. 
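   As a brief illustration of the points above, the following sketch
builds the sparse array and the English-to-French array from the text
and checks that a numeric subscript and its string form name the same
element.  The shell function name `demo_assoc' is a hypothetical
wrapper chosen for this example, and the program assumes only a
POSIX-compatible `awk'.

```shell
# demo_assoc --- hypothetical wrapper around a small awk program
demo_assoc() {
    awk 'BEGIN {
        # Build the sparse array from the text: indices 0-3 and 10.
        a[0] = 8; a[1] = "foo"; a[2] = ""; a[3] = 30
        a[10] = "number ten"

        # Index 4 was never assigned, so it is not in the array.
        print ((4 in a) ? "4 present" : "4 missing")

        # Subscripts are always strings: 1 and "1" name the same element.
        print ((a[1] == a["1"]) ? "same element" : "different elements")

        # Strings and numbers can both serve as indices in one array.
        fr["dog"] = "chien"; fr["one"] = "un"; fr[1] = "un"
        print fr["dog"], fr["one"], fr[1]
    }'
}
demo_assoc
```

   The `in' operator is used for the membership test because merely
referencing `a[4]' would create that element as a side effect.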
The value of `IGNORECASE' has no effect upon array subscripting. @@ -10944,7 +11028,7 @@ File: gawk.info, Node: Reference to Elements, Next: Assigning Elements, Prev: ----------------------------------- The principal way to use an array is to refer to one of its elements. -An array reference is an expression as follows: +An "array reference" is an expression as follows: ARRAY[INDEX-EXPRESSION] @@ -10952,8 +11036,8 @@ Here, ARRAY is the name of an array. The expression INDEX-EXPRESSION is the index of the desired element of the array. The value of the array reference is the current value of that array -element. For example, `foo[4.3]' is an expression for the element of -array `foo' at index `4.3'. +element. For example, `foo[4.3]' is an expression referencing the +element of array `foo' at index `4.3'. A reference to an array element that has no recorded value yields a value of `""', the null string. This includes elements that have not @@ -11020,7 +11104,7 @@ File: gawk.info, Node: Array Example, Next: Scanning an Array, Prev: Assignin The following program takes a list of lines, each beginning with a line number, and prints them out in order of line number. The line numbers -are not in order when they are first read--instead they are scrambled. +are not in order when they are first read--instead, they are scrambled. This program sorts the lines by making an array using the line numbers as subscripts. The program then prints out the lines in sorted order of their numbers. It is a very simple program and gets confused upon @@ -11076,7 +11160,7 @@ File: gawk.info, Node: Scanning an Array, Next: Controlling Scanning, Prev: A In programs that use arrays, it is often necessary to use a loop that executes once for each element of an array. 
In other languages, where -arrays are contiguous and indices are limited to positive integers, +arrays are contiguous and indices are limited to nonnegative integers, this is easy: all the valid indices can be found by counting from the lowest index up to the highest. This technique won't do the job in `awk', because any number or string can be an array index. So `awk' @@ -11091,7 +11175,7 @@ has previously used, with the variable VAR set to that index. The following program uses this form of the `for' statement. The first rule scans the input records and notes which words appear (at least once) in the input, by storing a one into the array `used' with -the word as index. The second rule scans the elements of `used' to +the word as the index. The second rule scans the elements of `used' to find all the distinct words that appear in the input. It prints each word that is more than 10 characters long and also prints the number of such words. *Note String Functions::, for more information on the @@ -11174,7 +11258,7 @@ internal implementation of arrays and will vary from one version of Often, though, you may wish to do something simple, such as "traverse the array by comparing the indices in ascending order," or "traverse the array by comparing the values in descending order." -`gawk' provides two mechanisms which give you this control. +`gawk' provides two mechanisms that give you this control: * Set `PROCINFO["sorted_in"]' to one of a set of predefined values. We describe this now. @@ -11222,22 +11306,26 @@ available: which `gawk' uses internally to perform the sorting. `"@ind_str_desc"' - String indices ordered from high to low. + Like `"@ind_str_asc"', but the string indices are ordered from + high to low. `"@ind_num_desc"' - Numeric indices ordered from high to low. + Like `"@ind_num_asc"', but the numeric indices are ordered from + high to low. `"@val_type_desc"' - Element values, based on type, ordered from high to low. - Subarrays, if present, come out first. 
+ Like `"@val_type_asc"', but the element values, based on type, are + ordered from high to low. Subarrays, if present, come out first. `"@val_str_desc"' - Element values, treated as strings, ordered from high to low. - Subarrays, if present, come out first. + Like `"@val_str_asc"', but the element values, treated as strings, + are ordered from high to low. Subarrays, if present, come out + first. `"@val_num_desc"' - Element values, treated as numbers, ordered from high to low. - Subarrays, if present, come out first. + Like `"@val_num_asc"', but the element values, treated as numbers, + are ordered from high to low. Subarrays, if present, come out + first. The array traversal order is determined before the `for' loop starts to run. Changing `PROCINFO["sorted_in"]' in the loop body does not @@ -11423,8 +11511,8 @@ deleting elements in an array: This example removes all the elements from the array `frequencies'. Once an element is deleted, a subsequent `for' statement to scan the -array does not report that element and the `in' operator to check for -the presence of that element returns zero (i.e., false): +array does not report that element and using the `in' operator to check +for the presence of that element returns zero (i.e., false): delete foo[4] if (4 in foo) @@ -11627,7 +11715,7 @@ two-element subarray at index `1' of the main array `a': This simulates a true two-dimensional array. Each subarray element can contain another subarray as a value, which in turn can hold other arrays as well. In this way, you can create arrays of three or more -dimensions. The indices can be any `awk' expression, including scalars +dimensions. The indices can be any `awk' expressions, including scalars separated by commas (i.e., a regular `awk' simulated multidimensional subscript). So the following is valid in `gawk': @@ -11636,7 +11724,7 @@ subscript). So the following is valid in `gawk': Each subarray and the main array can be of different length. 
In fact, the elements of an array or its subarray do not all have to have the same type. This means that the main array and any of its subarrays -can be non-rectangular, or jagged in structure. You can assign a scalar +can be nonrectangular, or jagged in structure. You can assign a scalar value to the index `4' of the main array `a', even though `a[1]' is itself an array and not a scalar: @@ -11654,8 +11742,8 @@ the element at that index: a[4][5][6][7] = "An element in a four-dimensional array" This removes the scalar value from index `4' and then inserts a -subarray of subarray of subarray containing a scalar. You can also -delete an entire subarray or subarray of subarrays: +three-level nested subarray containing a scalar. You can also delete an +entire subarray or subarray of subarrays: delete a[4][5] a[4][5] = "An element in subarray a[4]" @@ -11663,7 +11751,7 @@ delete an entire subarray or subarray of subarrays: But recall that you can not delete the main array `a' and then use it as a scalar. - The built-in functions which take array arguments can also be used + The built-in functions that take array arguments can also be used with subarrays. For example, the following code fragment uses `length()' (*note String Functions::) to determine the number of elements in the main array `a' and its subarrays: @@ -11684,7 +11772,7 @@ be nested to scan all the elements of an array of arrays if it is rectangular in structure. In order to print the contents (scalar values) of a two-dimensional array of arrays (i.e., in which each first-level element is itself an array, not necessarily of the same -length) you could use the following code: +length), you could use the following code: for (i in array) for (j in array[i]) @@ -11766,9 +11854,9 @@ File: gawk.info, Node: Arrays Summary, Prev: Arrays of Arrays, Up: Arrays of `awk'. * Standard `awk' simulates multidimensional arrays by separating - subscript values with a comma. 
The values are concatenated into a + subscript values with commas. The values are concatenated into a single string, separated by the value of `SUBSEP'. The fact that - such a subscript was created in this way is not retained; thus + such a subscript was created in this way is not retained; thus, changing `SUBSEP' may have unexpected consequences. You can use `(SUB1, SUB2, ...) in ARRAY' to see if such a multidimensional subscript exists in ARRAY. @@ -11776,7 +11864,7 @@ File: gawk.info, Node: Arrays Summary, Prev: Arrays of Arrays, Up: Arrays * `gawk' provides true arrays of arrays. You use a separate set of square brackets for each dimension in such an array: `data[row][col]', for example. Array elements may thus be either - scalar values (number or string) or another array. + scalar values (number or string) or other arrays. * Use the `isarray()' built-in function to determine if an array element is itself a subarray. @@ -11796,7 +11884,9 @@ internationalize and localize programs. Besides the built-in functions, `awk' has provisions for writing new functions that the rest of a program can use. The second half of this -major node describes these "user-defined" functions. +major node describes these "user-defined" functions. Finally, we +explore indirect function calls, a `gawk'-specific extension that lets +you determine at runtime what function is to be called. * Menu: @@ -11808,7 +11898,7 @@ major node describes these "user-defined" functions. File: gawk.info, Node: Built-in, Next: User-defined, Up: Functions -9.1 Built-In Functions +9.1 Built-in Functions ====================== "Built-in" functions are always available for your `awk' program to @@ -11833,7 +11923,7 @@ for your convenience. 
File: gawk.info, Node: Calling Built-in, Next: Numeric Functions, Up: Built-in -9.1.1 Calling Built-In Functions +9.1.1 Calling Built-in Functions -------------------------------- To call one of `awk''s built-in functions, write the name of the @@ -11870,9 +11960,10 @@ are evaluated from left to right or from right to left. For example: j = atan2(++i, i *= 2) If the order of evaluation is left to right, then `i' first becomes -6, and then 12, and `atan2()' is called with the two arguments 6 and -12. But if the order of evaluation is right to left, `i' first becomes -10, then 11, and `atan2()' is called with the two arguments 11 and 10. +six, and then 12, and `atan2()' is called with the two arguments six +and 12. But if the order of evaluation is right to left, `i' first +becomes 10, then 11, and `atan2()' is called with the two arguments 11 +and 10. File: gawk.info, Node: Numeric Functions, Next: String Functions, Prev: Calling Built-in, Up: Built-in @@ -11928,7 +12019,7 @@ brackets ([ ]): Often random integers are needed instead. Following is a user-defined function that can be used to obtain a random - non-negative integer less than N: + nonnegative integer less than N: function randint(n) { @@ -12005,7 +12096,7 @@ numbers. (2) `mawk' uses a different seed each time. (3) Computer-generated random numbers really are not truly random. -They are technically known as "pseudorandom." This means that although +They are technically known as "pseudorandom". This means that although the numbers in a sequence appear to be random, you can in fact generate the same sequence of random numbers over and over again. @@ -12018,7 +12109,7 @@ File: gawk.info, Node: String Functions, Next: I/O Functions, Prev: Numeric F The functions in this minor node look at or change the text of one or more strings. 
- `gawk' understands locales (*note Locales::), and does all string + `gawk' understands locales (*note Locales::) and does all string processing in terms of _characters_, not _bytes_. This distinction is particularly important to understand for locales where one character may be represented by multiple bytes. Thus, for example, `length()' @@ -12089,7 +12180,7 @@ Options::): a[2] = "de" a[3] = "sac" - The `asorti()' function works similarly to `asort()', however, the + The `asorti()' function works similarly to `asort()'; however, the _indices_ are sorted, instead of the values. Thus, in the previous example, starting with the same initial set of indices and values in `a', calling `asorti(a)' would yield: @@ -12177,7 +12268,7 @@ Options::): With BWK `awk' and `gawk', it is a fatal error to use a regexp constant for FIND. Other implementations allow it, simply treating the regexp constant as an expression meaning `$0 ~ - /regexp/'. (d.c.). + /regexp/'. (d.c.) `length('[STRING]`)' Return the number of characters in STRING. If STRING is a number, @@ -12221,9 +12312,9 @@ Options::): `match(STRING, REGEXP' [`, ARRAY']`)' Search STRING for the longest, leftmost substring matched by the - regular expression, REGEXP and return the character position - (index) at which that substring begins (one, if it starts at the - beginning of STRING). If no match is found, return zero. + regular expression REGEXP and return the character position (index) + at which that substring begins (one, if it starts at the beginning + of STRING). If no match is found, return zero. The REGEXP argument may be either a regexp constant (`/'...`/') or a string constant (`"'...`"'). In the latter case, the string is @@ -12231,7 +12322,7 @@ Options::): discussion of the difference between the two forms, and the implications for writing your program correctly. 
- The order of the first two arguments is backwards from most other + The order of the first two arguments is the opposite of most other string functions that work with regular expressions, such as `sub()' and `gsub()'. It might help to remember that for `match()', the order is the same as for the `~' operator: `STRING @@ -12298,8 +12389,8 @@ Options::): There may not be subscripts for the start and index for every parenthesized subexpression, because they may not all have matched - text; thus they should be tested for with the `in' operator (*note - Reference to Elements::). + text; thus, they should be tested for with the `in' operator + (*note Reference to Elements::). The ARRAY argument to `match()' is a `gawk' extension. In compatibility mode (*note Options::), using a third argument is a @@ -12332,19 +12423,19 @@ Options::): FIELDSEP, is a regexp describing where to split STRING (much as `FS' can be a regexp describing where to split input records). If FIELDSEP is omitted, the value of `FS' is used. `split()' returns - the number of elements created. SEPS is a `gawk' extension with + the number of elements created. SEPS is a `gawk' extension, with `SEPS[I]' being the separator string between `ARRAY[I]' and - `ARRAY[I+1]'. If FIELDSEP is a single space then any leading + `ARRAY[I+1]'. If FIELDSEP is a single space, then any leading whitespace goes into `SEPS[0]' and any trailing whitespace goes - into `SEPS[N]' where N is the return value of `split()' (i.e., the - number of elements in ARRAY). + into `SEPS[N]', where N is the return value of `split()' (i.e., + the number of elements in ARRAY). The `split()' function splits strings into pieces in a manner similar to the way input lines are split into fields. For example: split("cul-de-sac", a, "-", seps) - splits the string `cul-de-sac' into three fields using `-' as the + splits the string `"cul-de-sac"' into three fields using `-' as the separator. 
It sets the contents of the array `a' as follows: a[1] = "cul" @@ -12361,17 +12452,18 @@ Options::): As with input field-splitting, when the value of FIELDSEP is `" "', leading and trailing whitespace is ignored in values assigned to the elements of ARRAY but not in SEPS, and the elements - are separated by runs of whitespace. Also, as with input - field-splitting, if FIELDSEP is the null string, each individual + are separated by runs of whitespace. Also, as with input field + splitting, if FIELDSEP is the null string, each individual character in the string is split into its own array element. (c.e.) Note, however, that `RS' has no effect on the way `split()' works. - Even though `RS = ""' causes newline to also be an input field - separator, this does not affect how `split()' splits strings. + Even though `RS = ""' causes the newline character to also be an + input field separator, this does not affect how `split()' splits + strings. Modern implementations of `awk', including `gawk', allow the third - argument to be a regexp constant (`/abc/') as well as a string. + argument to be a regexp constant (`/'...`/') as well as a string. (d.c.) The POSIX standard allows this as well. *Note Computed Regexps::, for a discussion of the difference between using a string constant or a regexp constant, and the implications for @@ -12472,7 +12564,7 @@ Options::): { sub(/\|/, "\\&"); print } As mentioned, the third argument to `sub()' must be a variable, - field or array element. Some versions of `awk' allow the third + field, or array element. Some versions of `awk' allow the third argument to be an expression that is not an lvalue. In such a case, `sub()' still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away @@ -12597,11 +12689,11 @@ example, `"a\qb"' is treated as `"aqb"'. At the runtime level, the various functions handle sequences of `\' and `&' differently. The situation is (sadly) somewhat complex. 
-Historically, the `sub()' and `gsub()' functions treated the two -character sequence `\&' specially; this sequence was replaced in the -generated text with a single `&'. Any other `\' within the REPLACEMENT -string that did not precede an `&' was passed through unchanged. This -is illustrated in *note table-sub-escapes::. +Historically, the `sub()' and `gsub()' functions treated the +two-character sequence `\&' specially; this sequence was replaced in +the generated text with a single `&'. Any other `\' within the +REPLACEMENT string that did not precede an `&' was passed through +unchanged. This is illustrated in *note table-sub-escapes::. You type `sub()' sees `sub()' generates ------- --------- -------------- @@ -12616,10 +12708,10 @@ is illustrated in *note table-sub-escapes::. Table 9.1: Historical escape sequence processing for `sub()' and `gsub()' -This table shows both the lexical-level processing, where an odd number -of backslashes becomes an even number at the runtime level, as well as -the runtime processing done by `sub()'. (For the sake of simplicity, -the rest of the following tables only show the case of even numbers of +This table shows the lexical-level processing, where an odd number of +backslashes becomes an even number at the runtime level, as well as the +runtime processing done by `sub()'. (For the sake of simplicity, the +rest of the following tables only show the case of even numbers of backslashes entered at the lexical level.) The problem with the historical approach is that there is no way to @@ -12643,10 +12735,10 @@ This is shown in *note table-sub-proposed::. `\\q' `\q' A literal `\q' `\\\\' `\\' `\\' -Table 9.2: GNU `awk' rules for `sub()' and backslash +Table 9.2: `gawk' rules for `sub()' and backslash In a nutshell, at the runtime level, there are now three special -sequences of characters (`\\\&', `\\&' and `\&') whereas historically +sequences of characters (`\\\&', `\\&', and `\&') whereas historically there was only one. 
However, as in the historical case, any `\' that is not part of one of these three sequences is not special and appears in the output literally. @@ -12676,7 +12768,7 @@ Table 9.3: POSIX rules for `sub()' and `gsub()' `\\\\' is seen as `\\' and produces `\' instead of `\\'. Starting with version 3.1.4, `gawk' followed the POSIX rules when -`--posix' is specified (*note Options::). Otherwise, it continued to +`--posix' was specified (*note Options::). Otherwise, it continued to follow the proposed rules, as that had been its behavior for many years. When version 4.0.0 was released, the `gawk' maintainer made the @@ -12703,9 +12795,9 @@ the `\' does not, as shown in *note table-gensub-escapes::. Table 9.4: Escape sequence processing for `gensub()' - Because of the complexity of the lexical and runtime level processing -and the special cases for `sub()' and `gsub()', we recommend the use of -`gawk' and `gensub()' when you have to do substitutions. + Because of the complexity of the lexical- and runtime-level +processing and the special cases for `sub()' and `gsub()', we recommend +the use of `gawk' and `gensub()' when you have to do substitutions. ---------- Footnotes ---------- @@ -12732,10 +12824,10 @@ parameters are enclosed in square brackets ([ ]): When closing a coprocess, it is occasionally useful to first close one end of the two-way pipe and then to close the other. This is done by providing a second argument to `close()'. This second - argument should be one of the two string values `"to"' or `"from"', - indicating which end of the pipe to close. Case in the string does - not matter. *Note Two-way I/O::, which discusses this feature in - more detail and gives an example. + argument (HOW) should be one of the two string values `"to"' or + `"from"', indicating which end of the pipe to close. Case in the + string does not matter. *Note Two-way I/O::, which discusses this + feature in more detail and gives an example. 
Note that the second argument to `close()' is a `gawk' extension; it is not available in compatibility mode (*note Options::). @@ -12753,7 +12845,7 @@ parameters are enclosed in square brackets ([ ]): sometimes it is necessary to force a program to "flush" its buffers (i.e., write the information to its destination, even if a buffer is not full). This is the purpose of the `fflush()' - function--`gawk' also buffers its output and the `fflush()' + function--`gawk' also buffers its output, and the `fflush()' function forces `gawk' to flush its buffers. Brian Kernighan added `fflush()' to his `awk' in April 1992. For @@ -12770,16 +12862,17 @@ parameters are enclosed in square brackets ([ ]): output files and pipes if the argument was the null string. This was changed in order to be compatible with Brian Kernighan's `awk', in the hope that standardizing this - feature in POSIX would then be easier (which indeed helped). + feature in POSIX would then be easier (which indeed proved to + be the case). With `gawk', you can use `fflush("/dev/stdout")' if you wish to flush only the standard output. `fflush()' returns zero if the buffer is successfully flushed; - otherwise, it returns non-zero. (`gawk' returns -1.) In the case - where all buffers are flushed, the return value is zero only if - all buffers were flushed successfully. Otherwise, it is -1, and - `gawk' warns about the problem FILENAME. + otherwise, it returns a nonzero value. (`gawk' returns -1.) In + the case where all buffers are flushed, the return value is zero + only if all buffers were flushed successfully. Otherwise, it is + -1, and `gawk' warns about the problem FILENAME. 
`gawk' also issues a warning message if you attempt to flush a file or pipe that was opened for reading (such as with `getline'), @@ -12788,9 +12881,9 @@ parameters are enclosed in square brackets ([ ]): Interactive Versus Noninteractive Buffering - As a side point, buffering issues can be even more confusing, - depending upon whether your program is "interactive" (i.e., - communicating with a user sitting at a keyboard).(1) + As a side point, buffering issues can be even more confusing if + your program is "interactive" (i.e., communicating with a user + sitting at a keyboard).(1) Interactive programs generally "line buffer" their output (i.e., they write out every line). Noninteractive programs wait until @@ -12819,7 +12912,7 @@ parameters are enclosed in square brackets ([ ]): shot. `system(COMMAND)' - Execute the operating-system command COMMAND and then return to + Execute the operating system command COMMAND and then return to the `awk' program. Return COMMAND's exit status. For example, if the following fragment of code is put in your `awk' @@ -12908,14 +13001,14 @@ File: gawk.info, Node: Time Functions, Next: Bitwise Functions, Prev: I/O Fun `awk' programs are commonly used to process log files containing timestamp information, indicating when a particular log record was -written. Many programs log their timestamp in the form returned by the -`time()' system call, which is the number of seconds since a particular -epoch. On POSIX-compliant systems, it is the number of seconds since -1970-01-01 00:00:00 UTC, not counting leap seconds.(1) All known -POSIX-compliant systems support timestamps from 0 through 2^31 - 1, -which is sufficient to represent times through 2038-01-19 03:14:07 UTC. -Many systems support a wider range of timestamps, including negative -timestamps that represent times before the epoch. +written. Many programs log their timestamps in the form returned by +the `time()' system call, which is the number of seconds since a +particular epoch. 
On POSIX-compliant systems, it is the number of +seconds since 1970-01-01 00:00:00 UTC, not counting leap seconds.(1) +All known POSIX-compliant systems support timestamps from 0 through +2^31 - 1, which is sufficient to represent times through 2038-01-19 +03:14:07 UTC. Many systems support a wider range of timestamps, +including negative timestamps that represent times before the epoch. In order to make it easier to process such log files and to produce useful reports, `gawk' provides the following functions for working @@ -12938,9 +13031,9 @@ enclosed in square brackets ([ ]): specified; for example, an hour of -1 means 1 hour before midnight. The origin-zero Gregorian calendar is assumed, with year 0 preceding year 1 and year -1 preceding year 0. The time is - assumed to be in the local timezone. If the daylight-savings flag - is positive, the time is assumed to be daylight savings time; if - zero, the time is assumed to be standard time; and if negative + assumed to be in the local time zone. If the daylight-savings + flag is positive, the time is assumed to be daylight savings time; + if zero, the time is assumed to be standard time; and if negative (the default), `mktime()' attempts to determine whether daylight savings time is in effect for the specified time. @@ -13081,23 +13174,23 @@ the following date format specifications: The weekday as a decimal number (1-7). Monday is day one. `%U' - The week number of the year (the first Sunday as the first day of - week one) as a decimal number (00-53). + The week number of the year (with the first Sunday as the first + day of week one) as a decimal number (00-53). `%V' - The week number of the year (the first Monday as the first day of - week one) as a decimal number (01-53). The method for determining - the week number is as specified by ISO 8601. 
(To wit: if the week - containing January 1 has four or more days in the new year, then - it is week one; otherwise it is week 53 of the previous year and - the next week is week one.) + The week number of the year (with the first Monday as the first + day of week one) as a decimal number (01-53). The method for + determining the week number is as specified by ISO 8601. (To wit: + if the week containing January 1 has four or more days in the new + year, then it is week one; otherwise it is week 53 of the previous + year and the next week is week one.) `%w' The weekday as a decimal number (0-6). Sunday is day zero. `%W' - The week number of the year (the first Monday as the first day of - week one) as a decimal number (00-53). + The week number of the year (with the first Monday as the first + day of week one) as a decimal number (00-53). `%x' The locale's "appropriate" date representation. (This is `%A %B @@ -13114,8 +13207,8 @@ the following date format specifications: The full year as a decimal number (e.g., 2015). `%z' - The timezone offset in a +HHMM format (e.g., the format necessary - to produce RFC 822/RFC 1036 date headers). + The time zone offset in a `+HHMM' format (e.g., the format + necessary to produce RFC 822/RFC 1036 date headers). `%Z' The time zone name or abbreviation; no characters if no time zone @@ -13232,7 +13325,7 @@ each successive pair of bits in the operands. Three common operations are bitwise AND, OR, and XOR. The operations are described in *note table-bitwise-ops::. - Bit Operator + Bit operator | AND | OR | XOR |--+--+--+--+--+-- Operands | 0 | 1 | 0 | 1 | 0 | 1 @@ -13288,7 +13381,7 @@ paragraph, don't worry about it.) 
Here is a user-defined function (*note User-defined::) that illustrates the use of these functions: - # bits2str --- turn a byte into readable 1's and 0's + # bits2str --- turn a byte into readable ones and zeros function bits2str(bits, data, mask) { @@ -13327,13 +13420,14 @@ This program produces the following output when run: -| lshift(0x99, 2) = 0x264 = 0000001001100100 -| rshift(0x99, 2) = 0x26 = 00100110 - The `bits2str()' function turns a binary number into a string. The -number `1' represents a binary value where the rightmost bit is set to -1. Using this mask, the function repeatedly checks the rightmost bit. -ANDing the mask with the value indicates whether the rightmost bit is 1 -or not. If so, a `"1"' is concatenated onto the front of the string. -Otherwise, a `"0"' is added. The value is then shifted right by one -bit and the loop continues until there are no more 1 bits. + The `bits2str()' function turns a binary number into a string. +Initializing `mask' to one creates a binary value where the rightmost +bit is set to one. Using this mask, the function repeatedly checks the +rightmost bit. ANDing the mask with the value indicates whether the +rightmost bit is one or not. If so, a `"1"' is concatenated onto the +front of the string. Otherwise, a `"0"' is added. The value is then +shifted right by one bit and the loop continues until there are no more +one bits. If the initial value is zero, it returns a simple `"0"'. Otherwise, at the end, it pads the value with zeros to represent multiples of @@ -13346,9 +13440,9 @@ Nondecimal-numbers::), and then demonstrates the results of the ---------- Footnotes ---------- - (1) This example shows that 0's come in on the left side. For + (1) This example shows that zeros come in on the left side. For `gawk', this is always true, but in some languages, it's possible to -have the left side fill with 1's. +have the left side fill with ones. 
File: gawk.info, Node: Type Functions, Next: I18N Functions, Prev: Bitwise Functions, Up: Built-in @@ -13362,7 +13456,7 @@ traverses every element of an array of arrays (*note Arrays of Arrays::). `isarray(X)' - Return a true value if X is an array. Otherwise return false. + Return a true value if X is an array. Otherwise, return false. `isarray()' is meant for use in two circumstances. The first is when traversing a multidimensional array: you can test if an element is @@ -13409,8 +13503,8 @@ brackets ([ ]): Return the plural form used for NUMBER of the translation of STRING1 and STRING2 in text domain DOMAIN for locale category CATEGORY. STRING1 is the English singular variant of a message, - and STRING2 the English plural variant of the same message. The - default value for DOMAIN is the current value of `TEXTDOMAIN'. + and STRING2 is the English plural variant of the same message. + The default value for DOMAIN is the current value of `TEXTDOMAIN'. The default value for CATEGORY is `"LC_MESSAGES"'. @@ -13439,7 +13533,7 @@ File: gawk.info, Node: Definition Syntax, Next: Function Example, Up: User-de 9.2.1 Function Definition Syntax -------------------------------- - It's entirely fair to say that the `awk' syntax for local variable + It's entirely fair to say that the awk syntax for local variable definitions is appallingly awful. -- Brian Kernighan Definitions of functions can appear anywhere between the rules of an @@ -13469,17 +13563,22 @@ the argument names are used to hold the argument values given in the call. A function cannot have two parameters with the same name, nor may it -have a parameter with the same name as the function itself. In -addition, according to the POSIX standard, function parameters cannot -have the same name as one of the special predefined variables (*note -Built-in Variables::). Not all versions of `awk' enforce this -restriction. +have a parameter with the same name as the function itself. 
+ + CAUTION: According to the POSIX standard, function parameters + cannot have the same name as one of the special predefined + variables (*note Built-in Variables::), nor may a function + parameter have the same name as another function. + + Not all versions of `awk' enforce these restrictions. `gawk' + always enforces the first restriction. With `--posix' (*note + Options::), it also enforces the second restriction. Local variables act like the empty string if referenced where a string value is required, and like zero if referenced where a numeric -value is required. This is the same as regular variables that have -never been assigned a value. (There is more to understand about local -variables; *note Dynamic Typing::.) +value is required. This is the same as the behavior of regular +variables that have never been assigned a value. (There is more to +understand about local variables; *note Dynamic Typing::.) The BODY-OF-FUNCTION consists of `awk' statements. It is the most important part of the definition, because it says what the function @@ -13508,9 +13607,9 @@ function is supposed to be used. variable values hide, or "shadow", any variables of the same names used in the rest of the program. The shadowed variables are not accessible in the function definition, because there is no way to name them while -their names have been taken away for the local variables. All other -variables used in the `awk' program can be referenced or set normally -in the function's body. +their names have been taken away for the arguments and local variables. +All other variables used in the `awk' program can be referenced or set +normally in the function's body. The arguments and local variables last only as long as the function body is executing. 
Once the body finishes, you can once again access @@ -13563,7 +13662,7 @@ takes a number and prints it in a specific format: printf "%6.3g\n", num } -To illustrate, here is an `awk' rule that uses our `myprint' function: +To illustrate, here is an `awk' rule that uses our `myprint()' function: $3 > 0 { myprint($3) } @@ -13592,13 +13691,13 @@ extra whitespace signifies the start of the local variable list): When working with arrays, it is often necessary to delete all the elements in an array and start over with a new list of elements (*note Delete::). Instead of having to repeat this loop everywhere that you -need to clear out an array, your program can just call `delarray'. +need to clear out an array, your program can just call `delarray()'. (This guarantees portability. The use of `delete ARRAY' to delete the contents of an entire array is a relatively recent(1) addition to the POSIX standard.) The following is an example of a recursive function. It takes a -string as an input parameter and returns the string in backwards order. +string as an input parameter and returns the string in reverse order. Recursive functions must always have a test that stops the recursion. In this case, the recursion terminates when the input string is already empty: @@ -13689,14 +13788,14 @@ File: gawk.info, Node: Variable Scope, Next: Pass By Value/Reference, Prev: C 9.2.3.2 Controlling Variable Scope .................................. -Unlike many languages, there is no way to make a variable local to a +Unlike in many languages, there is no way to make a variable local to a `{' ... `}' block in `awk', but you can make a variable local to a function. It is good practice to do so whenever a variable is needed only in that function. To make a variable local to a function, simply declare the variable as an argument after the actual function arguments (*note Definition -Syntax::). Look at the following example where variable `i' is a +Syntax::). 
Look at the following example, where variable `i' is a global variable used by both functions `foo()' and `bar()': function bar() @@ -13732,7 +13831,7 @@ variable instance: foo's i=3 top's i=3 - If you want `i' to be local to both `foo()' and `bar()' do as + If you want `i' to be local to both `foo()' and `bar()', do as follows (the extra space before `i' is a coding convention to indicate that `i' is a local variable, not an argument): @@ -13814,7 +13913,7 @@ explicitly whether the arguments are passed "by value" or "by reference". Instead, the passing convention is determined at runtime when the -function is called according to the following rule: if the argument is +function is called, according to the following rule: if the argument is an array variable, then it is passed by reference. Otherwise, the argument is passed by value. @@ -13872,7 +13971,7 @@ function _are_ visible outside that function. stores `"two"' in the second element of `a'. Some `awk' implementations allow you to call a function that has not -been defined. They only report a problem at runtime when the program +been defined. They only report a problem at runtime, when the program actually tries to call the function. For example: BEGIN { @@ -13917,15 +14016,15 @@ undefined, and therefore, unpredictable. In practice, though, all versions of `awk' simply return the null string, which acts like zero if used in a numeric context. - A `return' statement with no value expression is assumed at the end -of every function definition. So if control reaches the end of the -function body, then technically, the function returns an unpredictable + A `return' statement without an EXPRESSION is assumed at the end of +every function definition. So, if control reaches the end of the +function body, then technically the function returns an unpredictable value. In practice, it returns the empty string. `awk' does _not_ warn you if you use the return value of such a function. 
Sometimes, you want to write a function for what it does, not for what it returns. Such a function corresponds to a `void' function in -C, C++ or Java, or to a `procedure' in Ada. Thus, it may be +C, C++, or Java, or to a `procedure' in Ada. Thus, it may be appropriate to not return any value; simply bear in mind that you should not be using the return value of such a function. @@ -14031,13 +14130,13 @@ you can specify the name of the function to call as a string variable, and then call the function. Let's look at an example. Suppose you have a file with your test scores for the classes you -are taking. The first field is the class name. The following fields -are the functions to call to process the data, up to a "marker" field +are taking, and you wish to get the sum and the average of your test +scores. The first field is the class name. The following fields are +the functions to call to process the data, up to a "marker" field `data:'. Following the marker, to the end of the record, are the various numeric test scores. - Here is the initial file; you wish to get the sum and the average of -your test scores: + Here is the initial file: Biology_101 sum average data: 87.0 92.4 78.5 94.9 Chemistry_305 sum average data: 75.2 98.3 94.7 88.2 @@ -14095,9 +14194,9 @@ using indirect function calls: return ret } - These two functions expect to work on fields; thus the parameters + These two functions expect to work on fields; thus, the parameters `first' and `last' indicate where in the fields to start and end. -Otherwise they perform the expected computations and are not unusual: +Otherwise, they perform the expected computations and are not unusual: # For each record, print the class name and the requested statistics { @@ -14150,18 +14249,19 @@ to force it to be a string value.) may think at first. The C and C++ languages provide "function pointers," which are a mechanism for calling a function chosen at runtime. 
One of the most well-known uses of this ability is the C -`qsort()' function, which sorts an array using the famous "quick sort" +`qsort()' function, which sorts an array using the famous "quicksort" algorithm (see the Wikipedia article -(http://en.wikipedia.org/wiki/Quick_sort) for more information). To -use this function, you supply a pointer to a comparison function. This +(http://en.wikipedia.org/wiki/Quicksort) for more information). To use +this function, you supply a pointer to a comparison function. This mechanism allows you to sort arbitrary data in an arbitrary fashion. We can do something similar using `gawk', like this: # quicksort.awk --- Quicksort algorithm, with user-supplied # comparison function - # quicksort --- C.A.R. Hoare's quick sort algorithm. See Wikipedia - # or almost any algorithms or computer science text + + # quicksort --- C.A.R. Hoare's quicksort algorithm. See Wikipedia + # or almost any algorithms or computer science text. function quicksort(data, left, right, less_than, i, last) { @@ -14190,7 +14290,7 @@ mechanism allows you to sort arbitrary data in an arbitrary fashion. The `quicksort()' function receives the `data' array, the starting and ending indices to sort (`left' and `right'), and the name of a function that performs a "less than" comparison. It then implements -the quick sort algorithm. +the quicksort algorithm. To make use of the sorting function, we return to our previous example. The first thing to do is write some comparison functions: @@ -14284,61 +14384,7 @@ names of the two comparison functions: -| rsort: <100.0 95.6 93.4 87.1> Another example where indirect functions calls are useful can be -found in processing arrays. *note Walking Arrays::, presented a simple -function for "walking" an array of arrays. That function simply -printed the name and value of each scalar array element. However, it is -easy to generalize that function, by passing in the name of a function -to call when walking an array. 
The modified function looks like this: - - function process_array(arr, name, process, do_arrays, i, new_name) - { - for (i in arr) { - new_name = (name "[" i "]") - if (isarray(arr[i])) { - if (do_arrays) - @process(new_name, arr[i]) - process_array(arr[i], new_name, process, do_arrays) - } else - @process(new_name, arr[i]) - } - } - - The arguments are as follows: - -`arr' - The array. - -`name' - The name of the array (a string). - -`process' - The name of the function to call. - -`do_arrays' - If this is true, the function can handle elements that are - subarrays. - - If subarrays are to be processed, that is done before walking them -further. - - When run with the following scaffolding, the function produces the -same results as does the earlier `walk_array()' function: - - BEGIN { - a[1] = 1 - a[2][1] = 21 - a[2][2] = 22 - a[3] = 3 - a[4][1][1] = 411 - a[4][2] = 42 - - process_array(a, "a", "do_print", 0) - } - - function do_print(name, element) - { - printf "%s = %s\n", name, element - } +found in processing arrays. This is described in *note Walking Arrays::. Remember that you must supply a leading `@' in front of an indirect function call. @@ -14430,7 +14476,7 @@ File: gawk.info, Node: Library Functions, Next: Sample Programs, Prev: Functi *note User-defined::, describes how to write your own `awk' functions. Writing functions is important, because it allows you to encapsulate algorithms and program tasks in a single place. It simplifies -programming, making program development more manageable, and making +programming, making program development more manageable and making programs more readable. In their seminal 1976 book, `Software Tools',(1) Brian Kernighan and @@ -14535,7 +14581,7 @@ often use variable names like these for their own purposes. The example programs shown in this major node all start the names of their private variables with an underscore (`_'). 
Users generally don't use leading underscores in their variable names, so this -convention immediately decreases the chances that the variable name +convention immediately decreases the chances that the variable names will be accidentally shared with the user's program. In addition, several of the library functions use a prefix that helps @@ -14548,7 +14594,7 @@ for private function names.(1) As a final note on variable naming, if a function makes global variables available for use by a main program, it is a good convention -to start that variable's name with a capital letter--for example, +to start those variables' names with a capital letter--for example, `getopt()''s `Opterr' and `Optind' variables (*note Getopt Function::). The leading capital letter indicates that it is global, while the fact that the variable name is not all capital letters indicates that the @@ -14556,7 +14602,7 @@ variable is not one of `awk''s predefined variables, such as `FS'. It is also important that _all_ variables in library functions that do not need to save state are, in fact, declared local.(2) If this is -not done, the variable could accidentally be used in the user's +not done, the variables could accidentally be used in the user's program, leading to bugs that are very difficult to track down: function lib_func(x, y, l1, l2) @@ -14734,7 +14780,7 @@ for use in printing the diagnostic message. This is not possible in `awk', so this `assert()' function also requires a string version of the condition that is being tested. Following is the function: - # assert --- assert that a condition is true. Otherwise exit. + # assert --- assert that a condition is true. Otherwise, exit. function assert(condition, string) { @@ -14755,7 +14801,7 @@ the condition that is being tested. Following is the function: false, it prints a message to standard error, using the `string' parameter to describe the failed condition. 
It then sets the variable `_assert_exit' to one and executes the `exit' statement. The `exit' -statement jumps to the `END' rule. If the `END' rules finds +statement jumps to the `END' rule. If the `END' rule finds `_assert_exit' to be true, it exits immediately. The purpose of the test in the `END' rule is to keep any other `END' @@ -14970,9 +15016,9 @@ the strings in an array into one long string. The following function, `join()', accomplishes this task. It is used later in several of the application programs (*note Sample Programs::). - Good function design is important; this function needs to be general -but it should also have a reasonable default behavior. It is called -with an array as well as the beginning and ending indices of the + Good function design is important; this function needs to be +general, but it should also have a reasonable default behavior. It is +called with an array as well as the beginning and ending indices of the elements in the array to be merged. This assumes that the array indices are numeric--a reasonable assumption, as the array was likely created with `split()' (*note String Functions::): @@ -15091,7 +15137,7 @@ optional timestamp value to use instead of the current time. File: gawk.info, Node: Readfile Function, Next: Shell Quoting, Prev: Getlocaltime Function, Up: General Functions -10.2.8 Reading a Whole File At Once +10.2.8 Reading a Whole File at Once ----------------------------------- Often, it is convenient to have the entire contents of a file available @@ -15133,13 +15179,13 @@ reads the entire contents of the named file in one shot: It works by setting `RS' to `^$', a regular expression that will never match if the file has contents. `gawk' reads data from the file -into `tmp' attempting to match `RS'. The match fails after each read, +into `tmp', attempting to match `RS'. The match fails after each read, but fails quickly, such that `gawk' fills `tmp' with the entire contents of the file. 
(*Note Records::, for information on `RT' and `RS'.) In the case that `file' is empty, the return value is the null -string. Thus calling code may use something like: +string. Thus, calling code may use something like: contents = readfile("/some/path") if (length(contents) == 0) @@ -15206,7 +15252,7 @@ three-character string `"\"'\""': File: gawk.info, Node: Data File Management, Next: Getopt Function, Prev: General Functions, Up: Library Functions -10.3 Data File Management +10.3 Data file Management ========================= This minor node presents functions that are useful for managing @@ -15223,14 +15269,15 @@ command-line data files. File: gawk.info, Node: Filetrans Function, Next: Rewind Function, Up: Data File Management -10.3.1 Noting Data File Boundaries +10.3.1 Noting Data file Boundaries ---------------------------------- The `BEGIN' and `END' rules are each executed exactly once, at the beginning and end of your `awk' program, respectively (*note BEGIN/END::). We (the `gawk' authors) once had a user who mistakenly -thought that the `BEGIN' rule is executed at the beginning of each data -file and the `END' rule is executed at the end of each data file. +thought that the `BEGIN' rules were executed at the beginning of each +data file and the `END' rules were executed at the end of each data +file. When informed that this was not the case, the user requested that we add new special patterns to `gawk', named `BEGIN_FILE' and `END_FILE', @@ -15264,7 +15311,7 @@ does so _portably_; this works with any implementation of `awk': This file must be loaded before the user's "main" program, so that the rule it supplies is executed first. - This rule relies on `awk''s `FILENAME' variable that automatically + This rule relies on `awk''s `FILENAME' variable, which automatically changes for each new data file. The current file name is saved in a private variable, `_oldfilename'. 
If `FILENAME' does not equal `_oldfilename', then a new data file is being processed and it is @@ -15279,7 +15326,7 @@ correctly even for the first data file. The program also supplies an `END' rule to do the final processing for the last file. Because this `END' rule comes before any `END' rules supplied in the "main" program, `endfile()' is called first. Once -again the value of multiple `BEGIN' and `END' rules should be clear. +again, the value of multiple `BEGIN' and `END' rules should be clear. If the same data file occurs twice in a row on the command line, then `endfile()' and `beginfile()' are not executed at the end of the first @@ -15306,7 +15353,7 @@ how it simplifies writing the main program. You are probably wondering, if `beginfile()' and `endfile()' functions can do the job, why does `gawk' have `BEGINFILE' and -`ENDFILE' patterns (*note BEGINFILE/ENDFILE::)? +`ENDFILE' patterns? Good question. Normally, if `awk' cannot open a file, this causes an immediate fatal error. In this case, there is no way for a @@ -15314,7 +15361,8 @@ user-defined function to deal with the problem, as the mechanism for calling it relies on the file being open and at the first record. Thus, the main reason for `BEGINFILE' is to give you a "hook" to catch files that cannot be processed. `ENDFILE' exists for symmetry, and because -it provides an easy way to do per-file cleanup processing. +it provides an easy way to do per-file cleanup processing. For more +information, refer to *note BEGINFILE/ENDFILE::. File: gawk.info, Node: Rewind Function, Next: File Checking, Prev: Filetrans Function, Up: Data File Management @@ -15322,15 +15370,14 @@ File: gawk.info, Node: Rewind Function, Next: File Checking, Prev: Filetrans 10.3.2 Rereading the Current File --------------------------------- -Another request for a new built-in function was for a `rewind()' -function that would make it possible to reread the current file. 
The -requesting user didn't want to have to use `getline' (*note Getline::) -inside a loop. +Another request for a new built-in function was for a function that +would make it possible to reread the current file. The requesting user +didn't want to have to use `getline' (*note Getline::) inside a loop. However, as long as you are not in the `END' rule, it is quite easy to arrange to immediately close the current input file and then start -over with it from the top. For lack of a better name, we'll call it -`rewind()': +over with it from the top. For lack of a better name, we'll call the +function `rewind()': # rewind.awk --- rewind the current file and start over @@ -15360,7 +15407,7 @@ rule finishes!) File: gawk.info, Node: File Checking, Next: Empty Files, Prev: Rewind Function, Up: Data File Management -10.3.3 Checking for Readable Data Files +10.3.3 Checking for Readable Data files --------------------------------------- Normally, if you give `awk' a data file that isn't readable, it stops @@ -15388,7 +15435,7 @@ longer in the list). See also *note ARGC and ARGV::. Because `awk' variable names only allow the English letters, the regular expression check purposely does not use character classes such -as `[:alpha:]' and `[:alnum:]' (*note Bracket Expressions::) +as `[:alpha:]' and `[:alnum:]' (*note Bracket Expressions::). ---------- Footnotes ---------- @@ -15399,14 +15446,14 @@ opened. However, the code here provides a portable solution. File: gawk.info, Node: Empty Files, Next: Ignoring Assigns, Prev: File Checking, Up: Data File Management -10.3.4 Checking for Zero-length Files +10.3.4 Checking for Zero-Length Files ------------------------------------- All known `awk' implementations silently skip over zero-length files. 
This is a by-product of `awk''s implicit read-a-record-and-match-against-the-rules loop: when `awk' tries to -read a record from an empty file, it immediately receives an end of -file indication, closes the file, and proceeds on to the next +read a record from an empty file, it immediately receives an +end-of-file indication, closes the file, and proceeds on to the next command-line data file, _without_ executing any user-level `awk' program code. @@ -15450,13 +15497,13 @@ of the `for' loop uses the `<=' operator, not `<'. File: gawk.info, Node: Ignoring Assigns, Prev: Empty Files, Up: Data File Management -10.3.5 Treating Assignments as File Names +10.3.5 Treating Assignments as File names ----------------------------------------- Occasionally, you might not want `awk' to process command-line variable assignments (*note Assignment Options::). In particular, if you have a file name that contains an `=' character, `awk' treats the file name as -an assignment, and does not process it. +an assignment and does not process it. Some users have suggested an additional command-line option for `gawk' to disable command-line assignments. However, some simple @@ -15746,8 +15793,8 @@ which is in `ARGV[0]': } } - The rest of the `BEGIN' rule is a simple test program. Here is the -result of two sample runs of the test program: + The rest of the `BEGIN' rule is a simple test program. Here are the +results of two sample runs of the test program: $ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x -| c = <a>, Optarg = <> @@ -15793,10 +15840,10 @@ File: gawk.info, Node: Passwd Functions, Next: Group Functions, Prev: Getopt ============================== The `PROCINFO' array (*note Built-in Variables::) provides access to -the current user's real and effective user and group ID numbers, and if -available, the user's supplementary group set. However, because these -are numbers, they do not provide very useful information to the average -user. 
There needs to be some way to find the user information +the current user's real and effective user and group ID numbers, and, +if available, the user's supplementary group set. However, because +these are numbers, they do not provide very useful information to the +average user. There needs to be some way to find the user information associated with the user and group ID numbers. This minor node presents a suite of functions for retrieving information from the user database. *Note Group Functions::, for a similar suite that retrieves @@ -15807,7 +15854,7 @@ kept. Instead, it provides the `<pwd.h>' header file and several C language subroutines for obtaining user information. The primary function is `getpwent()', for "get password entry." The "password" comes from the original user database file, `/etc/passwd', which stores -user information, along with the encrypted passwords (hence the name). +user information along with the encrypted passwords (hence the name). Although an `awk' program could simply read `/etc/passwd' directly, this file may not contain complete information about the system's set @@ -15855,7 +15902,7 @@ Encrypted password User-ID The user's numeric user ID number. (On some systems, it's a C - `long', and not an `int'. Thus we cast it to `long' for all + `long', and not an `int'. Thus, we cast it to `long' for all cases.) Group-ID @@ -15954,8 +16001,8 @@ or on some other `awk' implementation. `PROCINFO["FS"]', is similar. The main part of the function uses a loop to read database lines, -split the line into fields, and then store the line into each array as -necessary. When the loop is done, `_pw_init()' cleans up by closing +split the lines into fields, and then store the lines into each array +as necessary. When the loop is done, `_pw_init()' cleans up by closing the pipeline, setting `_pw_inited' to one, and restoring `FS' (and `FIELDWIDTHS' or `FPAT' if necessary), `RS', and `$0'. The use of `_pw_count' is explained shortly. 
@@ -16083,7 +16130,7 @@ Group Password Group ID Number The group's numeric group ID number; the association of name to number must be unique within the file. (On some systems it's a C - `long', and not an `int'. Thus we cast it to `long' for all + `long', and not an `int'. Thus, we cast it to `long' for all cases.) Group Member List @@ -16173,29 +16220,30 @@ to ensure that the database is scanned no more than once. The `_gr_init()' function first saves `FS', `RS', and `$0', and then sets `FS' and `RS' to the correct values for scanning the group information. It also takes care to note whether `FIELDWIDTHS' or `FPAT' is being -used, and to restore the appropriate field splitting mechanism. +used, and to restore the appropriate field-splitting mechanism. - The group information is stored is several associative arrays. The + The group information is stored in several associative arrays. The arrays are indexed by group name (`_gr_byname'), by group ID number (`_gr_bygid'), and by position in the database (`_gr_bycount'). There is an additional array indexed by username (`_gr_groupsbyuser'), which is a space-separated list of groups to which each user belongs. - Unlike the user database, it is possible to have multiple records in -the database for the same group. This is common when a group has a + Unlike in the user database, it is possible to have multiple records +in the database for the same group. This is common when a group has a large number of members. A pair of such entries might look like the following: - tvpeople:*:101:johny,jay,arsenio + tvpeople:*:101:johnny,jay,arsenio tvpeople:*:101:david,conan,tom,joan For this reason, `_gr_init()' looks to see if a group name or group -ID number is already seen. If it is, the usernames are simply +ID number is already seen. 
If so, the usernames are simply concatenated onto the previous list of users.(1) Finally, `_gr_init()' closes the pipeline to `grcat', restores `FS' -(and `FIELDWIDTHS' or `FPAT' if necessary), `RS', and `$0', initializes -`_gr_count' to zero (it is used later), and makes `_gr_inited' nonzero. +(and `FIELDWIDTHS' or `FPAT', if necessary), `RS', and `$0', +initializes `_gr_count' to zero (it is used later), and makes +`_gr_inited' nonzero. The `getgrnam()' function takes a group name as its argument, and if that group exists, it is returned. Otherwise, it relies on the array @@ -16258,9 +16306,9 @@ very simple, relying on `awk''s associative arrays to do work. ---------- Footnotes ---------- - (1) There is actually a subtle problem with the code just presented. -Suppose that the first time there were no names. This code adds the -names with a leading comma. It also doesn't check that there is a `$4'. + (1) There is a subtle problem with the code just presented. Suppose +that the first time there were no names. This code adds the names with +a leading comma. It also doesn't check that there is a `$4'. File: gawk.info, Node: Walking Arrays, Next: Library Functions Summary, Prev: Group Functions, Up: Library Functions @@ -16269,11 +16317,11 @@ File: gawk.info, Node: Walking Arrays, Next: Library Functions Summary, Prev: ================================ *note Arrays of Arrays::, described how `gawk' provides arrays of -arrays. In particular, any element of an array may be either a scalar, +arrays. In particular, any element of an array may be either a scalar or another array. The `isarray()' function (*note Type Functions::) lets you distinguish an array from a scalar. The following function, -`walk_array()', recursively traverses an array, printing each element's -indices and value. You call it with the array and a string +`walk_array()', recursively traverses an array, printing the element +indices and values. 
You call it with the array and a string representing the name of the array: function walk_array(arr, name, i) @@ -16313,6 +16361,61 @@ value. Here is a main program to demonstrate: -| a[4][1][1] = 411 -| a[4][2] = 42 + The function just presented simply prints the name and value of each +scalar array element. However, it is easy to generalize it, by passing +in the name of a function to call when walking an array. The modified +function looks like this: + + function process_array(arr, name, process, do_arrays, i, new_name) + { + for (i in arr) { + new_name = (name "[" i "]") + if (isarray(arr[i])) { + if (do_arrays) + @process(new_name, arr[i]) + process_array(arr[i], new_name, process, do_arrays) + } else + @process(new_name, arr[i]) + } + } + + The arguments are as follows: + +`arr' + The array. + +`name' + The name of the array (a string). + +`process' + The name of the function to call. + +`do_arrays' + If this is true, the function can handle elements that are + subarrays. + + If subarrays are to be processed, that is done before walking them +further. + + When run with the following scaffolding, the function produces the +same results as does the earlier version of `walk_array()': + + BEGIN { + a[1] = 1 + a[2][1] = 21 + a[2][2] = 22 + a[3] = 3 + a[4][1][1] = 411 + a[4][2] = 42 + + process_array(a, "a", "do_print", 0) + } + + function do_print(name, element) + { + printf "%s = %s\n", name, element + } + File: gawk.info, Node: Library Functions Summary, Next: Library Exercises, Prev: Walking Arrays, Up: Library Functions @@ -16330,24 +16433,24 @@ File: gawk.info, Node: Library Functions Summary, Next: Library Exercises, Pr * The functions presented here fit into the following categories: General problems - Number-to-string conversion, assertions, rounding, random - number generation, converting characters to numbers, joining - strings, getting easily usable time-of-day information, and - reading a whole file in one shot. 
+ Number-to-string conversion, testing assertions, rounding, + random number generation, converting characters to numbers, + joining strings, getting easily usable time-of-day + information, and reading a whole file in one shot Managing data files Noting data file boundaries, rereading the current file, checking for readable files, checking for zero-length files, - and treating assignments as file names. + and treating assignments as file names Processing command-line options - An `awk' version of the standard C `getopt()' function. + An `awk' version of the standard C `getopt()' function Reading the user and group databases - Two sets of routines that parallel the C library versions. + Two sets of routines that parallel the C library versions Traversing arrays of arrays - A simple function to traverse an array of arrays to any depth. + Two functions that traverse an array of arrays to any depth @@ -16442,7 +16545,7 @@ you. to replace the installed versions on your system. Nor may all of these programs be fully compliant with the most recent POSIX standard. This is not a problem; their purpose is to illustrate `awk' language -programming for "real world" tasks. +programming for "real-world" tasks. The programs are presented in alphabetical order. @@ -16468,7 +16571,7 @@ separated by TABs by default, but you may supply a command-line option to change the field "delimiter" (i.e., the field-separator character). `cut''s definition of fields is less general than `awk''s. - A common use of `cut' might be to pull out just the login name of + A common use of `cut' might be to pull out just the login names of logged-on users from the output of `who'. For example, the following pipeline generates a sorted, unique list of the logged-on users: @@ -16877,7 +16980,7 @@ unsuccessful match. If the line does not match, the `next' statement just moves on to the next record. A number of additional tests are made, but they are only done if we -are not counting lines. 
First, if the user only wants exit status +are not counting lines. First, if the user only wants the exit status (`no_print' is true), then it is enough to know that _one_ line in this file matched, and we can skip on to the next file with `nextfile'. Similarly, if we are only printing file names, we can print the file @@ -16911,7 +17014,7 @@ line is printed, with a leading file name and colon if necessary: } The `END' rule takes care of producing the correct exit status. If -there are no matches, the exit status is one; otherwise it is zero: +there are no matches, the exit status is one; otherwise, it is zero: END { exit (total == 0) @@ -16953,7 +17056,8 @@ a more palatable output than just individual numbers. Here is a simple version of `id' written in `awk'. It uses the user database library functions (*note Passwd Functions::) and the group -database library functions (*note Group Functions::): +database library functions (*note Group Functions::) from *note Library +Functions::. The program is fairly straightforward. All the work is done in the `BEGIN' rule. The user and group ID numbers are obtained from @@ -17050,8 +17154,8 @@ is as follows:(1) By default, the output files are named `xaa', `xab', and so on. Each file has 1,000 lines in it, with the likely exception of the last file. To change the number of lines in each file, supply a number on the -command line preceded with a minus (e.g., `-500' for files with 500 -lines in them instead of 1,000). To change the name of the output +command line preceded with a minus sign (e.g., `-500' for files with +500 lines in them instead of 1,000). To change the names of the output files to something like `myfileaa', `myfileab', and so on, supply an additional argument that specifies the file name prefix. @@ -17688,7 +17792,7 @@ checking and setting of defaults: the delay, the count, and the message to print. 
If the user supplied a message without the ASCII BEL character (known as the "alert" character, `"\a"'), then it is added to the message. (On many systems, printing the ASCII BEL generates an -audible alert. Thus when the alarm goes off, the system calls attention +audible alert. Thus, when the alarm goes off, the system calls attention to itself in case the user is not looking at the computer.) Just for a change, this program uses a `switch' statement (*note Switch Statement::), but the processing could be done with a series of @@ -17820,7 +17924,7 @@ the "from" list. Once upon a time, a user proposed adding a transliteration function to `gawk'. The following program was written to prove that character transliteration could be done with a user-level function. This program -is not as complete as the system `tr' utility but it does most of the +is not as complete as the system `tr' utility, but it does most of the job. The `translate' program was written long before `gawk' acquired the @@ -17830,13 +17934,13 @@ and `gsub()' built-in functions (*note String Functions::). There are two functions. The first, `stranslate()', takes three arguments: `from' - A list of characters from which to translate. + A list of characters from which to translate `to' - A list of characters to which to translate. + A list of characters to which to translate `target' - The string on which to do the translation. + The string on which to do the translation Associative arrays make the translation part fairly easy. `t_ar' holds the "to" characters, indexed by the "from" characters. Then a @@ -17844,7 +17948,7 @@ simple loop goes through `from', one character at a time. For each character in `from', if the character appears in `target', it is replaced with the corresponding `to' character. - The `translate()' function calls `stranslate()' using `$0' as the + The `translate()' function calls `stranslate()', using `$0' as the target. 
The main program sets two global variables, `FROM' and `TO', from the command line, and then changes `ARGV' so that `awk' reads from the standard input. @@ -17853,7 +17957,7 @@ the standard input. record: # translate.awk --- do tr-like stuff - # Bugs: does not handle things like: tr A-Z a-z, it has + # Bugs: does not handle things like tr A-Z a-z; it has # to be spelled out. However, if `to' is shorter than `from', # the last character in `to' is used for the rest of `from'. @@ -17931,13 +18035,13 @@ File: gawk.info, Node: Labels Program, Next: Word Sorting, Prev: Translate Pr 11.3.4 Printing Mailing Labels ------------------------------ -Here is a "real world"(1) program. This script reads lists of names and +Here is a "real-world"(1) program. This script reads lists of names and addresses and generates mailing labels. Each page of labels has 20 labels on it, two across and 10 down. The addresses are guaranteed to be no more than five lines of data. Each address is separated from the next by a blank line. - The basic idea is to read 20 labels worth of data. Each line of + The basic idea is to read 20 labels' worth of data. Each line of each label is stored in the `line' array. The single rule takes care of filling the `line' array and printing the page when 20 labels have been read. @@ -17949,13 +18053,13 @@ splits records at blank lines (*note Records::). It sets `MAXLINES' to Most of the work is done in the `printpage()' function. The label lines are stored sequentially in the `line' array. But they have to -print horizontally; `line[1]' next to `line[6]', `line[2]' next to +print horizontally: `line[1]' next to `line[6]', `line[2]' next to `line[7]', and so on. Two loops accomplish this. The outer loop, controlled by `i', steps through every 10 lines of data; this is each row of labels. The inner loop, controlled by `j', goes through the -lines within the row. As `j' goes from 0 to 4, `i+j' is the `j'-th -line in the row, and `i+j+5' is the entry next to it. 
The output ends -up looking something like this: +lines within the row. As `j' goes from 0 to 4, `i+j' is the `j'th line +in the row, and `i+j+5' is the entry next to it. The output ends up +looking something like this: line 1 line 6 line 2 line 7 @@ -18058,8 +18162,8 @@ a useful format. printf "%s\t%d\n", word, freq[word] } - The program relies on `awk''s default field splitting mechanism to -break each line up into "words," and uses an associative array named + The program relies on `awk''s default field-splitting mechanism to +break each line up into "words" and uses an associative array named `freq', indexed by each word, to count the number of times the word occurs. In the `END' rule, it prints the counts. @@ -18145,7 +18249,7 @@ File: gawk.info, Node: History Sorting, Next: Extract Program, Prev: Word Sor 11.3.6 Removing Duplicates from Unsorted Text --------------------------------------------- -The `uniq' program (*note Uniq Program::), removes duplicate lines from +The `uniq' program (*note Uniq Program::) removes duplicate lines from _sorted_ data. Suppose, however, you need to remove duplicate lines from a data @@ -18198,7 +18302,7 @@ hand. Here we present a program that can extract parts of a Texinfo input file into separate files. This Info file is written in Texinfo -(http://www.gnu.org/software/texinfo/), the GNU project's document +(http://www.gnu.org/software/texinfo/), the GNU Project's document formatting language. A single Texinfo source file can be used to produce both printed documentation, with TeX, and online documentation. (The Texinfo language is described fully, starting with *note @@ -18239,7 +18343,7 @@ them in a standard directory where `gawk' can find them. The Texinfo file looks something like this: ... - This program has a @code{BEGIN} rule, + This program has a @code{BEGIN} rule that prints a nice message: @example @@ -18264,7 +18368,7 @@ upper- and lowercase letters in the directives won't matter. 
given (`NF' is at least three) and also checking that the command exits with a zero exit status, signifying OK: - # extract.awk --- extract files and run programs from texinfo files + # extract.awk --- extract files and run programs from Texinfo files BEGIN { IGNORECASE = 1 } @@ -18291,11 +18395,11 @@ The variable `e' is used so that the rule fits nicely on the screen. file name is given in the directive. If the file named is not the current file, then the current file is closed. Keeping the current file open until a new file is encountered allows the use of the `>' -redirection for printing the contents, keeping open file management +redirection for printing the contents, keeping open-file management simple. The `for' loop does the work. It reads lines using `getline' (*note -Getline::). For an unexpected end of file, it calls the +Getline::). For an unexpected end-of-file, it calls the `unexpected_eof()' function. If the line is an "endfile" line, then it breaks out of the loop. If the line is an `@group' or `@end group' line, then it ignores it and goes on to the next line. Similarly, @@ -18385,10 +18489,10 @@ File: gawk.info, Node: Simple Sed, Next: Igawk Program, Prev: Extract Program 11.3.8 A Simple Stream Editor ----------------------------- -The `sed' utility is a stream editor, a program that reads a stream of -data, makes changes to it, and passes it on. It is often used to make -global changes to a large file or to a stream of data generated by a -pipeline of commands. Although `sed' is a complicated program in its +The `sed' utility is a "stream editor", a program that reads a stream +of data, makes changes to it, and passes it on. It is often used to +make global changes to a large file or to a stream of data generated by +a pipeline of commands. Although `sed' is a complicated program in its own right, its most common use is to perform global substitutions in the middle of a pipeline: @@ -18502,7 +18606,7 @@ include a library function twice. 
`igawk' should behave just like `gawk' externally. This means it should accept all of `gawk''s command-line arguments, including the -ability to have multiple source files specified via `-f', and the +ability to have multiple source files specified via `-f' and the ability to mix command-line and library source files. The program is written using the POSIX Shell (`sh') command @@ -18532,8 +18636,8 @@ language.(1) It works as follows: file names). This program uses shell variables extensively: for storing -command-line arguments, the text of the `awk' program that will expand -the user's program, for the user's original program, and for the +command-line arguments and the text of the `awk' program that will +expand the user's program, for the user's original program, and for the expanded program. Doing so removes some potential problems that might arise were we to use temporary files instead, at the cost of making the script somewhat more complicated. @@ -18791,7 +18895,7 @@ It's done in these steps: The last step is to call `gawk' with the expanded program, along with the original options and command-line arguments that the user -supplied. +supplied: eval gawk $opts -- '"$processed_program"' '"$@"' @@ -18854,15 +18958,15 @@ One word is an anagram of another if both words contain the same letters Column 2, Problem C, of Jon Bentley's `Programming Pearls', Second Edition, presents an elegant algorithm. The idea is to give words that are anagrams a common signature, sort all the words together by their -signature, and then print them. Dr. Bentley observes that taking the -letters in each word and sorting them produces that common signature. +signatures, and then print them. Dr. Bentley observes that taking the +letters in each word and sorting them produces those common signatures. 
The following program uses arrays of arrays to bring together words with the same signature and array sorting to print the words in sorted order: - # anagram.awk --- An implementation of the anagram finding algorithm - # from Jon Bentley's "Programming Pearls", 2nd edition. + # anagram.awk --- An implementation of the anagram-finding algorithm + # from Jon Bentley's "Programming Pearls," 2nd edition. # Addison Wesley, 2000, ISBN 0-201-65788-0. # Column 2, Problem C, section 2.8, pp 18-20. @@ -18882,7 +18986,7 @@ signature; the second dimension is the word itself: apart into individual letters, sorts the letters, and then joins them back together: - # word2key --- split word apart into letters, sort, joining back together + # word2key --- split word apart into letters, sort, and join back together function word2key(word, a, i, n, result) { @@ -18980,12 +19084,13 @@ File: gawk.info, Node: Programs Summary, Next: Programs Exercises, Prev: Misc characters. The ability to use `split()' with the empty string as the separator can considerably simplify such tasks. - * The library functions from *note Library Functions::, proved their - usefulness for a number of real (if small) programs. + * The examples here demonstrate the usefulness of the library + functions from *note Library Functions::, for a number of real (if + small) programs. * Besides reinventing POSIX wheels, other programs solved a - selection of interesting problems, such as finding duplicates - words in text, printing mailing labels, and finding anagrams. + selection of interesting problems, such as finding duplicate words + in text, printing mailing labels, and finding anagrams. @@ -19102,16 +19207,16 @@ File: gawk.info, Node: Advanced Features, Next: Internationalization, Prev: S This major node discusses advanced features in `gawk'. It's a bit of a "grab bag" of items that are otherwise unrelated to each other. 
-First, a command-line option allows `gawk' to recognize nondecimal -numbers in input data, not just in `awk' programs. Then, `gawk''s -special features for sorting arrays are presented. Next, two-way I/O, -discussed briefly in earlier parts of this Info file, is described in -full detail, along with the basics of TCP/IP networking. Finally, -`gawk' can "profile" an `awk' program, making it possible to tune it -for performance. +First, we look at a command-line option that allows `gawk' to recognize +nondecimal numbers in input data, not just in `awk' programs. Then, +`gawk''s special features for sorting arrays are presented. Next, +two-way I/O, discussed briefly in earlier parts of this Info file, is +described in full detail, along with the basics of TCP/IP networking. +Finally, we see how `gawk' can "profile" an `awk' program, making it +possible to tune it for performance. - A number of advanced features require separate major nodes of their -own: + Additional advanced features are discussed in separate major nodes +of their own: * *note Internationalization::, discusses how to internationalize your `awk' programs, so that they can speak multiple national @@ -19185,7 +19290,7 @@ File: gawk.info, Node: Array Sorting, Next: Two-way I/O, Prev: Nondecimal Dat 12.2 Controlling Array Traversal and Array Sorting ================================================== -`gawk' lets you control the order in which a `for (i in array)' loop +`gawk' lets you control the order in which a `for (INDX in ARRAY)' loop traverses an array. In addition, two built-in functions, `asort()' and `asorti()', let @@ -19204,9 +19309,9 @@ File: gawk.info, Node: Controlling Array Traversal, Next: Array Sorting Functi 12.2.1 Controlling Array Traversal ---------------------------------- -By default, the order in which a `for (i in array)' loop scans an array -is not defined; it is generally based upon the internal implementation -of arrays inside `awk'. 
+By default, the order in which a `for (INDX in ARRAY)' loop scans an +array is not defined; it is generally based upon the internal +implementation of arrays inside `awk'. Often, though, it is desirable to be able to loop over the elements in a particular order that you, the programmer, choose. `gawk' lets @@ -19228,21 +19333,22 @@ arguments: RETURN < 0; 0; OR > 0 } - Here, I1 and I2 are the indices, and V1 and V2 are the corresponding -values of the two elements being compared. Either V1 or V2, or both, -can be arrays if the array being traversed contains subarrays as values. -(*Note Arrays of Arrays::, for more information about subarrays.) The -three possible return values are interpreted as follows: + Here, `i1' and `i2' are the indices, and `v1' and `v2' are the +corresponding values of the two elements being compared. Either `v1' +or `v2', or both, can be arrays if the array being traversed contains +subarrays as values. (*Note Arrays of Arrays::, for more information +about subarrays.) The three possible return values are interpreted as +follows: `comp_func(i1, v1, i2, v2) < 0' - Index I1 comes before index I2 during loop traversal. + Index `i1' comes before index `i2' during loop traversal. `comp_func(i1, v1, i2, v2) == 0' - Indices I1 and I2 come together but the relative order with + Indices `i1' and `i2' come together, but the relative order with respect to each other is undefined. `comp_func(i1, v1, i2, v2) > 0' - Index I1 comes after index I2 during loop traversal. + Index `i1' comes after index `i2' during loop traversal. Our first comparison function can be used to scan an array in numerical order of the indices: @@ -19385,7 +19491,7 @@ elements compare equal. This is usually not a problem, but letting the tied elements come out in arbitrary order can be an issue, especially when comparing item values. The partial ordering of the equal elements may change the next time the array is traversed, if other elements are -added or removed from the array. 
One way to resolve ties when +added to or removed from the array. One way to resolve ties when comparing elements with otherwise equal values is to include the indices in the comparison rules. Note that doing this may make the loop traversal less efficient, so consider it only if necessary. The @@ -19419,14 +19525,14 @@ lowercase letters as equivalent or distinct. Another point to keep in mind is that in the case of subarrays, the element values can themselves be arrays; a production comparison -function should use the `isarray()' function (*note Type Functions::), +function should use the `isarray()' function (*note Type Functions::) to check for this, and choose a defined sorting order for subarrays. All sorting based on `PROCINFO["sorted_in"]' is disabled in POSIX mode, because the `PROCINFO' array is not special in that case. As a side note, sorting the array indices before traversing the -array has been reported to add 15% to 20% overhead to the execution +array has been reported to add a 15% to 20% overhead to the execution time of `awk' programs. For this reason, sorted array traversal is not the default. @@ -19475,8 +19581,8 @@ array is not affected. Often, what's needed is to sort on the values of the _indices_ instead of the values of the elements. To do that, use the `asorti()' function. The interface and behavior are identical to that of -`asort()', except that the index values are used for sorting, and -become the values of the result array: +`asort()', except that the index values are used for sorting and become +the values of the result array: { source[$0] = some_func($0) } @@ -19508,8 +19614,8 @@ chooses_, taking into account just the indices, just the values, or both. This is extremely powerful. 
Once the array is sorted, `asort()' takes the _values_ in their -final order, and uses them to fill in the result array, whereas -`asorti()' takes the _indices_ in their final order, and uses them to +final order and uses them to fill in the result array, whereas +`asorti()' takes the _indices_ in their final order and uses them to fill in the result array. NOTE: Copying array indices and elements isn't expensive in terms @@ -19707,7 +19813,7 @@ REMOTE-PORT name. NOTE: Failure in opening a two-way socket will result in a - non-fatal error being returned to the calling code. The value of + nonfatal error being returned to the calling code. The value of `ERRNO' indicates the error (*note Auto-set::). Consider the following very simple example: @@ -19788,8 +19894,8 @@ First, the `awk' program: junk Here is the `awkprof.out' that results from running the `gawk' -profiler on this program and data. (This example also illustrates that -`awk' programmers sometimes get up very early in the morning to work.) +profiler on this program and data (this example also illustrates that +`awk' programmers sometimes get up very early in the morning to work): # gawk profile, created Mon Sep 29 05:16:21 2014 @@ -19842,7 +19948,7 @@ profiler on this program and data. (This example also illustrates that output. They are as follows: * The program is printed in the order `BEGIN' rules, `BEGINFILE' - rules, pattern/action rules, `ENDFILE' rules, `END' rules and + rules, pattern-action rules, `ENDFILE' rules, `END' rules, and functions, listed alphabetically. Multiple `BEGIN' and `END' rules retain their separate identities, as do multiple `BEGINFILE' and `ENDFILE' rules. @@ -19887,13 +19993,13 @@ output. They are as follows: scalar, it gets parenthesized. * `gawk' supplies leading comments in front of the `BEGIN' and `END' - rules, the `BEGINFILE' and `ENDFILE' rules, the pattern/action + rules, the `BEGINFILE' and `ENDFILE' rules, the pattern-action rules, and the functions. 
The profiled version of your program may not look exactly like what you typed when you wrote it. This is because `gawk' creates the -profiled version by "pretty printing" its internal representation of +profiled version by "pretty-printing" its internal representation of the program. The advantage to this is that `gawk' can produce a standard representation. Also, things such as: @@ -19943,15 +20049,15 @@ output profile file. produces the profile and the function call trace and then exits. When `gawk' runs on MS-Windows systems, it uses the `INT' and `QUIT' -signals for producing the profile and, in the case of the `INT' signal, +signals for producing the profile, and in the case of the `INT' signal, `gawk' exits. This is because these systems don't support the `kill' command, so the only signals you can deliver to a program are those generated by the keyboard. The `INT' signal is generated by the -`Ctrl-<C>' or `Ctrl-<BREAK>' key, while the `QUIT' signal is generated -by the `Ctrl-<\>' key. +`Ctrl-c' or `Ctrl-BREAK' key, while the `QUIT' signal is generated by +the `Ctrl-\' key. Finally, `gawk' also accepts another option, `--pretty-print'. When -called this way, `gawk' "pretty prints" the program into `awkprof.out', +called this way, `gawk' "pretty-prints" the program into `awkprof.out', without any execution counts. NOTE: Once upon a time, the `--pretty-print' option would also run @@ -20003,7 +20109,7 @@ File: gawk.info, Node: Advanced Features Summary, Prev: Profiling, Up: Advanc two-way communications. * By using special file names with the `|&' operator, you can open a - TCP/IP (or UDP/IP) connection to remote hosts in the Internet. + TCP/IP (or UDP/IP) connection to remote hosts on the Internet. `gawk' supports both IPv4 and IPv6. * You can generate statement count profiles of your program. 
This @@ -20012,7 +20118,7 @@ File: gawk.info, Node: Advanced Features Summary, Prev: Profiling, Up: Advanc `USR1' signal while profiling causes `gawk' to dump the profile and keep going, including a function call stack. - * You can also just "pretty print" the program. This currently also + * You can also just "pretty-print" the program. This currently also runs the program, but that will change in the next major release. @@ -20056,7 +20162,7 @@ File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internati "Internationalization" means writing (or modifying) a program once, in such a way that it can use multiple languages without requiring further -source-code changes. "Localization" means providing the data necessary +source code changes. "Localization" means providing the data necessary for an internationalized program to work in a particular language. Most typically, these terms refer to features such as the language used for printing error messages, the language used to read responses, and @@ -20070,7 +20176,7 @@ File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N ================== `gawk' uses GNU `gettext' to provide its internationalization features. -The facilities in GNU `gettext' focus on messages; strings printed by a +The facilities in GNU `gettext' focus on messages: strings printed by a program, either directly or via formatting with `printf' or `sprintf()'.(1) @@ -20199,8 +20305,7 @@ File: gawk.info, Node: Programmer i18n, Next: Translator i18n, Prev: Explaini 13.3 Internationalizing `awk' Programs ====================================== -`gawk' provides the following variables and functions for -internationalization: +`gawk' provides the following variables for internationalization: `TEXTDOMAIN' This variable indicates the application's text domain. For @@ -20212,6 +20317,8 @@ internationalization: for translation at runtime. String constants without a leading underscore are not translated. 
+ `gawk' provides the following functions for internationalization: + ``dcgettext(STRING' [`,' DOMAIN [`,' CATEGORY]]`)'' Return the translation of STRING in text domain DOMAIN for locale category CATEGORY. The default value for DOMAIN is the current @@ -20250,8 +20357,7 @@ internationalization: the null string (`""'), then `bindtextdomain()' returns the current binding for the given DOMAIN. - To use these facilities in your `awk' program, follow the steps -outlined in *note Explaining gettext::, like so: + To use these facilities in your `awk' program, follow these steps: 1. Set the variable `TEXTDOMAIN' to the text domain of your program. This is best done in a `BEGIN' rule (*note BEGIN/END::), or it can @@ -20473,7 +20579,7 @@ actually almost portable, requiring very little change: its value, leaving the original string constant as the result. * By defining "dummy" functions to replace `dcgettext()', - `dcngettext()' and `bindtextdomain()', the `awk' program can be + `dcngettext()', and `bindtextdomain()', the `awk' program can be made to run, but all the messages are output in the original language. For example: @@ -20608,9 +20714,9 @@ File: gawk.info, Node: Gawk I18N, Next: I18N Summary, Prev: I18N Example, Up `gawk' itself has been internationalized using the GNU `gettext' package. (GNU `gettext' is described in complete detail in *note (GNU -`gettext' utilities)Top:: gettext, GNU gettext tools.) As of this -writing, the latest version of GNU `gettext' is version 0.19.3 -(ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.3.tar.gz). +`gettext' utilities)Top:: gettext, GNU `gettext' utilities.) As of +this writing, the latest version of GNU `gettext' is version 0.19.4 +(ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.4.tar.gz). If a translation of `gawk''s messages exists, then `gawk' produces usage messages, warnings, and fatal errors in the local language. 
@@ -20622,7 +20728,7 @@ File: gawk.info, Node: I18N Summary, Prev: Gawk I18N, Up: Internationalizatio ============ * Internationalization means writing a program such that it can use - multiple languages without requiring source-code changes. + multiple languages without requiring source code changes. Localization means providing the data necessary for an internationalized program to work in a particular language. @@ -20636,10 +20742,10 @@ File: gawk.info, Node: I18N Summary, Prev: Gawk I18N, Up: Internationalizatio file, and the `.po' files are compiled into `.gmo' files for use at runtime. - * You can use position specifications with `sprintf()' and `printf' - to rearrange the placement of argument values in formatted strings - and output. This is useful for the translations of format control - strings. + * You can use positional specifications with `sprintf()' and + `printf' to rearrange the placement of argument values in formatted + strings and output. This is useful for the translation of format + control strings. * The internationalization features have been designed so that they can be easily worked around in a standard `awk'. @@ -20695,8 +20801,7 @@ File: gawk.info, Node: Debugging Concepts, Next: Debugging Terms, Up: Debuggi --------------------------- (If you have used debuggers in other languages, you may want to skip -ahead to the next section on the specific features of the `gawk' -debugger.) +ahead to *note Awk Debugging::.) Of course, a debugging program cannot remove bugs for you, because it has no way of knowing what you or your users consider a "bug" versus @@ -20783,11 +20888,11 @@ defines terms used throughout the rest of this major node: File: gawk.info, Node: Awk Debugging, Prev: Debugging Terms, Up: Debugging -14.1.3 Awk Debugging --------------------- +14.1.3 `awk' Debugging +---------------------- Debugging an `awk' program has some specific aspects that are not -shared with other programming languages. 
+shared with programs written in other languages. First of all, the fact that `awk' programs usually take input line by line from a file or files and operate on those lines using specific @@ -20805,8 +20910,8 @@ commands. File: gawk.info, Node: Sample Debugging Session, Next: List of Debugger Commands, Prev: Debugging, Up: Debugger -14.2 Sample Debugging Session -============================= +14.2 Sample `gawk' Debugging Session +==================================== In order to illustrate the use of `gawk' as a debugger, let's look at a sample debugging session. We will use the `awk' implementation of the @@ -20825,8 +20930,8 @@ File: gawk.info, Node: Debugger Invocation, Next: Finding The Bug, Up: Sample -------------------------------- Starting the debugger is almost exactly like running `gawk' normally, -except you have to pass an additional option `--debug', or the -corresponding short option `-D'. The file(s) containing the program +except you have to pass an additional option, `--debug', or the +corresponding short option, `-D'. The file(s) containing the program and any supporting code are given on the command line as arguments to one or more `-f' options. (`gawk' is not designed to debug command-line programs, only programs contained in files.) In our case, we invoke @@ -20836,7 +20941,7 @@ the debugger like this: where both `getopt.awk' and `uniq.awk' are in `$AWKPATH'. (Experienced users of GDB or similar debuggers should note that this syntax is -slightly different from what they are used to. With the `gawk' +slightly different from what you are used to. With the `gawk' debugger, you give the arguments for running the program in the command line to the debugger rather than as part of the `run' command at the debugger prompt.) The `-1' is an option to `uniq.awk'. 
@@ -20960,10 +21065,10 @@ typing `n' (for "next"): -| 66 if (fcount > 0) { This tells us that `gawk' is now ready to execute line 66, which -decides whether to give the lines the special "field skipping" treatment +decides whether to give the lines the special "field-skipping" treatment indicated by the `-1' command-line option. (Notice that we skipped -from where we were before at line 63 to here, because the condition in -line 63 `if (fcount == 0 && charcount == 0)' was false.) +from where we were before, at line 63, to here, because the condition +in line 63, `if (fcount == 0 && charcount == 0)', was false.) Continuing to step, we now get to the splitting of the current and last records: @@ -21021,15 +21126,15 @@ mentioned): Well, here we are at our error (sorry to spoil the suspense). What we had in mind was to join the fields starting from the second one to -make the virtual record to compare, and if the first field was numbered -zero, this would work. Let's look at what we've got: +make the virtual record to compare, and if the first field were +numbered zero, this would work. Let's look at what we've got: gawk> p cline clast -| cline = "gawk is a wonderful program!" -| clast = "awk is a wonderful program!" Hey, those look pretty familiar! They're just our original, -unaltered, input records. A little thinking (the human brain is still +unaltered input records. A little thinking (the human brain is still the best debugging tool), and we realize that we were off by one! We get out of the debugger: @@ -21066,11 +21171,11 @@ categories: * Miscellaneous Each of these are discussed in the following subsections. In the -following descriptions, commands which may be abbreviated show the +following descriptions, commands that may be abbreviated show the abbreviation on a second description line. A debugger command name may also be truncated if that partial name is unambiguous. 
The debugger has the built-in capability to automatically repeat the previous command -just by hitting <Enter>. This works for the commands `list', `next', +just by hitting `Enter'. This works for the commands `list', `next', `nexti', `step', `stepi', and `continue' executed without any argument. * Menu: @@ -21110,8 +21215,8 @@ The commands for controlling breakpoints are: Set a breakpoint at entry to (the first instruction of) function FUNCTION. - Each breakpoint is assigned a number which can be used to delete - it from the breakpoint list using the `delete' command. + Each breakpoint is assigned a number that can be used to delete it + from the breakpoint list using the `delete' command. With a breakpoint, you may also supply a condition. This is an `awk' expression (enclosed in double quotes) that the debugger @@ -21149,26 +21254,26 @@ The commands for controlling breakpoints are: `delete' [N1 N2 ...] [N-M] `d' [N1 N2 ...] [N-M] - Delete specified breakpoints or a range of breakpoints. Deletes - all defined breakpoints if no argument is supplied. + Delete specified breakpoints or a range of breakpoints. Delete all + defined breakpoints if no argument is supplied. `disable' [N1 N2 ... | N-M] Disable specified breakpoints or a range of breakpoints. Without - any argument, disables all breakpoints. + any argument, disable all breakpoints. `enable' [`del' | `once'] [N1 N2 ...] [N-M] `e' [`del' | `once'] [N1 N2 ...] [N-M] Enable specified breakpoints or a range of breakpoints. Without - any argument, enables all breakpoints. Optionally, you can - specify how to enable the breakpoint: + any argument, enable all breakpoints. Optionally, you can specify + how to enable the breakpoints: `del' - Enable the breakpoint(s) temporarily, then delete it when the - program stops at the breakpoint. + Enable the breakpoints temporarily, then delete each one when + the program stops at it. 
`once' - Enable the breakpoint(s) temporarily, then disable it when - the program stops at the breakpoint. + Enable the breakpoints temporarily, then disable each one when + the program stops at it. `ignore' N COUNT Ignore breakpoint number N the next COUNT times it is hit. @@ -21214,7 +21319,7 @@ execution of the program than we saw in our earlier example: `continue' [COUNT] `c' [COUNT] Resume program execution. If continued from a breakpoint and COUNT - is specified, ignores the breakpoint at that location the next + is specified, ignore the breakpoint at that location the next COUNT times before stopping. `finish' @@ -21249,10 +21354,10 @@ execution of the program than we saw in our earlier example: `step' [COUNT] `s' [COUNT] Continue execution until control reaches a different source line - in the current stack frame. `step' steps inside any function - called within the line. If the argument COUNT is supplied, steps - that many times before stopping, unless it encounters a breakpoint - or watchpoint. + in the current stack frame, stepping inside any function called + within the line. If the argument COUNT is supplied, steps that + many times before stopping, unless it encounters a breakpoint or + watchpoint. `stepi' [COUNT] `si' [COUNT] @@ -21333,13 +21438,13 @@ AWK STATEMENTS (`"'...`"'). You can also set special `awk' variables, such as `FS', `NF', - `NR', and son on. + `NR', and so on. `watch' VAR | `$'N [`"EXPRESSION"'] `w' VAR | `$'N [`"EXPRESSION"'] Add variable VAR (or field `$N') to the watch list. The debugger then stops whenever the value of the variable or field changes. - Each watched item is assigned a number which can be used to delete + Each watched item is assigned a number that can be used to delete it from the watch list using the `unwatch' command. With a watchpoint, you may also supply a condition. 
This is an @@ -21363,11 +21468,11 @@ File: gawk.info, Node: Execution Stack, Next: Debugger Info, Prev: Viewing An 14.3.4 Working with the Stack ----------------------------- -Whenever you run a program which contains any function calls, `gawk' +Whenever you run a program that contains any function calls, `gawk' maintains a stack of all of the function calls leading up to where the program is right now. You can see how you got to where you are, and also move around in the stack to see what the state of things was in the -functions which called the one you are in. The commands for doing this +functions that called the one you are in. The commands for doing this are: `backtrace' [COUNT] @@ -21387,8 +21492,8 @@ are: `frame' [N] `f' [N] Select and print stack frame N. Frame 0 is the currently - executing, or "innermost", frame (function call), frame 1 is the - frame that called the innermost one. The highest numbered frame is + executing, or "innermost", frame (function call); frame 1 is the + frame that called the innermost one. The highest-numbered frame is the one for the main program. The printed information consists of the frame number, function and argument names, source file, and the source line. @@ -21405,7 +21510,7 @@ File: gawk.info, Node: Debugger Info, Next: Miscellaneous Debugger Commands, Besides looking at the values of variables, there is often a need to get other sorts of information about the state of your program and of the -debugging environment itself. The `gawk' debugger has one command which +debugging environment itself. The `gawk' debugger has one command that provides this information, appropriately called `info'. `info' is used with one of a number of arguments that tell it exactly what you want to know: @@ -21462,11 +21567,12 @@ from a file. The commands are: option. 
The available options are: `history_size' - The maximum number of lines to keep in the history file + Set the maximum number of lines to keep in the history file `./.gawk_history'. The default is 100. `listsize' - The number of lines that `list' prints. The default is 15. + Specify the number of lines that `list' prints. The default + is 15. `outfile' Send `gawk' output to a file; debugger output still goes to @@ -21474,7 +21580,7 @@ from a file. The commands are: standard output. `prompt' - The debugger prompt. The default is `gawk> '. + Change the debugger prompt. The default is `gawk> '. `save_history' [`on' | `off'] Save command history to file `./.gawk_history'. The default @@ -21482,8 +21588,8 @@ from a file. The commands are: `save_options' [`on' | `off'] Save current options to file `./.gawkrc' upon exit. The - default is `on'. Options are read back in to the next - session upon startup. + default is `on'. Options are read back into the next session + upon startup. `trace' [`on' | `off'] Turn instruction tracing on or off. The default is `off'. @@ -21502,7 +21608,7 @@ from a file. The commands are: commands; however, the `gawk' debugger will not source the same file more than once in order to avoid infinite recursion. - In addition to, or instead of the `source' command, you can use + In addition to, or instead of, the `source' command, you can use the `-D FILE' or `--debug=FILE' command-line options to execute commands from a file non-interactively (*note Options::). @@ -21512,13 +21618,13 @@ File: gawk.info, Node: Miscellaneous Debugger Commands, Prev: Debugger Info, 14.3.6 Miscellaneous Commands ----------------------------- -There are a few more commands which do not fit into the previous +There are a few more commands that do not fit into the previous categories, as follows: `dump' [FILENAME] - Dump bytecode of the program to standard output or to the file + Dump byte code of the program to standard output or to the file named in FILENAME. 
This prints a representation of the internal - instructions which `gawk' executes to implement the `awk' commands + instructions that `gawk' executes to implement the `awk' commands in a program. This can be very enlightening, as the following partial dump of Davide Brini's obfuscated code (*note Signature Program::) demonstrates: @@ -21602,22 +21708,21 @@ categories, as follows: FILENAME. This command may change the current source file. FUNCTION - Print lines centered around beginning of the function + Print lines centered around the beginning of the function FUNCTION. This command may change the current source file. `quit' `q' Exit the debugger. Debugging is great fun, but sometimes we all have to tend to other obligations in life, and sometimes we find - the bug, and are free to go on to the next one! As we saw - earlier, if you are running a program, the debugger warns you if - you accidentally type `q' or `quit', to make sure you really want - to quit. + the bug and are free to go on to the next one! As we saw earlier, + if you are running a program, the debugger warns you when you type + `q' or `quit', to make sure you really want to quit. `trace' [`on' | `off'] - Turn on or off a continuous printing of instructions which are - about to be executed, along with printing the `awk' line which they - implement. The default is `off'. + Turn on or off continuous printing of the instructions that are + about to be executed, along with the `awk' lines they implement. + The default is `off'. 
It is to be hoped that most of the "opcodes" in these instructions are fairly self-explanatory, and using `stepi' and `nexti' while @@ -21630,7 +21735,7 @@ File: gawk.info, Node: Readline Support, Next: Limitations, Prev: List of Deb 14.4 Readline Support ===================== -If `gawk' is compiled with the `readline' library +If `gawk' is compiled with the GNU Readline library (http://cnswww.cns.cwru.edu/php/chet/readline/readline.html), you can take advantage of that library's command completion and history expansion features. The following types of completion are available: @@ -21660,7 +21765,7 @@ File: gawk.info, Node: Limitations, Next: Debugging Summary, Prev: Readline S We hope you find the `gawk' debugger useful and enjoyable to work with, but as with any program, especially in its early releases, it still has -some limitations. A few which are worth being aware of are: +some limitations. A few that it's worth being aware of are: * At this point, the debugger does not give a detailed explanation of what you did wrong when you type in something it doesn't like. @@ -21671,13 +21776,13 @@ some limitations. A few which are worth being aware of are: Commands:: (or if you are already familiar with `gawk' internals), you will realize that much of the internal manipulation of data in `gawk', as in many interpreters, is done on a stack. `Op_push', - `Op_pop', and the like, are the "bread and butter" of most `gawk' + `Op_pop', and the like are the "bread and butter" of most `gawk' code. Unfortunately, as of now, the `gawk' debugger does not allow you to examine the stack's contents. That is, the intermediate results of expression evaluation are on the stack, but cannot be - printed. Rather, only variables which are defined in the program + printed. Rather, only variables that are defined in the program can be printed. 
Of course, a workaround for this is to use more explicit variables at the debugging stage and then change back to obscure, perhaps more optimal code later. @@ -21689,12 +21794,12 @@ some limitations. A few which are worth being aware of are: * The `gawk' debugger is designed to be used by running a program (with all its parameters) on the command line, as described in *note Debugger Invocation::. There is no way (as of now) to - attach or "break in" to a running program. This seems reasonable - for a language which is used mainly for quickly executing, short + attach or "break into" a running program. This seems reasonable + for a language that is used mainly for quickly executing, short programs. - * The `gawk' debugger only accepts source supplied with the `-f' - option. + * The `gawk' debugger only accepts source code supplied with the + `-f' option. File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger @@ -21703,8 +21808,8 @@ File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger ============ * Programs rarely work correctly the first time. Finding bugs is - "debugging" and a program that helps you find bugs is a - "debugger". `gawk' has a built-in debugger that works very + called debugging, and a program that helps you find bugs is a + debugger. `gawk' has a built-in debugger that works very similarly to the GNU Debugger, GDB. * Debuggers let you step through your program one statement at a @@ -21720,8 +21825,8 @@ File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger breakpoints, execution, viewing and changing data, working with the stack, getting information, and other tasks. - * If the `readline' library is available when `gawk' is compiled, it - is used by the debugger to provide command-line history and + * If the GNU Readline library is available when `gawk' is compiled, + it is used by the debugger to provide command-line history and editing. 
@@ -21782,7 +21887,7 @@ Decimal arithmetic sides) of the decimal point, and the results of a computation are always exact. - Some modern system can do decimal arithmetic in hardware, but + Some modern systems can do decimal arithmetic in hardware, but usually you need a special software library to provide access to these instructions. There are also libraries that do decimal arithmetic entirely in software. @@ -21798,8 +21903,7 @@ Integer arithmetic In computers, integer values come in two flavors: "signed" and "unsigned". Signed values may be negative or positive, whereas - unsigned values are always positive (i.e., greater than or equal - to zero). + unsigned values are always greater than or equal to zero. In computer systems, integer arithmetic is exact, but the possible range of values is limited. Integer arithmetic is generally @@ -21836,12 +21940,6 @@ Numeric representation Minimum value Maximum value 32-bit unsigned integer 0 4,294,967,295 64-bit signed integer -9,223,372,036,854,775,8089,223,372,036,854,775,807 64-bit unsigned integer 0 18,446,744,073,709,551,615 -Single-precision `1.175494e-38' `3.402823e+38' -floating point -(approximate) -Double-precision `2.225074e-308' `1.797693e+308' -floating point -(approximate) Table 15.1: Value ranges for different numeric representations @@ -21857,7 +21955,7 @@ File: gawk.info, Node: Math Definitions, Next: MPFR features, Prev: Computer The rest of this major node uses a number of terms. Here are some informal definitions that should help you work your way through the -material here. +material here: "Accuracy" A floating-point calculation's accuracy is how close it comes to @@ -21877,7 +21975,7 @@ material here. number and infinity produce infinity. "NaN" - "Not A Number."(1) A special value that results from attempting a + "Not a number."(1) A special value that results from attempting a calculation that has no answer as a real number. 
In such a case, programs can either receive a floating-point exception, or get `NaN' back as the result. The IEEE 754 standard recommends that @@ -21903,15 +22001,15 @@ material here. PREC = 3.322 * DPS - Here, PREC denotes the binary precision (measured in bits) and DPS - (short for decimal places) is the decimal digits. + Here, _prec_ denotes the binary precision (measured in bits) and + _dps_ (short for decimal places) is the decimal digits. "Rounding mode" How numbers are rounded up or down when necessary. More details are provided later. "Significand" - A floating-point value consists the significand multiplied by 10 + A floating-point value consists of the significand multiplied by 10 to the power of the exponent. For example, in `1.2345e67', the significand is `1.2345'. @@ -21933,7 +22031,7 @@ precision formats to allow greater precisions and larger exponent ranges. (`awk' uses only the 64-bit double-precision format.) *note table-ieee-formats:: lists the precision and exponent field -values for the basic IEEE 754 binary formats: +values for the basic IEEE 754 binary formats. Name Total bits Precision Minimum Maximum exponent exponent @@ -21967,7 +22065,7 @@ so: $ gawk --version -| GNU Awk 4.1.2, API: 1.1 (GNU MPFR 3.1.0-p3, GNU MP 5.0.2) - -| Copyright (C) 1989, 1991-2014 Free Software Foundation. + -| Copyright (C) 1989, 1991-2015 Free Software Foundation. ... (You may see different version numbers than what's shown here. That's @@ -21998,7 +22096,7 @@ File: gawk.info, Node: FP Math Caution, Next: Arbitrary Precision Integers, P Math class is tough! -- Teen Talk Barbie, July 1992 - This minor node provides a high level overview of the issues + This minor node provides a high-level overview of the issues involved when doing lots of floating-point arithmetic.(1) The discussion applies to both hardware and arbitrary-precision floating-point arithmetic. @@ -22019,8 +22117,8 @@ floating-point arithmetic. 
(1) There is a very nice paper on floating-point arithmetic (http://www.validlab.com/goldberg/paper.pdf) by David Goldberg, "What -Every Computer Scientist Should Know About Floating-point Arithmetic," -`ACM Computing Surveys' *23*, 1 (1991-03), 5-48. This is worth reading +Every Computer Scientist Should Know About Floating-Point Arithmetic," +`ACM Computing Surveys' *23*, 1 (1991-03): 5-48. This is worth reading if you are interested in the details, but it does require a background in computer science. @@ -22074,7 +22172,7 @@ number as you assigned to it: Often the error is so small you do not even notice it, and if you do, you can always specify how much precision you would like in your output. -Usually this is a format string like `"%.15g"', which when used in the +Usually this is a format string like `"%.15g"', which, when used in the previous example, produces an output identical to the input. @@ -22114,7 +22212,7 @@ File: gawk.info, Node: Errors accumulate, Prev: Comparing FP Values, Up: Inex The loss of accuracy during a single computation with floating-point numbers usually isn't enough to worry about. However, if you compute a -value which is the result of a sequence of floating-point operations, +value that is the result of a sequence of floating-point operations, the error can accumulate and greatly affect the computation itself. Here is an attempt to compute the value of pi using one of its many series representations: @@ -22165,7 +22263,7 @@ easy answers. The standard rules of algebra often do not apply when using floating-point arithmetic. Among other things, the distributive and associative laws do not hold completely, and order of operation may be important for your computation. Rounding error, cumulative precision -loss and underflow are often troublesome. +loss, and underflow are often troublesome. 
When `gawk' tests the expressions `0.1 + 12.2' and `12.3' for equality using the machine double-precision arithmetic, it decides that @@ -22200,8 +22298,9 @@ illustrated by our earlier attempt to compute the value of pi. Extra precision can greatly enhance the stability and the accuracy of your computation in such cases. - Repeated addition is not necessarily equivalent to multiplication in -floating-point arithmetic. In the example in *note Errors accumulate::: + Additionally, you should understand that repeated addition is not +necessarily equivalent to multiplication in floating-point arithmetic. +In the example in *note Errors accumulate::: $ gawk 'BEGIN { > for (d = 1.1; d <= 1.5; d += 0.1) # loop five times (?) @@ -22256,7 +22355,7 @@ set the value to one of the predefined case-insensitive strings shown in *note table-predefined-precision-strings::, to emulate an IEEE 754 binary format. -`PREC' IEEE 754 Binary Format +`PREC' IEEE 754 binary format --------------------------------------------------- `"half"' 16-bit half-precision `"single"' Basic 32-bit single precision @@ -22289,14 +22388,14 @@ on arithmetic operations: example illustrates the differences among various ways to print a floating-point constant: - $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 0.1) }' - -| 0.1000000000000000055511151 - $ gawk -M -v PREC=113 'BEGIN { printf("%0.25f\n", 0.1) }' - -| 0.1000000000000000000000000 - $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", "0.1") }' - -| 0.1000000000000000000000000 - $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 1/10) }' - -| 0.1000000000000000000000000 + $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 0.1) }' + -| 0.1000000000000000055511151 + $ gawk -M -v PREC=113 'BEGIN { printf("%0.25f\n", 0.1) }' + -| 0.1000000000000000000000000 + $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", "0.1") }' + -| 0.1000000000000000000000000 + $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 1/10) }' + -| 0.1000000000000000000000000 File: 
gawk.info, Node: Setting the rounding mode, Prev: Setting precision, Up: FP Math Caution @@ -22304,15 +22403,15 @@ File: gawk.info, Node: Setting the rounding mode, Prev: Setting precision, Up 15.4.5 Setting the Rounding Mode -------------------------------- -The `ROUNDMODE' variable provides program level control over the +The `ROUNDMODE' variable provides program-level control over the rounding mode. The correspondence between `ROUNDMODE' and the IEEE rounding modes is shown in *note table-gawk-rounding-modes::. -Rounding Mode IEEE Name `ROUNDMODE' +Rounding mode IEEE name `ROUNDMODE' --------------------------------------------------------------------------- Round to nearest, ties to even `roundTiesToEven' `"N"' or `"n"' -Round toward plus Infinity `roundTowardPositive' `"U"' or `"u"' -Round toward negative Infinity `roundTowardNegative' `"D"' or `"d"' +Round toward positive infinity `roundTowardPositive' `"U"' or `"u"' +Round toward negative infinity `roundTowardNegative' `"D"' or `"d"' Round toward zero `roundTowardZero' `"Z"' or `"z"' Round to nearest, ties away `roundTiesToAway' `"A"' or `"a"' from zero @@ -22363,8 +22462,8 @@ distributes upward and downward rounds of exact halves, which might cause any accumulating round-off error to cancel itself out. This is the default rounding mode for IEEE 754 computing functions and operators. - The other rounding modes are rarely used. Round toward positive -infinity (`roundTowardPositive') and round toward negative infinity + The other rounding modes are rarely used. Rounding toward positive +infinity (`roundTowardPositive') and toward negative infinity (`roundTowardNegative') are often used to implement interval arithmetic, where you adjust the rounding mode to calculate upper and lower bounds for the range of output. 
The `roundTowardZero' mode can be @@ -22412,7 +22511,7 @@ floating-point values: If instead you were to compute the same value using arbitrary-precision floating-point values, the precision needed for -correct output (using the formula `prec = 3.322 * dps'), would be 3.322 +correct output (using the formula `prec = 3.322 * dps') would be 3.322 x 183231, or 608693. The result from an arithmetic operation with an integer and a @@ -22443,7 +22542,7 @@ interface to process arbitrary-precision integers or mixed-mode numbers as needed by an operation or function. In such a case, the precision is set to the minimum value necessary for exact conversion, and the working precision is not used for this purpose. If this is not what you need or -want, you can employ a subterfuge, and convert the integer to floating +want, you can employ a subterfuge and convert the integer to floating point first, like this: gawk -M 'BEGIN { n = 13; print (n + 0.0) % 2.0 }' @@ -22506,7 +22605,7 @@ set: It's not that well known but it's not that obscure either. It's Euler's modification to Newton's method for calculating pi. Take a look at lines (23) - (25) here: - `http://mathworld.wolfram.com/PiFormulas.htm'. + `http://mathworld.wolfram.com/PiFormulas.html'. The algorithm I wrote simply expands the multiply by 2 and works from the innermost expression outwards. I used this to program HP @@ -22526,7 +22625,7 @@ File: gawk.info, Node: POSIX Floating Point Problems, Next: Floating point sum 15.6 Standards Versus Existing Practice ======================================= -Historically, `awk' has converted any non-numeric looking string to the +Historically, `awk' has converted any nonnumeric-looking string to the numeric value zero, when required. Furthermore, the original definition of the language and the original POSIX standards specified that `awk' only understands decimal numbers (base 10), and not octal @@ -22540,8 +22639,8 @@ These features are: hexadecimal notation (e.g., `0xDEADBEEF'). 
(Note: data values, _not_ source code constants.) - * Support for the special IEEE 754 floating-point values "Not A - Number" (NaN), positive Infinity ("inf"), and negative Infinity + * Support for the special IEEE 754 floating-point values "not a + number" (NaN), positive infinity ("inf"), and negative infinity ("-inf"). In particular, the format for these values is as specified by the ISO 1999 C standard, which ignores case and can allow implementation-dependent additional characters after the @@ -22558,21 +22657,21 @@ historical practice: values is also a very severe departure from historical practice. The second problem is that the `gawk' maintainer feels that this -interpretation of the standard, which requires a certain amount of +interpretation of the standard, which required a certain amount of "language lawyering" to arrive at in the first place, was not even -intended by the standard developers. In other words, "we see how you +intended by the standard developers. In other words, "We see how you got where you are, but we don't think that that's where you want to be." Recognizing these issues, but attempting to provide compatibility with the earlier versions of the standard, the 2008 POSIX standard added explicit wording to allow, but not require, that `awk' support -hexadecimal floating-point values and special values for "Not A Number" +hexadecimal floating-point values and special values for "not a number" and infinity. Although the `gawk' maintainer continues to feel that providing those features is inadvisable, nevertheless, on systems that support IEEE floating point, it seems reasonable to provide _some_ way to -support NaN and Infinity values. The solution implemented in `gawk' is +support NaN and infinity values. 
The solution implemented in `gawk' is as follows: * With the `--posix' command-line option, `gawk' becomes "hands @@ -22587,7 +22686,7 @@ as follows: $ echo 0xDeadBeef | gawk --posix '{ print $1 + 0 }' -| 3735928559 - * Without `--posix', `gawk' interprets the four strings `+inf', + * Without `--posix', `gawk' interprets the four string values `+inf', `-inf', `+nan', and `-nan' specially, producing the corresponding special numeric values. The leading sign acts a signal to `gawk' (and the user) that the value is really numeric. Hexadecimal @@ -22601,7 +22700,7 @@ as follows: $ echo 0xDeadBeef | gawk '{ print $1 + 0 }' -| 0 - `gawk' ignores case in the four special values. Thus `+nan' and + `gawk' ignores case in the four special values. Thus, `+nan' and `+NaN' are the same. ---------- Footnotes ---------- @@ -22618,9 +22717,9 @@ File: gawk.info, Node: Floating point summary, Prev: POSIX Floating Point Prob floating-point values. Standard `awk' uses double-precision floating-point values. - * In the early 1990s, Barbie mistakenly said "Math class is tough!" + * In the early 1990s Barbie mistakenly said, "Math class is tough!" Although math isn't tough, floating-point arithmetic isn't the same - as pencil and paper math, and care must be taken: + as pencil-and-paper math, and care must be taken: - Not all numbers can be represented exactly. @@ -22641,18 +22740,18 @@ File: gawk.info, Node: Floating point summary, Prev: POSIX Floating Point Prob rounding mode. * With `-M', `gawk' performs arbitrary-precision integer arithmetic - using the GMP library. This is faster and more space efficient + using the GMP library. This is faster and more space-efficient than using MPFR for the same calculations. - * There are several "dark corners" with respect to floating-point - numbers where `gawk' disagrees with the POSIX standard. It pays - to be aware of them. + * There are several areas with respect to floating-point numbers + where `gawk' disagrees with the POSIX standard. 
It pays to be + aware of them. * Overall, there is no need to be unduly suspicious about the results from floating-point arithmetic. The lesson to remember is that floating-point arithmetic is always more complex than arithmetic using pencil and paper. In order to take advantage of - the power of computer floating point, you need to know its + the power of floating-point arithmetic, you need to know its limitations and work within them. For most casual use of floating-point arithmetic, you will often get the expected result if you simply round the display of your final results to the @@ -22711,7 +22810,7 @@ the rest of this Info file. `gawk''s functionality. For example, they can provide access to system calls (such as `chdir()' to change directory) and to other C library routines that could be of use. As with most software, "the sky is the -limit;" if you can imagine something that you might want to do and can +limit"; if you can imagine something that you might want to do and can write in C or C++, you can write an extension to do it! Extensions are written in C or C++, using the "application @@ -22719,7 +22818,7 @@ programming interface" (API) defined for this purpose by the `gawk' developers. The rest of this major node explains the facilities that the API provides and how to use them, and presents a small example extension. In addition, it documents the sample extensions included in -the `gawk' distribution, and describes the `gawkextlib' project. *Note +the `gawk' distribution and describes the `gawkextlib' project. *Note Extension Design::, for a discussion of the extension mechanism goals and design. @@ -22837,7 +22936,7 @@ Example::) and also in the `testext.c' code for testing the APIs. Some other bits and pieces: * The API provides access to `gawk''s `do_XXX' values, reflecting - command-line options, like `do_lint', `do_profiling' and so on + command-line options, like `do_lint', `do_profiling', and so on (*note Extension API Variables::). 
These are informational: an extension cannot affect their values inside `gawk'. In addition, attempting to assign to them produces a compile-time error. @@ -22883,8 +22982,8 @@ File: gawk.info, Node: Extension API Functions Introduction, Next: General Dat 16.4.1 Introduction ------------------- -Access to facilities within `gawk' are made available by calling -through function pointers passed into your extension. +Access to facilities within `gawk' is achieved by calling through +function pointers passed into your extension. API function pointers are provided for the following kinds of operations: @@ -22905,7 +23004,7 @@ operations: - Two-way processors - All of these are discussed in detail, later in this major node. + All of these are discussed in detail later in this major node. * Printing fatal, warning, and "lint" warning messages. @@ -22931,7 +23030,7 @@ operations: - Clearing an array - - Flattening an array for easy C style looping over all its + - Flattening an array for easy C-style looping over all its indices and elements Some points about using the API: @@ -22940,7 +23039,7 @@ operations: `gawkapi.h'. For correct use, you must therefore include the corresponding standard header file _before_ including `gawkapi.h': - C Entity Header File + C entity Header file ------------------------------------------- `EOF' `<stdio.h>' Values for `errno' `<errno.h>' @@ -22964,13 +23063,13 @@ operations: * Although the API only uses ISO C 90 features, there is an exception; the "constructor" functions use the `inline' keyword. If your compiler does not support this keyword, you should either - place `-Dinline=''' on your command line, or use the GNU Autotools + place `-Dinline=''' on your command line or use the GNU Autotools and include a `config.h' file in your extensions. * All pointers filled in by `gawk' point to memory managed by `gawk' and should be treated by the extension as read-only. 
Memory for _all_ strings passed into `gawk' from the extension _must_ come - from calling one of `gawk_malloc()', `gawk_calloc()' or + from calling one of `gawk_malloc()', `gawk_calloc()', or `gawk_realloc()', and is managed by `gawk' from then on. * The API defines several simple `struct's that map values as seen @@ -22983,7 +23082,7 @@ operations: multibyte encoding (as defined by `LC_XXX' environment variables) and not using wide characters. This matches how `gawk' stores strings internally and also how characters are - likely to be input and output from files. + likely to be input into and output from files. * When retrieving a value (such as a parameter or that of a global variable or array element), the extension requests a specific type @@ -23020,6 +23119,8 @@ general-purpose use. Additional, more specialized, data structures are introduced in subsequent minor nodes, together with the functions that use them. + The general-purpose types and structures are as follows: + `typedef void *awk_ext_id_t;' A value of this type is received from `gawk' when an extension is loaded. That value must then be passed back to `gawk' as the @@ -23035,7 +23136,7 @@ use them. ` awk_false = 0,' ` awk_true' `} awk_bool_t;' - A simple boolean type. + A simple Boolean type. `typedef struct awk_string {' ` char *str; /* data */' @@ -23079,8 +23180,8 @@ use them. `#define array_cookie u.a' `#define scalar_cookie u.scl' `#define value_cookie u.vc' - These macros make accessing the fields of the `awk_value_t' more - readable. + Using these macros makes accessing the fields of the `awk_value_t' + more readable. `typedef void *awk_scalar_t;' Scalars can be represented as an opaque type. These values are @@ -23100,8 +23201,8 @@ indicates what is in the `union'. Representing numbers is easy--the API uses a C `double'. Strings require more work. Because `gawk' allows embedded NUL bytes in string -values, a string must be represented as a pair containing a -data-pointer and length. 
This is the `awk_string_t' type. +values, a string must be represented as a pair containing a data +pointer and length. This is the `awk_string_t' type. Identifiers (i.e., the names of global variables) can be associated with either scalar values or with arrays. In addition, `gawk' provides @@ -23113,12 +23214,12 @@ Manipulation::. of the `union' as if they were fields in a `struct'; this is a common coding practice in C. Such code is easier to write and to read, but it remains _your_ responsibility to make sure that the `val_type' member -correctly reflects the type of the value in the `awk_value_t'. +correctly reflects the type of the value in the `awk_value_t' struct. Conceptually, the first three members of the `union' (number, string, and array) are all that is needed for working with `awk' values. However, because the API provides routines for accessing and changing -the value of global scalar variables only by using the variable's name, +the value of a global scalar variable only by using the variable's name, there is a performance penalty: `gawk' must find the variable each time it is accessed and changed. This turns out to be a real issue, not just a theoretical one. @@ -23127,17 +23228,19 @@ just a theoretical one. reading and/or changing the value of one or more scalar variables, you can obtain a "scalar cookie"(1) object for that variable, and then use the cookie for getting the variable's value or for changing the -variable's value. This is the `awk_scalar_t' type and `scalar_cookie' -macro. Given a scalar cookie, `gawk' can directly retrieve or modify -the value, as required, without having to find it first. +variable's value. The `awk_scalar_t' type holds a scalar cookie, and +the `scalar_cookie' macro provides access to the value of that type in +the `awk_value_t' struct. Given a scalar cookie, `gawk' can directly +retrieve or modify the value, as required, without having to find it +first. 
The `awk_value_cookie_t' type and `value_cookie' macro are similar. If you know that you wish to use the same numeric or string _value_ for one or more variables, you can create the value once, retaining a "value cookie" for it, and then pass in that value cookie whenever you -wish to set the value of a variable. This saves both storage space -within the running `gawk' process as well as the time needed to create -the value. +wish to set the value of a variable. This saves storage space within +the running `gawk' process and reduces the time needed to create the +value. ---------- Footnotes ---------- @@ -23172,7 +23275,7 @@ prototypes, in the way that extension code would use them: `void gawk_free(void *ptr);' Call the correct version of `free()' to release storage that was - allocated with `gawk_malloc()', `gawk_calloc()' or + allocated with `gawk_malloc()', `gawk_calloc()', or `gawk_realloc()'. The API has to provide these functions because it is possible for an @@ -23184,7 +23287,7 @@ version of `malloc()', unexpected behavior would likely result. Two convenience macros may be used for allocating storage from `gawk_malloc()' and `gawk_realloc()'. If the allocation fails, they cause `gawk' to exit with a fatal error message. They should be used -as if they were procedure calls that do not return a value. +as if they were procedure calls that do not return a value: `#define emalloc(pointer, type, size, message) ...' The arguments to this macro are as follows: @@ -23214,13 +23317,13 @@ as if they were procedure calls that do not return a value. make_malloced_string(message, strlen(message), & result); `#define erealloc(pointer, type, size, message) ...' - This is like `emalloc()', but it calls `gawk_realloc()', instead - of `gawk_malloc()'. The arguments are the same as for the + This is like `emalloc()', but it calls `gawk_realloc()' instead of + `gawk_malloc()'. The arguments are the same as for the `emalloc()' macro. 
---------- Footnotes ---------- - (1) This is more common on MS-Windows systems, but can happen on + (1) This is more common on MS-Windows systems, but it can happen on Unix-like systems as well. @@ -23235,29 +23338,29 @@ This node presents them all as function prototypes, in the way that extension code would use them: `static inline awk_value_t *' -`make_const_string(const char *string, size_t length, awk_value_t *result)' +`make_const_string(const char *string, size_t length, awk_value_t *result);' This function creates a string value in the `awk_value_t' variable pointed to by `result'. It expects `string' to be a C string constant (or other string data), and automatically creates a _copy_ of the data for storage in `result'. It returns `result'. `static inline awk_value_t *' -`make_malloced_string(const char *string, size_t length, awk_value_t *result)' +`make_malloced_string(const char *string, size_t length, awk_value_t *result);' This function creates a string value in the `awk_value_t' variable pointed to by `result'. It expects `string' to be a `char *' value pointing to data previously obtained from `gawk_malloc()', - `gawk_calloc()' or `gawk_realloc()'. The idea here is that the + `gawk_calloc()', or `gawk_realloc()'. The idea here is that the data is passed directly to `gawk', which assumes responsibility for it. It returns `result'. `static inline awk_value_t *' -`make_null_string(awk_value_t *result)' +`make_null_string(awk_value_t *result);' This specialized function creates a null string (the "undefined" value) in the `awk_value_t' variable pointed to by `result'. It returns `result'. `static inline awk_value_t *' -`make_number(double num, awk_value_t *result)' +`make_number(double num, awk_value_t *result);' This function simply creates a numeric value in the `awk_value_t' variable pointed to by `result'. 
@@ -23296,7 +23399,7 @@ Extension functions are described by the following record: The fields are: `const char *name;' - The name of the new function. `awk' level code calls the function + The name of the new function. `awk'-level code calls the function by this name. This is a regular C string. Function names must obey the rules for `awk' identifiers. That is, @@ -23308,7 +23411,7 @@ Extension functions are described by the following record: This is a pointer to the C function that provides the extension's functionality. The function must fill in `*result' with either a number or a string. `gawk' takes ownership of any string memory. - As mentioned earlier, string memory *must* come from one of + As mentioned earlier, string memory _must_ come from one of `gawk_malloc()', `gawk_calloc()', or `gawk_realloc()'. The `num_actual_args' argument tells the C function how many @@ -23355,10 +23458,10 @@ function with `gawk' using the following function: `gawk' intends to pass to the `exit()' system call. `arg0' - A pointer to private data which `gawk' saves in order to pass + A pointer to private data that `gawk' saves in order to pass to the function pointed to by `funcp'. - Exit callback functions are called in last-in-first-out (LIFO) + Exit callback functions are called in last-in, first-out (LIFO) order--that is, in the reverse order in which they are registered with `gawk'. @@ -23368,8 +23471,8 @@ File: gawk.info, Node: Extension Version String, Next: Input Parsers, Prev: E 16.4.5.3 Registering An Extension Version String ................................................ -You can register a version string which indicates the name and version -of your extension, with `gawk', as follows: +You can register a version string that indicates the name and version +of your extension with `gawk', as follows: `void register_ext_version(const char *version);' Register the string pointed to by `version' with `gawk'. Note @@ -23392,7 +23495,7 @@ Files::). 
Additionally, it sets the value of `RT' (*note Built-in Variables::). If you want, you can provide your own custom input parser. An input -parser's job is to return a record to the `gawk' record processing +parser's job is to return a record to the `gawk' record-processing code, along with indicators for the value and length of the data to be used for `RT', if any. @@ -23409,10 +23512,10 @@ used for `RT', if any. `awk_bool_t XXX_take_control_of(awk_input_buf_t *iobuf);' When `gawk' decides to hand control of the file over to the input parser, it calls this function. This function in turn must fill - in certain fields in the `awk_input_buf_t' structure, and ensure + in certain fields in the `awk_input_buf_t' structure and ensure that certain conditions are true. It should then return true. If - an error of some kind occurs, it should not fill in any fields, - and should return false; then `gawk' will not use the input parser. + an error of some kind occurs, it should not fill in any fields and + should return false; then `gawk' will not use the input parser. The details are presented shortly. Your extension should package these functions inside an @@ -23489,10 +23592,10 @@ the `struct stat', or any combination of these factors. Once `XXX_can_take_file()' has returned true, and `gawk' has decided to use your input parser, it calls `XXX_take_control_of()'. That -function then fills one of either the `get_record' field or the -`read_func' field in the `awk_input_buf_t'. It must also ensure that -`fd' is _not_ set to `INVALID_HANDLE'. The following list describes -the fields that may be filled by `XXX_take_control_of()': +function then fills either the `get_record' field or the `read_func' +field in the `awk_input_buf_t'. It must also ensure that `fd' is _not_ +set to `INVALID_HANDLE'. 
The following list describes the fields that +may be filled by `XXX_take_control_of()': `void *opaque;' This is used to hold any state information needed by the input @@ -23509,22 +23612,22 @@ the fields that may be filled by `XXX_take_control_of()': Its behavior is described in the text following this list. `ssize_t (*read_func)();' - This function pointer should point to function that has the same + This function pointer should point to a function that has the same behavior as the standard POSIX `read()' system call. It is an alternative to the `get_record' pointer. Its behavior is also described in the text following this list. `void (*close_func)(struct awk_input *iobuf);' This function pointer should point to a function that does the - "tear down." It should release any resources allocated by + "teardown." It should release any resources allocated by `XXX_take_control_of()'. It may also close the file. If it does so, it should set the `fd' field to `INVALID_HANDLE'. If `fd' is still not `INVALID_HANDLE' after the call to this function, `gawk' calls the regular `close()' system call. - Having a "tear down" function is optional. If your input parser - does not need it, do not set this field. Then, `gawk' calls the + Having a "teardown" function is optional. If your input parser does + not need it, do not set this field. Then, `gawk' calls the regular `close()' system call on the file descriptor, so it should be valid. @@ -23532,7 +23635,7 @@ the fields that may be filled by `XXX_take_control_of()': records. The parameters are as follows: `char **out' - This is a pointer to a `char *' variable which is set to point to + This is a pointer to a `char *' variable that is set to point to the record. `gawk' makes its own copy of the data, so the extension must manage this storage. @@ -23581,16 +23684,16 @@ explicitly. NOTE: You must choose one method or the other: either a function that returns a record, or one that returns raw data. 
In particular, if you supply a function to get a record, `gawk' will - call it, and never call the raw read function. + call it, and will never call the raw read function. `gawk' ships with a sample extension that reads directories, -returning records for each entry in the directory (*note Extension -Sample Readdir::). You may wish to use that code as a guide for writing -your own input parser. +returning records for each entry in a directory (*note Extension Sample +Readdir::). You may wish to use that code as a guide for writing your +own input parser. When writing an input parser, you should think about (and document) how it is expected to interact with `awk' code. You may want it to -always be called, and take effect as appropriate (as the `readdir' +always be called, and to take effect as appropriate (as the `readdir' extension does). Or you may want it to take effect based upon the value of an `awk' variable, as the XML extension from the `gawkextlib' project does (*note gawkextlib::). In the latter case, code in a @@ -23690,17 +23793,17 @@ in the `awk_output_buf_t'. The data members are as follows: These pointers should be set to point to functions that perform the equivalent function as the `<stdio.h>' functions do, if appropriate. `gawk' uses these function pointers for all output. - `gawk' initializes the pointers to point to internal, "pass - through" functions that just call the regular `<stdio.h>' - functions, so an extension only needs to redefine those functions - that are appropriate for what it does. + `gawk' initializes the pointers to point to internal "pass-through" + functions that just call the regular `<stdio.h>' functions, so an + extension only needs to redefine those functions that are + appropriate for what it does. The `XXX_can_take_file()' function should make a decision based upon the `name' and `mode' fields, and any additional state (such as `awk' variable values) that is appropriate. 
When `gawk' calls `XXX_take_control_of()', that function should fill -in the other fields, as appropriate, except for `fp', which it should +in the other fields as appropriate, except for `fp', which it should just use normally. You register your output wrapper with the following function: @@ -23737,16 +23840,17 @@ structures as described earlier. The name of the two-way processor. `awk_bool_t (*can_take_two_way)(const char *name);' - This function returns true if it wants to take over two-way I/O - for this file name. It should not change any state (variable - values, etc.) within `gawk'. + The function pointed to by this field should return true if it + wants to take over two-way I/O for this file name. It should not + change any state (variable values, etc.) within `gawk'. `awk_bool_t (*take_control_of)(const char *name,' ` awk_input_buf_t *inbuf,' ` awk_output_buf_t *outbuf);' - This function should fill in the `awk_input_buf_t' and - `awk_outut_buf_t' structures pointed to by `inbuf' and `outbuf', - respectively. These structures were described earlier. + The function pointed to by this field should fill in the + `awk_input_buf_t' and `awk_outut_buf_t' structures pointed to by + `inbuf' and `outbuf', respectively. These structures were + described earlier. `awk_const struct two_way_processor *awk_const next;' This is for use by `gawk'; therefore it is marked `awk_const' so @@ -23770,7 +23874,7 @@ File: gawk.info, Node: Printing Messages, Next: Updating `ERRNO', Prev: Regis You can print different kinds of warning messages from your extension, as described here. Note that for these functions, you must pass in the -extension id received from `gawk' when the extension was loaded:(1) +extension ID received from `gawk' when the extension was loaded:(1) `void fatal(awk_ext_id_t id, const char *format, ...);' Print a message and then cause `gawk' to exit immediately. @@ -23826,7 +23930,7 @@ value you expect. 
If the actual value matches what you requested, the function returns true and fills in the `awk_value_t' result. Otherwise, the function returns false, and the `val_type' member indicates the type of the actual value. You may then print an error -message, or reissue the request for the actual value type, as +message or reissue the request for the actual value type, as appropriate. This behavior is summarized in *note table-value-types-returned::. @@ -23835,15 +23939,15 @@ table-value-types-returned::. String Number Array Undefined ------------------------------------------------------------------------------ - String String String false false - Number Number if can Number false false + String String String False False + Number Number if can Number False False be converted, else false -Type Array false false Array false -Requested Scalar Scalar Scalar false false +Type Array False False Array False +Requested Scalar Scalar Scalar False False Undefined String Number Array Undefined - Value false false false false - Cookie + Value False False False False + cookie Table 16.1: API value types returned @@ -23860,16 +23964,16 @@ your extension function. They are: ` awk_valtype_t wanted,' ` awk_value_t *result);' Fill in the `awk_value_t' structure pointed to by `result' with - the `count''th argument. Return true if the actual type matches - `wanted', false otherwise. In the latter case, `result->val_type' - indicates the actual type (*note Table 16.1: - table-value-types-returned.). Counts are zero based--the first + the `count'th argument. Return true if the actual type matches + `wanted', and false otherwise. In the latter case, + `result->val_type' indicates the actual type (*note Table 16.1: + table-value-types-returned.). Counts are zero-based--the first argument is numbered zero, the second one, and so on. `wanted' indicates the type of value expected. 
`awk_bool_t set_argument(size_t count, awk_array_t array);' Convert a parameter that was undefined into an array; this provides - call-by-reference for arrays. Return false if `count' is too big, + call by reference for arrays. Return false if `count' is too big, or if the argument's type is not undefined. *Note Array Manipulation::, for more information on creating arrays. @@ -23897,8 +24001,8 @@ File: gawk.info, Node: Symbol table by name, Next: Symbol table by cookie, Up The following routines provide the ability to access and update global `awk'-level variables by name. In compiler terminology, identifiers of different kinds are termed "symbols", thus the "sym" in the routines' -names. The data structure which stores information about symbols is -termed a "symbol table". +names. The data structure that stores information about symbols is +termed a "symbol table". The functions are as follows: `awk_bool_t sym_lookup(const char *name,' ` awk_valtype_t wanted,' @@ -23906,14 +24010,14 @@ termed a "symbol table". Fill in the `awk_value_t' structure pointed to by `result' with the value of the variable named by the string `name', which is a regular C string. `wanted' indicates the type of value expected. - Return true if the actual type matches `wanted', false otherwise. - In the latter case, `result->val_type' indicates the actual type - (*note Table 16.1: table-value-types-returned.). + Return true if the actual type matches `wanted', and false + otherwise. In the latter case, `result->val_type' indicates the + actual type (*note Table 16.1: table-value-types-returned.). `awk_bool_t sym_update(const char *name, awk_value_t *value);' Update the variable named by the string `name', which is a regular C string. The variable is added to `gawk''s symbol table if it is - not there. Return true if everything worked, false otherwise. + not there. Return true if everything worked, and false otherwise. 
Changing types (scalar to array or vice versa) of an existing variable is _not_ allowed, nor may this routine be used to update @@ -23938,7 +24042,7 @@ File: gawk.info, Node: Symbol table by cookie, Next: Cached values, Prev: Sym A "scalar cookie" is an opaque handle that provides access to a global variable or array. It is an optimization that avoids looking up variables in `gawk''s symbol table every time access is needed. This -was discussed earlier in *note General Data Types::. +was discussed earlier, in *note General Data Types::. The following functions let you work with scalar cookies: @@ -24049,7 +24153,7 @@ File: gawk.info, Node: Cached values, Prev: Symbol table by cookie, Up: Symbo .......................................... The routines in this section allow you to create and release cached -values. As with scalar cookies, in theory, cached values are not +values. Like scalar cookies, in theory, cached values are not necessary. You can create numbers and strings using the functions in *note Constructor Functions::. You can then assign those values to variables using `sym_update()' or `sym_update_scalar()', as you like. @@ -24120,7 +24224,7 @@ Using value cookies in this way saves considerable storage, as all of `VAR1' through `VAR100' share the same value. You might be wondering, "Is this sharing problematic? What happens -if `awk' code assigns a new value to `VAR1', are all the others changed +if `awk' code assigns a new value to `VAR1'; are all the others changed too?" That's a great question. The answer is that no, it's not a problem. @@ -24239,7 +24343,7 @@ File: gawk.info, Node: Array Functions, Next: Flattening Arrays, Prev: Array 16.4.11.2 Array Functions ......................... -The following functions relate to individual array elements. 
+The following functions relate to individual array elements: `awk_bool_t get_element_count(awk_array_t a_cookie, size_t *count);' For the array represented by `a_cookie', place in `*count' the @@ -24257,13 +24361,14 @@ The following functions relate to individual array elements. (*note Table 16.1: table-value-types-returned.). The value for `index' can be numeric, in which case `gawk' - converts it to a string. Using non-integral values is possible, but + converts it to a string. Using nonintegral values is possible, but requires that you understand how such values are converted to - strings (*note Conversion::); thus using integral values is safest. + strings (*note Conversion::); thus, using integral values is + safest. As with _all_ strings passed into `gawk' from an extension, the string value of `index' must come from `gawk_malloc()', - `gawk_calloc()' or `gawk_realloc()', and `gawk' releases the + `gawk_calloc()', or `gawk_realloc()', and `gawk' releases the storage. `awk_bool_t set_array_element(awk_array_t a_cookie,' @@ -24307,9 +24412,9 @@ The following functions relate to individual array elements. `awk_bool_t release_flattened_array(awk_array_t a_cookie,' ` awk_flat_array_t *data);' When done with a flattened array, release the storage using this - function. You must pass in both the original array cookie, and - the address of the created `awk_flat_array_t' structure. The - function returns true upon success, false otherwise. + function. You must pass in both the original array cookie and the + address of the created `awk_flat_array_t' structure. The function + returns true upon success, false otherwise. 
File: gawk.info, Node: Flattening Arrays, Next: Creating Arrays, Prev: Array Functions, Up: Array Manipulation @@ -24319,8 +24424,8 @@ File: gawk.info, Node: Flattening Arrays, Next: Creating Arrays, Prev: Array To "flatten" an array is to create a structure that represents the full array in a fashion that makes it easy for C code to traverse the entire -array. Test code in `extension/testext.c' does this, and also serves -as a nice example showing how to use the APIs. +array. Some of the code in `extension/testext.c' does this, and also +serves as a nice example showing how to use the APIs. We walk through that part of the code one step at a time. First, the `gawk' script that drives the test extension: @@ -24369,9 +24474,8 @@ number of arguments: } The function then proceeds in steps, as follows. First, retrieve the -name of the array, passed as the first argument. Then retrieve the -array itself. If either operation fails, print error messages and -return: +name of the array, passed as the first argument, followed by the array +itself. If either operation fails, print an error message and return: /* get argument named array as flat array and print it */ if (get_argument(0, AWK_STRING, & value)) { @@ -24401,9 +24505,9 @@ count of elements in the array and print it: printf("dump_array_and_delete: incoming size is %lu\n", (unsigned long) count); - The third step is to actually flatten the array, and then to double -check that the count in the `awk_flat_array_t' is the same as the count -just retrieved: + The third step is to actually flatten the array, and then to +double-check that the count in the `awk_flat_array_t' is the same as +the count just retrieved: if (! flatten_array(value2.array_cookie, & flat_array)) { printf("dump_array_and_delete: could not flatten array\n"); @@ -24420,7 +24524,7 @@ just retrieved: The fourth step is to retrieve the index of the element to be deleted, which was passed as the second argument. 
Remember that -argument counts passed to `get_argument()' are zero-based, thus the +argument counts passed to `get_argument()' are zero-based, and thus the second argument is numbered one: if (! get_argument(1, AWK_STRING, & value3)) { @@ -24433,7 +24537,7 @@ over every element in the array, printing the index and element values. In addition, upon finding the element with the index that is supposed to be deleted, the function sets the `AWK_ELEMENT_DELETE' bit in the `flags' field of the element. When the array is released, `gawk' -traverses the flattened array, and deletes any elements which have this +traverses the flattened array, and deletes any elements that have this flag bit set: for (i = 0; i < flat_array->count; i++) { @@ -24651,10 +24755,10 @@ The API provides both a "major" and a "minor" version number. The API versions are available at compile time as constants: `GAWK_API_MAJOR_VERSION' - The major version of the API. + The major version of the API `GAWK_API_MINOR_VERSION' - The minor version of the API. + The minor version of the API The minor version increases when new functions are added to the API. Such new functions are always added to the end of the API `struct'. @@ -24669,13 +24773,13 @@ For this reason, the major and minor API versions of the running `gawk' are included in the API `struct' as read-only constant integers: `api->major_version' - The major version of the running `gawk'. + The major version of the running `gawk' `api->minor_version' - The minor version of the running `gawk'. + The minor version of the running `gawk' It is up to the extension to decide if there are API -incompatibilities. Typically a check like this is enough: +incompatibilities. Typically, a check like this is enough: if (api->major_version != GAWK_API_MAJOR_VERSION || api->minor_version < GAWK_API_MINOR_VERSION) { @@ -24687,7 +24791,7 @@ incompatibilities. 
Typically a check like this is enough: } Such code is included in the boilerplate `dl_load_func()' macro -provided in `gawkapi.h' (discussed later, in *note Extension API +provided in `gawkapi.h' (discussed in *note Extension API Boilerplate::). @@ -24738,7 +24842,7 @@ functions) toward the top of your source file, using predefined names as described here. The boilerplate needed is also provided in comments in the `gawkapi.h' header file: - /* Boiler plate code: */ + /* Boilerplate code: */ int plugin_is_GPL_compatible; static gawk_api_t *const api; @@ -24788,7 +24892,7 @@ in the `gawkapi.h' header file: to point to a string giving the name and version of your extension. `static awk_ext_func_t func_table[] = { ... };' - This is an array of one or more `awk_ext_func_t' structures as + This is an array of one or more `awk_ext_func_t' structures, as described earlier (*note Extension Functions::). It can then be looped over for multiple calls to `add_ext_func()'. @@ -24901,7 +25005,7 @@ appropriate information: `stat()' fails. It fills in the following elements: `"name"' - The name of the file that was `stat()''ed. + The name of the file that was `stat()'ed. `"dev"' `"ino"' @@ -24949,7 +25053,7 @@ appropriate information: The file is a directory. `"fifo"' - The file is a named-pipe (also known as a FIFO). + The file is a named pipe (also known as a FIFO). `"file"' The file is just a regular file. @@ -24969,7 +25073,7 @@ appropriate information: systems, "a priori" knowledge is used to provide a value. Where no value can be determined, it defaults to 512. - Several additional elements may be present depending upon the + Several additional elements may be present, depending upon the operating system and the type of the file. 
You can test for them in your `awk' program by using the `in' operator (*note Reference to Elements::): @@ -24998,9 +25102,9 @@ File: gawk.info, Node: Internal File Ops, Next: Using Internal File Ops, Prev Here is the C code for these extensions.(1) The file includes a number of standard header files, and then -includes the `gawkapi.h' header file which provides the API definitions. -Those are followed by the necessary variable declarations to make use -of the API macros and boilerplate code (*note Extension API +includes the `gawkapi.h' header file, which provides the API +definitions. Those are followed by the necessary variable declarations +to make use of the API macros and boilerplate code (*note Extension API Boilerplate::): #ifdef HAVE_CONFIG_H @@ -25036,9 +25140,9 @@ Boilerplate::): By convention, for an `awk' function `foo()', the C function that implements it is called `do_foo()'. The function should have two -arguments: the first is an `int' usually called `nargs', that +arguments. The first is an `int', usually called `nargs', that represents the number of actual arguments for the function. The second -is a pointer to an `awk_value_t', usually named `result': +is a pointer to an `awk_value_t' structure, usually named `result': /* do_chdir --- provide dynamically loaded chdir() function for gawk */ @@ -25074,8 +25178,8 @@ is numbered zero. } The `stat()' extension is more involved. First comes a function -that turns a numeric mode into a printable representation (e.g., 644 -becomes `-rw-r--r--'). This is omitted here for brevity: +that turns a numeric mode into a printable representation (e.g., octal +`0644' becomes `-rw-r--r--'). This is omitted here for brevity: /* format_mode --- turn a stat mode field into something readable */ @@ -25125,8 +25229,8 @@ contain the result of the `stat()': The following function does most of the work to fill in the `awk_array_t' result array with values obtained from a valid `struct -stat'. 
It is done in a separate function to support the `stat()' -function for `gawk' and also to support the `fts()' extension which is +stat'. This work is done in a separate function to support the `stat()' +function for `gawk' and also to support the `fts()' extension, which is included in the same file but whose code is not shown here (*note Extension Sample File Functions::). @@ -25238,8 +25342,8 @@ argument is optional. If present, it causes `do_stat()' to use the `stat()' system call instead of the `lstat()' system call. This is done by using a function pointer: `statfunc'. `statfunc' is initialized to point to `lstat()' (instead of `stat()') to get the file -information, in case the file is a symbolic link. However, if there -were three arguments, `statfunc' is set point to `stat()', instead. +information, in case the file is a symbolic link. However, if the third +argument is included, `statfunc' is set to point to `stat()', instead. Here is the `do_stat()' function, which starts with variable declarations and argument checking: @@ -25288,7 +25392,7 @@ returns: /* always empty out the array */ clear_array(array); - /* stat the file, if error, set ERRNO and return */ + /* stat the file; if error, set ERRNO and return */ ret = statfunc(name, & sbuf); if (ret < 0) { update_ERRNO_int(errno); @@ -25307,7 +25411,8 @@ When done, the function returns the result from `fill_stat_array()': function(s) into `gawk'. The `filefuncs' extension also provides an `fts()' function, which -we omit here. For its sake there is an initialization function: +we omit here (*note Extension Sample File Functions::). 
For its sake, +there is an initialization function: /* init_filefuncs --- initialization routine */ @@ -25431,9 +25536,9 @@ File: gawk.info, Node: Extension Samples, Next: gawkextlib, Prev: Extension E 16.7 The Sample Extensions in the `gawk' Distribution ===================================================== -This minor node provides brief overviews of the sample extensions that +This minor node provides a brief overview of the sample extensions that come in the `gawk' distribution. Some of them are intended for -production use (e.g., the `filefuncs', `readdir' and `inplace' +production use (e.g., the `filefuncs', `readdir', and `inplace' extensions). Others mainly provide example code that shows how to use the extension API. @@ -25470,13 +25575,13 @@ follows. The usage is: `result = chdir("/some/directory")' The `chdir()' function is a direct hook to the `chdir()' system call to change the current directory. It returns zero upon - success or less than zero upon error. In the latter case, it - updates `ERRNO'. + success or a value less than zero upon error. In the latter case, + it updates `ERRNO'. `result = stat("/some/path", statdata' [`, follow']`)' The `stat()' function provides a hook into the `stat()' system - call. It returns zero upon success or less than zero upon error. - In the latter case, it updates `ERRNO'. + call. It returns zero upon success or a value less than zero upon + error. In the latter case, it updates `ERRNO'. By default, it uses the `lstat()' system call. However, if passed a third argument, it uses `stat()' instead. @@ -25503,23 +25608,23 @@ follows. The usage is: `"minor"' `st_minor' Device files `"blksize"'`st_blksize' All `"pmode"' A human-readable version of the All - mode value, such as printed by - `ls'. For example, - `"-rwxr-xr-x"' + mode value, like that printed by + `ls' (for example, + `"-rwxr-xr-x"') `"linkval"'The value of the symbolic link Symbolic links - `"type"' The type of the file as a string. 
All - One of `"file"', `"blockdev"', - `"chardev"', `"directory"', - `"socket"', `"fifo"', `"symlink"', - `"door"', or `"unknown"'. Not - all systems support all file - types. + `"type"' The type of the file as a All + string--one of `"file"', + `"blockdev"', `"chardev"', + `"directory"', `"socket"', + `"fifo"', `"symlink"', `"door"', + or `"unknown"' (not all systems + support all file types) `flags = or(FTS_PHYSICAL, ...)' `result = fts(pathlist, flags, filedata)' Walk the file trees provided in `pathlist' and fill in the - `filedata' array as described next. `flags' is the bitwise OR of + `filedata' array, as described next. `flags' is the bitwise OR of several predefined values, also described in a moment. Return zero if there were no errors, otherwise return -1. @@ -25572,10 +25677,11 @@ requested hierarchies. filesystem. `filedata' - The `filedata' array is first cleared. Then, `fts()' creates an - element in `filedata' for every element in `pathlist'. The index - is the name of the directory or file given in `pathlist'. The - element for this index is itself an array. There are two cases: + The `filedata' array holds the results. `fts()' first clears it. + Then it creates an element in `filedata' for every element in + `pathlist'. The index is the name of the directory or file given + in `pathlist'. The element for this index is itself an array. + There are two cases: _The path is a file_ In this case, the array contains two or three elements: @@ -25611,7 +25717,7 @@ requested hierarchies. elements as for a file: `"path"', `"stat"', and `"error"'. The `fts()' function returns zero if there were no errors. -Otherwise it returns -1. +Otherwise, it returns -1. NOTE: The `fts()' extension does not exactly mimic the interface of the C library `fts()' routines, choosing instead to provide an @@ -25650,14 +25756,14 @@ adds one constant (`FNM_NOMATCH'), and an array of flag values named The arguments to `fnmatch()' are: `pattern' - The file name wildcard to match. 
+ The file name wildcard to match `string' - The file name string. + The file name string `flag' Either zero, or the bitwise OR of one or more of the flags in the - `FNM' array. + `FNM' array The flags are as follows: @@ -25691,13 +25797,13 @@ The `fork' extension adds three functions, as follows: `pid = fork()' This function creates a new process. The return value is zero in - the child and the process-ID number of the child in the parent, or + the child and the process ID number of the child in the parent, or -1 upon error. In the latter case, `ERRNO' indicates the problem. In the child, `PROCINFO["pid"]' and `PROCINFO["ppid"]' are updated to reflect the correct values. `ret = waitpid(pid)' - This function takes a numeric argument, which is the process-ID to + This function takes a numeric argument, which is the process ID to wait for. The return value is that of the `waitpid()' system call. `ret = wait()' @@ -25721,8 +25827,8 @@ File: gawk.info, Node: Extension Sample Inplace, Next: Extension Sample Ord, 16.7.4 Enabling In-Place File Editing ------------------------------------- -The `inplace' extension emulates GNU `sed''s `-i' option which performs -"in place" editing of each input file. It uses the bundled +The `inplace' extension emulates GNU `sed''s `-i' option, which +performs "in-place" editing of each input file. It uses the bundled `inplace.awk' include file to invoke the extension properly: # inplace --- load and invoke the inplace extension. @@ -25805,11 +25911,11 @@ returned as a record. The record consists of three fields. The first two are the inode number and the file name, separated by a forward slash character. On systems where the directory entry contains the file type, the record -has a third field (also separated by a slash) which is a single letter +has a third field (also separated by a slash), which is a single letter indicating the type of the file. 
The letters and their corresponding file types are shown in *note table-readdir-file-types::. -Letter File Type +Letter File type -------------------------------------------------------------------------- `b' Block device `c' Character device @@ -25856,7 +25962,7 @@ unwary. Here is an example: print "don't panic" > "/dev/stdout" } - The output from this program is: `cinap t'nod'. + The output from this program is `cinap t'nod'. File: gawk.info, Node: Extension Sample Rev2way, Next: Extension Sample Read write array, Prev: Extension Sample Revout, Up: Extension Samples @@ -25904,7 +26010,7 @@ The `rwarray' extension adds two functions, named `writea()' and `reada()' is the inverse of `writea()'; it reads the file named as its first argument, filling in the array named as the second argument. It clears the array first. Here too, the return value - is one on success and zero upon failure. + is one on success, or zero upon failure. The array created by `reada()' is identical to that written by `writea()' in the sense that the contents are the same. However, due to @@ -25988,7 +26094,7 @@ The `time' extension adds two functions, named `gettimeofday()' and Attempt to sleep for SECONDS seconds. If SECONDS is negative, or the attempt to sleep fails, return -1 and set `ERRNO'. Otherwise, return zero after sleeping for the indicated amount of time. Note - that SECONDS may be a floating-point (non-integral) value. + that SECONDS may be a floating-point (nonintegral) value. Implementation details: depending on platform availability, this function tries to use `nanosleep()' or `select()' to implement the delay. @@ -26016,7 +26122,9 @@ provides a number of `gawk' extensions, including one for processing XML files. This is the evolution of the original `xgawk' (XML `gawk') project. - As of this writing, there are six extensions: + As of this writing, there are seven extensions: + + * `errno' extension * GD graphics library extension @@ -26025,7 +26133,7 @@ project. 
* PostgreSQL extension * MPFR library extension (this provides access to a number of MPFR - functions which `gawk''s native MPFR support does not) + functions that `gawk''s native MPFR support does not) * Redis extension @@ -26066,7 +26174,7 @@ follows. First, build and install `gawk': If you have installed `gawk' in the standard way, then you will likely not need the `--with-gawk' option when configuring `gawkextlib'. -You may also need to use the `sudo' utility to install both `gawk' and +You may need to use the `sudo' utility to install both `gawk' and `gawkextlib', depending upon how your system works. If you write an extension that you wish to share with other `gawk' @@ -26088,7 +26196,7 @@ File: gawk.info, Node: Extension summary, Next: Extension Exercises, Prev: ga a variable named `plugin_is_GPL_compatible'. * Communication between `gawk' and an extension is two-way. `gawk' - passes a `struct' to the extension which contains various data + passes a `struct' to the extension that contains various data fields and function pointers. The extension can then call into `gawk' via the supplied function pointers to accomplish certain tasks. @@ -26099,7 +26207,7 @@ File: gawk.info, Node: Extension summary, Next: Extension Exercises, Prev: ga convention, implementation functions are named `do_XXXX()' for some `awk'-level function `XXXX()'. - * The API is defined in a header file named `gawkpi.h'. You must + * The API is defined in a header file named `gawkapi.h'. You must include a number of standard header files _before_ including it in your source file. 
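The API version check quoted earlier in this diff (major versions must match exactly; the running `gawk''s minor version must not be older than the one the extension was compiled against) can be tried outside of `gawk' with a small stand-alone sketch. Everything named `sketch_...' or `SKETCH_...' below is a hypothetical stand-in invented for this illustration: the real constants and the real API `struct' come from `gawkapi.h', and the version values used here are made up.

```c
#include <stdbool.h>

/*
 * Stand-ins for the compile-time constants that `gawkapi.h' provides.
 * The values are assumptions chosen only for this example.
 */
#define SKETCH_API_MAJOR_VERSION 1
#define SKETCH_API_MINOR_VERSION 1

/*
 * Hypothetical, pared-down stand-in for the API struct: it carries only
 * the two read-only version fields described in the text.
 */
struct sketch_api {
    int major_version;  /* major API version of the running gawk */
    int minor_version;  /* minor API version of the running gawk */
};

/*
 * sketch_api_compatible --- apply the same test as the check in the text.
 * A different major version, or a running minor version that is lower
 * than the compiled-in one, counts as incompatible.
 */
static bool
sketch_api_compatible(const struct sketch_api *api)
{
    if (api->major_version != SKETCH_API_MAJOR_VERSION
        || api->minor_version < SKETCH_API_MINOR_VERSION)
        return false;

    return true;
}
```

A newer minor version in the running `gawk' passes this test, which matches the statement that new functions are only ever added to the end of the API `struct'.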
@@ -26129,16 +26237,16 @@ File: gawk.info, Node: Extension summary, Next: Extension Exercises, Prev: ga * Manipulating arrays (retrieving, adding, deleting, and modifying elements; getting the count of elements in an array; creating a new array; clearing an array; and flattening an - array for easy C style looping over all its indices and + array for easy C-style looping over all its indices and elements) * The API defines a number of standard data types for representing `awk' values, array elements, and arrays. - * The API provide convenience functions for constructing values. It - also provides memory management functions to ensure compatibility - between memory allocated by `gawk' and memory allocated by an - extension. + * The API provides convenience functions for constructing values. + It also provides memory management functions to ensure + compatibility between memory allocated by `gawk' and memory + allocated by an extension. * _All_ memory passed from `gawk' to an extension must be treated as read-only by the extension. @@ -26156,8 +26264,8 @@ File: gawk.info, Node: Extension summary, Next: Extension Exercises, Prev: ga header file make this easier to do. * The `gawk' distribution includes a number of small but useful - sample extensions. The `gawkextlib' project includes several more, - larger, extensions. If you wish to write an extension and + sample extensions. The `gawkextlib' project includes several more + (larger) extensions. If you wish to write an extension and contribute it to the community of `gawk' users, the `gawkextlib' project is the place to do so. @@ -26227,56 +26335,56 @@ available in System V Release 3.1 (1987). This minor node summarizes the changes, with cross-references to further details: * The requirement for `;' to separate rules on a line (*note - Statements/Lines::). + Statements/Lines::) * User-defined functions and the `return' statement (*note - User-defined::). + User-defined::) * The `delete' statement (*note Delete::). 
- * The `do'-`while' statement (*note Do Statement::). + * The `do'-`while' statement (*note Do Statement::) * The built-in functions `atan2()', `cos()', `sin()', `rand()', and - `srand()' (*note Numeric Functions::). + `srand()' (*note Numeric Functions::) * The built-in functions `gsub()', `sub()', and `match()' (*note - String Functions::). + String Functions::) * The built-in functions `close()' and `system()' (*note I/O - Functions::). + Functions::) * The `ARGC', `ARGV', `FNR', `RLENGTH', `RSTART', and `SUBSEP' - predefined variables (*note Built-in Variables::). + predefined variables (*note Built-in Variables::) - * Assignable `$0' (*note Changing Fields::). + * Assignable `$0' (*note Changing Fields::) * The conditional expression using the ternary operator `?:' (*note - Conditional Exp::). + Conditional Exp::) - * The expression `INDEX-VARIABLE in ARRAY' outside of `for' - statements (*note Reference to Elements::). + * The expression `INDX in ARRAY' outside of `for' statements (*note + Reference to Elements::) * The exponentiation operator `^' (*note Arithmetic Ops::) and its - assignment operator form `^=' (*note Assignment Ops::). + assignment operator form `^=' (*note Assignment Ops::) * C-compatible operator precedence, which breaks some old `awk' - programs (*note Precedence::). + programs (*note Precedence::) * Regexps as the value of `FS' (*note Field Separators::) and as the third argument to the `split()' function (*note String - Functions::), rather than using only the first character of `FS'. + Functions::), rather than using only the first character of `FS' * Dynamic regexps as operands of the `~' and `!~' operators (*note - Computed Regexps::). + Computed Regexps::) * The escape sequences `\b', `\f', and `\r' (*note Escape - Sequences::). + Sequences::) - * Redirection of input for the `getline' function (*note Getline::). 
+ * Redirection of input for the `getline' function (*note Getline::) - * Multiple `BEGIN' and `END' rules (*note BEGIN/END::). + * Multiple `BEGIN' and `END' rules (*note BEGIN/END::) - * Multidimensional arrays (*note Multidimensional::). + * Multidimensional arrays (*note Multidimensional::) File: gawk.info, Node: SVR4, Next: POSIX, Prev: V7/SVR3.1, Up: Language History @@ -26287,37 +26395,37 @@ A.2 Changes Between SVR3.1 and SVR4 The System V Release 4 (1989) version of Unix `awk' added these features (some of which originated in `gawk'): - * The `ENVIRON' array (*note Built-in Variables::). + * The `ENVIRON' array (*note Built-in Variables::) - * Multiple `-f' options on the command line (*note Options::). + * Multiple `-f' options on the command line (*note Options::) * The `-v' option for assigning variables before program execution - begins (*note Options::). + begins (*note Options::) - * The `--' signal for terminating command-line options. + * The `--' signal for terminating command-line options * The `\a', `\v', and `\x' escape sequences (*note Escape - Sequences::). + Sequences::) * A defined return value for the `srand()' built-in function (*note - Numeric Functions::). + Numeric Functions::) * The `toupper()' and `tolower()' built-in string functions for case - translation (*note String Functions::). + translation (*note String Functions::) * A cleaner specification for the `%c' format-control letter in the - `printf' function (*note Control Letters::). + `printf' function (*note Control Letters::) * The ability to dynamically pass the field width and precision (`"%*.*d"') in the argument list of `printf' and `sprintf()' - (*note Control Letters::). + (*note Control Letters::) * The use of regexp constants, such as `/foo/', as expressions, where they are equivalent to using the matching operator, as in `$0 ~ - /foo/' (*note Using Constant Regexps::). 
+ /foo/' (*note Using Constant Regexps::) * Processing of escape sequences inside command-line variable - assignments (*note Assignment Options::). + assignments (*note Assignment Options::) File: gawk.info, Node: POSIX, Next: BTL, Prev: SVR4, Up: Language History @@ -26329,30 +26437,30 @@ The POSIX Command Language and Utilities standard for `awk' (1992) introduced the following changes into the language: * The use of `-W' for implementation-specific options (*note - Options::). + Options::) * The use of `CONVFMT' for controlling the conversion of numbers to - strings (*note Conversion::). + strings (*note Conversion::) * The concept of a numeric string and tighter comparison rules to go - with it (*note Typing and Comparison::). + with it (*note Typing and Comparison::) * The use of predefined variables as function parameter names is - forbidden (*note Definition Syntax::). + forbidden (*note Definition Syntax::) * More complete documentation of many of the previously undocumented - features of the language. + features of the language In 2012, a number of extensions that had been commonly available for many years were finally added to POSIX. They are: * The `fflush()' built-in function for flushing buffered output - (*note I/O Functions::). + (*note I/O Functions::) - * The `nextfile' statement (*note Nextfile Statement::). + * The `nextfile' statement (*note Nextfile Statement::) * The ability to delete all of an array at once with `delete ARRAY' - (*note Delete::). + (*note Delete::) *Note Common Extensions::, for a list of common extensions not @@ -26374,13 +26482,13 @@ Other Versions::). in his version of `awk': * The `**' and `**=' operators (*note Arithmetic Ops:: and *note - Assignment Ops::). + Assignment Ops::) * The use of `func' as an abbreviation for `function' (*note - Definition Syntax::). + Definition Syntax::) * The `fflush()' built-in function for flushing buffered output - (*note I/O Functions::). 
+ (*note I/O Functions::) *Note Common Extensions::, for a full list of the extensions @@ -26402,104 +26510,108 @@ the current version of `gawk'. * Additional predefined variables: - - The `ARGIND' `BINMODE', `ERRNO', `FIELDWIDTHS', `FPAT', + - The `ARGIND', `BINMODE', `ERRNO', `FIELDWIDTHS', `FPAT', `IGNORECASE', `LINT', `PROCINFO', `RT', and `TEXTDOMAIN' - variables (*note Built-in Variables::). + variables (*note Built-in Variables::) * Special files in I/O redirections: - - The `/dev/stdin', `/dev/stdout', `/dev/stderr' and - `/dev/fd/N' special file names (*note Special Files::). + - The `/dev/stdin', `/dev/stdout', `/dev/stderr', and + `/dev/fd/N' special file names (*note Special Files::) - The `/inet', `/inet4', and `/inet6' special files for TCP/IP networking using `|&' to specify which version of the IP - protocol to use (*note TCP/IP Networking::). + protocol to use (*note TCP/IP Networking::) * Changes and/or additions to the language: - - The `\x' escape sequence (*note Escape Sequences::). + - The `\x' escape sequence (*note Escape Sequences::) - - Full support for both POSIX and GNU regexps (*note Regexp::). + - Full support for both POSIX and GNU regexps (*note Regexp::) - The ability for `FS' and for the third argument to `split()' - to be null strings (*note Single Character Fields::). + to be null strings (*note Single Character Fields::) - - The ability for `RS' to be a regexp (*note Records::). + - The ability for `RS' to be a regexp (*note Records::) - The ability to use octal and hexadecimal constants in `awk' - program source code (*note Nondecimal-numbers::). + program source code (*note Nondecimal-numbers::) - The `|&' operator for two-way I/O to a coprocess (*note - Two-way I/O::). + Two-way I/O::) - - Indirect function calls (*note Indirect Calls::). + - Indirect function calls (*note Indirect Calls::) - Directories on the command line produce a warning and are - skipped (*note Command-line directories::). 
+ skipped (*note Command-line directories::) + + - Output with `print' and `printf' need not be fatal (*note + Nonfatal::) * New keywords: - The `BEGINFILE' and `ENDFILE' special patterns (*note - BEGINFILE/ENDFILE::). + BEGINFILE/ENDFILE::) - - The `switch' statement (*note Switch Statement::). + - The `switch' statement (*note Switch Statement::) * Changes to standard `awk' functions: - The optional second argument to `close()' that allows closing - one end of a two-way pipe to a coprocess (*note Two-way - I/O::). + one end of a two-way pipe to a coprocess (*note Two-way I/O::) - - POSIX compliance for `gsub()' and `sub()' with `--posix'. + - POSIX compliance for `gsub()' and `sub()' with `--posix' - The `length()' function accepts an array argument and returns - the number of elements in the array (*note String - Functions::). + the number of elements in the array (*note String Functions::) - The optional third argument to the `match()' function for capturing text-matching subexpressions within a regexp (*note - String Functions::). + String Functions::) - Positional specifiers in `printf' formats for making - translations easier (*note Printf Ordering::). + translations easier (*note Printf Ordering::) - - The `split()' function's additional optional fourth argument + - The `split()' function's additional optional fourth argument, which is an array to hold the text of the field separators - (*note String Functions::). + (*note String Functions::) * Additional functions only in `gawk': - The `gensub()', `patsplit()', and `strtonum()' functions for - more powerful text manipulation (*note String Functions::). + more powerful text manipulation (*note String Functions::) - The `asort()' and `asorti()' functions for sorting arrays - (*note Array Sorting::). + (*note Array Sorting::) - The `mktime()', `systime()', and `strftime()' functions for - working with timestamps (*note Time Functions::). 
+ working with timestamps (*note Time Functions::) - The `and()', `compl()', `lshift()', `or()', `rshift()', and `xor()' functions for bit manipulation (*note Bitwise - Functions::). + Functions::) - The `isarray()' function to check if a variable is an array - or not (*note Type Functions::). + or not (*note Type Functions::) - - The `bindtextdomain()', `dcgettext()' and `dcngettext()' - functions for internationalization (*note Programmer i18n::). + - The `bindtextdomain()', `dcgettext()', and `dcngettext()' + functions for internationalization (*note Programmer i18n::) + + - The `div()' function for doing integer division and remainder + (*note Numeric Functions::) * Changes and/or additions in the command-line options: - The `AWKPATH' environment variable for specifying a path - search for the `-f' command-line option (*note Options::). + search for the `-f' command-line option (*note Options::) - The `AWKLIBPATH' environment variable for specifying a path - search for the `-l' command-line option (*note Options::). + search for the `-l' command-line option (*note Options::) - The `-b', `-c', `-C', `-d', `-D', `-e', `-E', `-g', `-h', `-i', `-l', `-L', `-M', `-n', `-N', `-o', `-O', `-p', `-P', `-r', `-S', `-t', and `-V' short options. Also, the ability - to use GNU-style long-named options that start with `--' and + to use GNU-style long-named options that start with `--', and the `--assign', `--bignum', `--characters-as-bytes', `--copyright', `--debug', `--dump-variables', `--exec', `--field-separator', `--file', `--gen-pot', `--help', @@ -26537,12 +26649,15 @@ the current version of `gawk'. - GCC for VAX and Alpha has not been tested for a while. - * Support for the following obsolete systems was removed from the - code for `gawk' version 4.1: + * Support for the following obsolete system was removed from the code + for `gawk' version 4.1: - Ultrix - * Support for MirBSD was removed at `gawk' version 4.2. 
+ * Support for the following systems was removed from the code for + `gawk' version 4.2: + + - MirBSD @@ -26929,6 +27044,28 @@ in POSIX `awk', in the order they were added to `gawk'. * The dynamic extension interface was completely redone (*note Dynamic Extensions::). + * Support for Ultrix was removed. + + + Version 4.2 introduced the following changes: + + * Changes to `ENVIRON' are reflected into `gawk''s environment and + that of programs that it runs. *Note Auto-set::. + + * The `--pretty-print' option no longer runs the `awk' program too. + *Note Options::. + + * The `igawk' program and its manual page are no longer installed + when `gawk' is built. *Note Igawk Program::. + + * The `div()' function. *Note Numeric Functions::. + + * The maximum number of hexadecimal digits in `\x' escapes is now two. + *Note Escape Sequences::. + + * Nonfatal output with `print' and `printf'. *Note Nonfatal::. + + * Support for MirBSD was removed. File: gawk.info, Node: Common Extensions, Next: Ranges and Locales, Prev: Feature History, Up: Language History @@ -26940,22 +27077,22 @@ The following table summarizes the common extensions supported by `gawk', Brian Kernighan's `awk', and `mawk', the three most widely used freely available versions of `awk' (*note Other Versions::). 
-Feature                      BWK Awk   Mawk   GNU Awk   Now standard
------------------------------------------------------------------------
-`\x' Escape sequence         X         X      X
-`FS' as null string          X         X      X
-`/dev/stdin' special file    X         X      X
-`/dev/stdout' special file   X         X      X
-`/dev/stderr' special file   X         X      X
-`delete' without subscript   X         X      X         X
-`fflush()' function          X         X      X         X
-`length()' of an array       X         X      X
-`nextfile' statement         X         X      X         X
-`**' and `**=' operators     X                X
-`func' keyword               X                X
-`BINMODE' variable                     X      X
-`RS' as regexp                         X      X
-Time-related functions                 X      X
+Feature                      BWK `awk'   `mawk'   `gawk'   Now standard
+--------------------------------------------------------------------------
+`\x' escape sequence         X           X        X
+`FS' as null string          X           X        X
+`/dev/stdin' special file    X           X        X
+`/dev/stdout' special file   X           X        X
+`/dev/stderr' special file   X           X        X
+`delete' without subscript   X           X        X        X
+`fflush()' function          X           X        X        X
+`length()' of an array       X           X        X
+`nextfile' statement         X           X        X        X
+`**' and `**=' operators     X                    X
+`func' keyword               X                    X
+`BINMODE' variable                       X        X
+`RS' as regexp                           X        X
+Time-related functions                   X        X
File: gawk.info, Node: Ranges and Locales, Next: Contributors, Prev: Common Extensions, Up: Language History @@ -26975,7 +27112,7 @@ in the machine's native character set. Thus, on ASCII-based systems, `[a-z]' matched all the lowercase letters, and only the lowercase letters, as the numeric values for the letters from `a' through `z' were contiguous. (On an EBCDIC system, the range `[a-z]' includes -additional, non-alphabetic characters as well.) +additional nonalphabetic characters as well.) Almost all introductory Unix literature explained range expressions as working in this fashion, and in particular, would teach that the @@ -26999,7 +27136,7 @@ outside those locales, the ordering was defined to be based on What does that mean? In many locales, `A' and `a' are both less than `B'. 
In other words, these locales sort characters in dictionary order, and `[a-dx-z]' is typically not equivalent to `[abcdxyz]'; -instead it might be equivalent to `[ABCXYabcdxyz]', for example. +instead, it might be equivalent to `[ABCXYabcdxyz]', for example. This point needs to be emphasized: much literature teaches that you should use `[a-z]' to match a lowercase character. But on systems with @@ -27023,17 +27160,17 @@ is perfectly valid in ASCII, but is not valid in many Unicode locales, such as `en_US.UTF-8'. Early versions of `gawk' used regexp matching code that was not -locale aware, so ranges had their traditional interpretation. +locale-aware, so ranges had their traditional interpretation. When `gawk' switched to using locale-aware regexp matchers, the problems began; especially as both GNU/Linux and commercial Unix vendors started implementing non-ASCII locales, _and making them the default_. Perhaps the most frequently asked question became something -like "why does `[A-Z]' match lowercase letters?!?" +like, "Why does `[A-Z]' match lowercase letters?!?" This situation existed for close to 10 years, if not more, and the `gawk' maintainer grew weary of trying to explain that `gawk' was being -nicely standards compliant, and that the issue was in the user's +nicely standards-compliant, and that the issue was in the user's locale. During the development of version 4.0, he modified `gawk' to always treat ranges in the original, pre-POSIX fashion, unless `--posix' was used (*note Options::).(2) @@ -27045,18 +27182,18 @@ of range expressions was _undefined_.(3) By using this lovely technical term, the standard gives license to implementors to implement ranges in whatever way they choose. The -`gawk' maintainer chose to apply the pre-POSIX meaning in all cases: -the default regexp matching; with `--traditional' and with `--posix'; -in all cases, `gawk' remains POSIX compliant. 
+`gawk' maintainer chose to apply the pre-POSIX meaning both with the +default regexp matching and when `--traditional' or `--posix' are used. +In all cases `gawk' remains POSIX-compliant. ---------- Footnotes ---------- (1) And Life was good. (2) And thus was born the Campaign for Rational Range Interpretation -(or RRI). A number of GNU tools have either implemented this change, or -will soon. Thanks to Karl Berry for coining the phrase "Rational Range -Interpretation." +(or RRI). A number of GNU tools have already implemented this change, +or will soon. Thanks to Karl Berry for coining the phrase "Rational +Range Interpretation." (3) See the standard (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05) @@ -27088,7 +27225,7 @@ Info file, in approximate chronological order: * Richard Stallman helped finish the implementation and the initial draft of this Info file. He is also the founder of the FSF and - the GNU project. + the GNU Project. * John Woods contributed parts of the code (mostly fixes) in the initial version of `gawk'. @@ -27174,22 +27311,22 @@ Info file, in approximate chronological order: * John Haque made the following contributions: - The modifications to convert `gawk' into a byte-code - interpreter, including the debugger. + interpreter, including the debugger - - The addition of true arrays of arrays. + - The addition of true arrays of arrays - The additional modifications for support of - arbitrary-precision arithmetic. + arbitrary-precision arithmetic - - The initial text of *note Arbitrary Precision Arithmetic::. + - The initial text of *note Arbitrary Precision Arithmetic:: - The work to merge the three versions of `gawk' into one, for - the 4.1 release. + the 4.1 release - - Improved array internals for arrays indexed by integers. + - Improved array internals for arrays indexed by integers - - The improved array sorting features were driven by John - together with Pat Rankin. 
+ - The improved array sorting features were also driven by John, + together with Pat Rankin * Panos Papadopoulos contributed the original text for *note Include Files::. @@ -27218,11 +27355,11 @@ A.10 Summary ============ * The `awk' language has evolved over time. The first release was - with V7 Unix circa 1978. In 1987, for System V Release 3.1, major - additions, including user-defined functions, were made to the - language. Additional changes were made for System V Release 4, in - 1989. Since then, further minor changes happen under the auspices - of the POSIX standard. + with V7 Unix, circa 1978. In 1987, for System V Release 3.1, + major additions, including user-defined functions, were made to + the language. Additional changes were made for System V Release + 4, in 1989. Since then, further minor changes have happened under + the auspices of the POSIX standard. * Brian Kernighan's `awk' provides a small number of extensions that are implemented in common with other versions of `awk'. @@ -27235,7 +27372,7 @@ A.10 Summary been confusing over the years. Today, `gawk' implements Rational Range Interpretation, where ranges of the form `[a-z]' match _only_ the characters numerically between `a' through `z' in the - machine's native character set. Usually this is ASCII but it can + machine's native character set. Usually this is ASCII, but it can be EBCDIC on IBM S/390 systems. * Many people have contributed to `gawk' development over the years. @@ -27313,7 +27450,7 @@ B.1.2 Extracting the Distribution `gawk' is distributed as several `tar' files compressed with different compression programs: `gzip', `bzip2', and `xz'. For simplicity, the rest of these instructions assume you are using the one compressed with -the GNU Zip program, `gzip'. +the GNU Gzip program (`gzip'). Once you have the distribution (e.g., `gawk-4.1.2.tar.gz'), use `gzip' to expand the file and then use `tar' to extract it. 
You can @@ -27353,10 +27490,10 @@ files, subdirectories, and files related to the configuration process to different non-Unix operating systems: Various `.c', `.y', and `.h' files - The actual `gawk' source code. + These files contain the actual `gawk' source code. `ABOUT-NLS' - Information about GNU `gettext' and translations. + A file containing information about GNU `gettext' and translations. `AUTHORS' A file with some information about the authorship of `gawk'. It @@ -27388,7 +27525,7 @@ Various `.c', `.y', and `.h' files The GNU General Public License. `POSIX.STD' - A description of behaviors in the POSIX standard for `awk' which + A description of behaviors in the POSIX standard for `awk' that are left undefined, or where `gawk' may not comply fully, as well as a list of things that the POSIX standard should describe but does not. @@ -27652,14 +27789,16 @@ command line when compiling `gawk' from scratch, including: do nothing. Similarly, setting the `LINT' variable (*note User-modified::) has no effect on the running `awk' program. - When used with GCC's automatic dead-code-elimination, this option - cuts almost 23K bytes off the size of the `gawk' executable on - GNU/Linux x86_64 systems. Results on other systems and with other - compilers are likely to vary. Using this option may bring you - some slight performance improvement. + When used with the GNU Compiler Collection's (GCC's) automatic + dead-code-elimination, this option cuts almost 23K bytes off the + size of the `gawk' executable on GNU/Linux x86_64 systems. + Results on other systems and with other compilers are likely to + vary. Using this option may bring you some slight performance + improvement. - Using this option will cause some of the tests in the test suite - to fail. This option may be removed at a later date. + CAUTION: Using this option will cause some of the tests in + the test suite to fail. This option may be removed at a + later date. 
`--disable-nls' Disable all message-translation facilities. This is usually not @@ -27743,10 +27882,10 @@ B.3.1 Installation on PC Operating Systems This minor node covers installation and usage of `gawk' on Intel architecture machines running MS-DOS, any version of MS-Windows, or OS/2. In this minor node, the term "Windows32" refers to any of -Microsoft Windows-95/98/ME/NT/2000/XP/Vista/7/8. +Microsoft Windows 95/98/ME/NT/2000/XP/Vista/7/8. The limitations of MS-DOS (and MS-DOS shells under the other -operating systems) has meant that various "DOS extenders" are often +operating systems) have meant that various "DOS extenders" are often used with programs such as `gawk'. The varying capabilities of Microsoft Windows 3.1 and Windows32 can add to the confusion. For an overview of the considerations, refer to `README_d/README.pc' in the @@ -27941,7 +28080,7 @@ The DJGPP collection of tools includes an MS-DOS port of Bash, and several shells are available for OS/2, including `ksh'. Under MS-Windows, OS/2 and MS-DOS, `gawk' (and many other text -programs) silently translate end-of-line `\r\n' to `\n' on input and +programs) silently translates end-of-line `\r\n' to `\n' on input and `\n' to `\r\n' on output. A special `BINMODE' variable (c.e.) allows control over these translations and is interpreted as follows: @@ -27963,7 +28102,7 @@ The modes for standard input and standard output are set one time only program). Setting `BINMODE' for standard input or standard output is accomplished by using an appropriate `-v BINMODE=N' option on the command line. `BINMODE' is set at the time a file or pipe is opened -and cannot be changed mid-stream. +and cannot be changed midstream. The name `BINMODE' was chosen to match `mawk' (*note Other Versions::). 
`mawk' and `gawk' handle `BINMODE' similarly; however, @@ -28007,10 +28146,9 @@ B.3.1.5 Using `gawk' In The Cygwin Environment `gawk' can be built and used "out of the box" under MS-Windows if you are using the Cygwin environment (http://www.cygwin.com). This -environment provides an excellent simulation of GNU/Linux, using the -GNU tools, such as Bash, the GNU Compiler Collection (GCC), GNU Make, -and other GNU programs. Compilation and installation for Cygwin is the -same as for a Unix system: +environment provides an excellent simulation of GNU/Linux, using Bash, +GCC, GNU Make, and other GNU programs. Compilation and installation +for Cygwin is the same as for a Unix system: tar -xvpzf gawk-4.1.2.tar.gz cd gawk-4.1.2 @@ -28028,7 +28166,7 @@ B.3.1.6 Using `gawk' In The MSYS Environment ............................................ In the MSYS environment under MS-Windows, `gawk' automatically uses -binary mode for reading and writing files. Thus there is no need to +binary mode for reading and writing files. Thus, there is no need to use the `BINMODE' variable. This can cause problems with other Unix-like components that have @@ -28083,9 +28221,9 @@ available from `https://github.com/endlesssoftware/mmk'. target parameter may need to be exact. `gawk' has been tested under VAX/VMS 7.3 and Alpha/VMS 7.3-1 using -Compaq C V6.4, and Alpha/VMS 7.3, Alpha/VMS 7.3-2, and IA64/VMS 8.3. -The most recent builds used HP C V7.3 on Alpha VMS 8.3 and both Alpha -and IA64 VMS 8.4 used HP C 7.3.(1) +Compaq C V6.4, and under Alpha/VMS 7.3, Alpha/VMS 7.3-2, and IA64/VMS +8.3. The most recent builds used HP C V7.3 on Alpha VMS 8.3 and both +Alpha and IA64 VMS 8.4 used HP C 7.3.(1) *Note VMS GNV::, for information on building `gawk' as a PCSI kit that is compatible with the GNV product. @@ -28128,7 +28266,7 @@ than 32 bits. 
/name=(as_is,short) - Compile time macros need to be defined before the first VMS-supplied + Compile-time macros need to be defined before the first VMS-supplied header file is included, as follows: #if (__CRTL_VER >= 70200000) && !defined (__VAX) @@ -28172,14 +28310,14 @@ directory tree, the program will be known as `GNV$GNU:[vms_help]gawk.hlp'. The PCSI kit also installs a `GNV$GNU:[vms_bin]gawk_verb.cld' file -which can be used to add `gawk' and `awk' as DCL commands. +that can be used to add `gawk' and `awk' as DCL commands. For just the current process you can use: $ set command gnv$gnu:[vms_bin]gawk_verb.cld Or the system manager can use `GNV$GNU:[vms_bin]gawk_verb.cld' to -add the `gawk' and `awk' to the system wide `DCLTABLES'. +add the `gawk' and `awk' to the system-wide `DCLTABLES'. The DCL syntax is documented in the `gawk.hlp' file. @@ -28239,14 +28377,14 @@ process) are present, there is no ambiguity and `--' can be omitted. status value when the program exits. The VMS severity bits will be set based on the `exit' value. A -failure is indicated by 1 and VMS sets the `ERROR' status. A fatal -error is indicated by 2 and VMS sets the `FATAL' status. All other +failure is indicated by 1, and VMS sets the `ERROR' status. A fatal +error is indicated by 2, and VMS sets the `FATAL' status. All other values will have the `SUCCESS' status. The exit value is encoded to comply with VMS coding standards and will have the `C_FACILITY_NO' of `0x350000' with the constant `0xA000' added to the number shifted over by 3 bits to make room for the severity codes. - To extract the actual `gawk' exit code from the VMS status use: + To extract the actual `gawk' exit code from the VMS status, use: unix_status = (vms_status .and. &x7f8) / 8 @@ -28262,7 +28400,7 @@ Function::. VMS reports time values in GMT unless one of the `SYS$TIMEZONE_RULE' or `TZ' logical names is set. Older versions of VMS, such as VAX/VMS -7.3 do not set these logical names. 
+7.3, do not set these logical names. The default search path, when looking for `awk' program files specified by the `-f' option, is `"SYS$DISK:[],AWK_LIBRARY:"'. The @@ -28279,7 +28417,7 @@ B.3.2.5 The VMS GNV Project The VMS GNV package provides a build environment similar to POSIX with ports of a collection of open source tools. The `gawk' found in the GNV -base kit is an older port. Currently the GNV project is being +base kit is an older port. Currently, the GNV project is being reorganized to supply individual PCSI packages for each component. See `https://sourceforge.net/p/gnv/wiki/InstallingGNVPackages/'. @@ -28313,7 +28451,7 @@ B.4 Reporting Problems and Bugs Douglas Adams, `The Hitchhiker's Guide to the Galaxy' If you have problems with `gawk' or think that you have found a bug, -report it to the developers; we cannot promise to do anything but we +report it to the developers; we cannot promise to do anything, but we might well want to fix it. Before reporting a bug, make sure you have really found a genuine @@ -28324,7 +28462,7 @@ documentation! Before reporting a bug or trying to fix it yourself, try to isolate it to the smallest possible `awk' program and input data file that -reproduces the problem. Then send us the program and data file, some +reproduce the problem. Then send us the program and data file, some idea of what kind of Unix system you're using, the compiler you used to compile `gawk', and the exact results `gawk' gave you. Also say what you expected to occur; this helps us decide whether the problem is @@ -28336,7 +28474,7 @@ You can get this information with the command `gawk --version'. Once you have a precise problem description, send email to <bug-gawk@gnu.org>. - The `gawk' maintainers subscribe to this address and thus they will + The `gawk' maintainers subscribe to this address, and thus they will receive your bug report. 
Although you can send mail to the maintainers directly, the bug reporting address is preferred because the email list is archived at the GNU Project. _All email must be in English. This is @@ -28357,8 +28495,8 @@ the only language understood in common by all the maintainers._ forward bug reports "upstream" to the GNU mailing list, many don't, so there is a good chance that the `gawk' maintainers won't even see the bug report! Second, mail to the GNU list is - archived, and having everything at the GNU project keeps things - self-contained and not dependant on other organizations. + archived, and having everything at the GNU Project keeps things + self-contained and not dependent on other organizations. Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are @@ -28367,18 +28505,19 @@ if we can. If you find bugs in one of the non-Unix ports of `gawk', send an email to the bug list, with a copy to the person who maintains that -port. They are named in the following list, as well as in the `README' -file in the `gawk' distribution. Information in the `README' file -should be considered authoritative if it conflicts with this Info file. +port. The maintainers are named in the following list, as well as in +the `README' file in the `gawk' distribution. Information in the +`README' file should be considered authoritative if it conflicts with +this Info file. The people maintaining the various `gawk' ports are: -Unix and POSIX systems Arnold Robbins, <arnold@skeeve.com>. -MS-DOS with DJGPP Scott Deifik, <scottd.mail@sbcglobal.net>. -MS-Windows with MinGW Eli Zaretskii, <eliz@gnu.org>. -OS/2 Andreas Buening, <andreas.buening@nexgo.de>. -VMS John Malmberg, <wb8tyw@qsl.net>. -z/OS (OS/390) Dave Pitts, <dpitts@cozx.com>. 
+Unix and POSIX systems Arnold Robbins, <arnold@skeeve.com> +MS-DOS with DJGPP Scott Deifik, <scottd.mail@sbcglobal.net> +MS-Windows with MinGW Eli Zaretskii, <eliz@gnu.org> +OS/2 Andreas Buening, <andreas.buening@nexgo.de> +VMS John Malmberg, <wb8tyw@qsl.net> +z/OS (OS/390) Dave Pitts, <dpitts@cozx.com> If your bug is also reproducible under Unix, send a copy of your report to the <bug-gawk@gnu.org> email list as well. @@ -28389,7 +28528,7 @@ File: gawk.info, Node: Other Versions, Next: Installation summary, Prev: Bugs B.5 Other Freely Available `awk' Implementations ================================================ - It's kind of fun to put comments like this in your awk code. + It's kind of fun to put comments like this in your awk code: `// Do C++ comments work? answer: yes! of course' -- Michael Brennan @@ -28412,11 +28551,11 @@ Unix `awk' Zip file `http://www.cs.princeton.edu/~bwk/btl.mirror/awk.zip' - You can also retrieve it from Git Hub: + You can also retrieve it from GitHub: git clone git://github.com/onetrueawk/awk bwkawk - This command creates a copy of the Git (http://www.git-scm.com) + This command creates a copy of the Git (http://git-scm.com) repository in a directory named `bwkawk'. If you leave that argument off the `git' command line, the repository copy is created in a directory named `awk'. @@ -28454,7 +28593,7 @@ Unix `awk' `awka' Written by Andrew Sumner, `awka' translates `awk' programs into C, compiles them, and links them with a library of functions that - provides the core `awk' functionality. It also has a number of + provide the core `awk' functionality. It also has a number of extensions. The `awk' translator is released under the GPL, and the library is @@ -28463,19 +28602,19 @@ Unix `awk' To get `awka', go to `http://sourceforge.net/projects/awka'. The project seems to be frozen; no new code changes have been made - since approximately 2003. + since approximately 2001. `pawk' Nelson H.F. 
Beebe at the University of Utah has modified BWK `awk' to provide timing and profiling information. It is different from - `gawk' with the `--profile' option (*note Profiling::), in that it + `gawk' with the `--profile' option (*note Profiling::) in that it uses CPU-based profiling, not line-count profiling. You may find it at either `ftp://ftp.math.utah.edu/pub/pawk/pawk-20030606.tar.gz' or `http://www.math.utah.edu/pub/pawk/pawk-20030606.tar.gz'. -Busybox Awk - Busybox is a GPL-licensed program providing small versions of many +BusyBox `awk' + BusyBox is a GPL-licensed program providing small versions of many applications within a single executable. It is aimed at embedded systems. It includes a full implementation of POSIX `awk'. When building it, be careful not to do `make install' as it will @@ -28485,7 +28624,7 @@ Busybox Awk The OpenSolaris POSIX `awk' The versions of `awk' in `/usr/xpg4/bin' and `/usr/xpg6/bin' on - Solaris are more-or-less POSIX-compliant. They are based on the + Solaris are more or less POSIX-compliant. They are based on the `awk' from Mortice Kern Systems for PCs. We were able to make this code compile and work under GNU/Linux with 1-2 hours of work. Making it more generally portable (using GNU Autoconf and/or @@ -28517,7 +28656,7 @@ Libmawk information. (This is not related to Nelson Beebe's modified version of BWK `awk', described earlier.) -QSE Awk +QSE `awk' This is an embeddable `awk' interpreter. For more information, see `http://code.google.com/p/qse/' and `http://awk.info/?tools/qse'. @@ -28535,7 +28674,7 @@ Other versions See also the "Versions and implementations" section of the Wikipedia article (http://en.wikipedia.org/wiki/Awk_language#Versions_and_implementations) - for information on additional versions. + on `awk' for information on additional versions. 
@@ -28544,7 +28683,7 @@ File: gawk.info, Node: Installation summary, Prev: Other Versions, Up: Instal B.6 Summary =========== - * The `gawk' distribution is available from GNU project's main + * The `gawk' distribution is available from the GNU Project's main distribution site, `ftp.gnu.org'. The canonical build recipe is: wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.2.tar.gz @@ -28553,17 +28692,17 @@ B.6 Summary ./configure && make && make check * `gawk' may be built on non-POSIX systems as well. The currently - supported systems are MS-Windows using DJGPP, MSYS, MinGW and + supported systems are MS-Windows using DJGPP, MSYS, MinGW, and Cygwin, OS/2 using EMX, and both Vax/VMS and OpenVMS. Instructions for each system are included in this major node. * Bug reports should be sent via email to <bug-gawk@gnu.org>. Bug - reports should be in English, and should include the version of + reports should be in English and should include the version of `gawk', how it was compiled, and a short program and data file - which demonstrate the problem. + that demonstrate the problem. * There are a number of other freely available `awk' - implementations. Many are POSIX compliant; others are less so. + implementations. Many are POSIX-compliant; others are less so. @@ -28648,7 +28787,7 @@ released versions of `gawk'. changes, you will probably wish to work with the development version. To do so, you will need to access the `gawk' source code repository. The code is maintained using the Git distributed version control system -(http://git-scm.com/). You will need to install it if your system +(http://git-scm.com). You will need to install it if your system doesn't have it. Once you have done so, use the command: git clone git://git.savannah.gnu.org/gawk.git @@ -28703,7 +28842,7 @@ possible to include them: document describes how GNU software should be written. If you haven't read it, please do so, preferably _before_ starting to modify `gawk'. 
(The `GNU Coding Standards' are available from the - GNU Project's website (http://www.gnu.org/prep/standards_toc.html). + GNU Project's website (http://www.gnu.org/prep/standards/). Texinfo, Info, and DVI versions are also available.) 5. Use the `gawk' coding style. The C code for `gawk' follows the @@ -29585,6 +29724,21 @@ ANSI C++ programming languages. These standards often become international standards as well. See also "ISO." +Argument + An argument can be two different things. It can be an option or a + file name passed to a command while invoking it from the command + line, or it can be something passed to a "function" inside a + program, e.g. inside `awk'. + + In the latter case, an argument can be passed to a function in two + ways. Either it is given to the called function by value, i.e., a + copy of the value of the variable is made available to the called + function, but the original variable cannot be modified by the + function itself; or it is given by reference, i.e., a pointer to + the interested variable is passed to the function, which can then + directly modify it. In `awk' scalars are passed by value, and + arrays are passed by reference. See "Pass By Value/Reference." + Array A grouping of multiple values under the same name. Most languages just provide sequential arrays. `awk' provides associative arrays. @@ -29620,6 +29774,26 @@ Bash The GNU version of the standard shell (the Bourne-Again SHell). See also "Bourne Shell." +Binary + Base-two notation, where the digits are `0'-`1'. Since electronic + circuitry works "naturally" in base 2 (just think of Off/On), + everything inside a computer is calculated using base 2. Each digit + represents the presence (or absence) of a power of 2 and is called + a "bit". So, for example, the base-two number `10101' is the same + as decimal 21, ((1 x 16) + (1 x 4) + (1 x 1)). 
+ + Since base-two numbers quickly become very long to read and write, + they are usually grouped by 3 (i.e., they are read as octal + numbers), or by 4 (i.e., they are read as hexadecimal numbers). + There is no direct way to insert base 2 numbers in a C program. + If need arises, such numbers are usually inserted as octal or + hexadecimal numbers. The number of base-two digits that fit into + registers used for representing integer numbers in computers is a + rough indication of the computing power of the computer itself. + Most computers nowadays use 64 bits for representing integer + numbers in their registers, but 32-bit, 16-bit and 8-bit registers + have been widely used in the past. *Note Nondecimal-numbers::. + Bit Short for "Binary Digit." All values in computer memory ultimately reduce to binary digits: values that are either zero or @@ -29648,6 +29822,19 @@ Braces The characters `{' and `}'. Braces are used in `awk' for delimiting actions, compound statements, and function bodies. +Bracket Expression + Inside a "regular expression", an expression included in square + brackets, meant to designate a single character as belonging to a + specified character class. A bracket expression can contain a list + of one or more characters, like `[abc]', a range of characters, + like `[A-Z]', or a name, delimited by `:', that designates a known + set of characters, like `[:digit:]'. The form of bracket expression + enclosed between `:' is independent of the underlying + representation of the character themselves, which could utilize + the ASCII, ECBDIC, or Unicode codesets, depending on the + architecture of the computer system, and on localization. See + also "Regular Expression." + Built-in Function The `awk' language provides built-in functions that perform various numerical, I/O-related, and string computations. Examples are @@ -29675,9 +29862,25 @@ C In general, `gawk' attempts to be as similar to the 1990 version of ISO C as makes sense. 
+C Shell + The C Shell (`csh' or its improved version, `tcsh') is a Unix + shell that was created by Bill Joy in the late 1970s. The C shell + was differentiated from other shells by its interactive features + and overall style, which looks more like C. The C Shell is not + backward compatible with the Bourne Shell, so special attention is + required when converting scripts written for other Unix shells to + the C shell, especially with regard to the management of shell + variables. See also "Bourne Shell." + C++ A popular object-oriented programming language derived from C. +Character Class + See "Bracket Expression." + +Character List + See "Bracket Expression." + Character Set The set of numeric codes used by a computer system to represent the characters (letters, numbers, punctuation, etc.) of a particular @@ -29692,7 +29895,7 @@ CHEM A preprocessor for `pic' that reads descriptions of molecules and produces `pic' input for drawing them. It was written in `awk' by Brian Kernighan and Jon Bentley, and is available from - `http://netlib.sandia.gov/netlib/typesetting/chem.gz'. + `http://netlib.org/typesetting/chem'. Comparison Expression A relation that is either true or false, such as `a < b'. @@ -29705,10 +29908,21 @@ Compiler machine-executable object code. The object code is then executed directly by the computer. See also "Interpreter." +Complemented Bracket Expression + The negation of a "bracket expression". All that is _not_ + described by a given bracket expression. The symbol `^' precedes + the negated bracket expression. E.g.: `[[^:digit:]' designates + whatever character is not a digit. `[^bad]' designates whatever + character is not one of the letters `b', `a', or `d'. See + "Bracket Expression." + Compound Statement A series of `awk' statements, enclosed in curly braces. Compound statements may be nested. (*Note Statements::.) +Computed Regexps + See "Dynamic Regular Expressions." 
+ Concatenation Concatenating two strings means sticking them together, one after another, producing a new string. For example, the string `foo' @@ -29722,6 +29936,12 @@ Conditional Expression otherwise the value is EXPR3. In either case, only one of EXPR2 and EXPR3 is evaluated. (*Note Conditional Exp::.) +Control Statement + A control statement is an instruction to perform a given operation + or a set of operations inside an `awk' program, if a given + condition is true. Control statements are: `if', `for', `while', + and `do' (*note Statements::). + Cookie A peculiar goodie, token, saying or remembrance produced by or presented to a program. (With thanks to Professor Doug McIlroy.) @@ -29828,6 +30048,12 @@ Format are controlled by the format strings contained in the predefined variables `CONVFMT' and `OFMT'. (*Note Control Letters::.) +Fortran + Shorthand for FORmula TRANslator, one of the first programming + languages available for scientific calculations. It was created by + John Backus, and has been available since 1957. It is still in use + today. + Free Documentation License This document describes the terms under which this Info file is published and may be copied. (*Note GNU Free Documentation @@ -29843,9 +30069,16 @@ FSF See "Free Software Foundation." Function - A specialized group of statements used to encapsulate general or - program-specific tasks. `awk' has a number of built-in functions, - and also allows you to define your own. (*Note Functions::.) + A part of an `awk' program that can be invoked from every point of + the program, to perform a task. `awk' has several built-in + functions. Users can define their own functions in every part of + the program. Function can be recursive, i.e., they may invoke + themselves. *Note Functions::. In `gawk' it is also possible to + have functions shared among different programs, and included where + required using the `@include' directive (*note Include Files::). 
+ In `gawk' the name of the function that should be invoked can be + generated at run time, i.e., dynamically. The `gawk' extension + API provides constructor functions (*note Constructor Functions::). `gawk' The GNU implementation of `awk'. @@ -29941,6 +30174,12 @@ Keyword `else', `exit', `for...in', `for', `function', `func', `if', `next', `nextfile', `switch', and `while'. +Korn Shell + The Korn Shell (`ksh') is a Unix shell which was developed by + David Korn at Bell Laboratories in the early 1980s. The Korn Shell + is backward-compatible with the Bourne shell and includes many + features of the C shell. See also "Bourne Shell." + Lesser General Public License This document describes the terms under which binary library archives or shared objects, and their source code may be @@ -29978,6 +30217,13 @@ Metacharacters Instead, they denote regular expression operations, such as repetition, grouping, or alternation. +Nesting + Nesting is where information is organized in layers, or where + objects contain other similar objects. In `gawk' the `@include' + directive can be nested. The "natural" nesting of arithmetic and + logical operations can be changed using parentheses (*note + Precedence::). + No-op An operation that does nothing. @@ -29997,6 +30243,11 @@ Octal are written in C using a leading `0', to indicate their base. Thus, `013' is 11 ((1 x 8) + 3). *Note Nondecimal-numbers::. +Output Record + A single chunk of data that is written out by `awk'. Usually, an + `awk' output record consists of one or more lines of text. *Note + Records::. + Pattern Patterns tell `awk' which input records are interesting to which rules. @@ -30012,6 +30263,9 @@ PEBKAC computer usage problems. (Problem Exists Between Keyboard And Chair.) +Plug-in + See "Extensions." + POSIX The name for a series of standards that specify a Portable Operating System interface. 
The "IX" denotes the Unix heritage of @@ -30035,6 +30289,9 @@ Range (of input lines) can specify ranges of input lines for `awk' to process or it can specify single lines. (*Note Pattern Overview::.) +Record + See "Input record" and "Output record." + Recursion When a function calls itself, either directly or indirectly. If this is clear, stop, and proceed to the next entry. Otherwise, @@ -30051,6 +30308,16 @@ Redirection using the `>', `>>', `|', and `|&' operators. (*Note Getline::, and *note Redirection::.) +Reference Counts + An internal mechanism in `gawk' to minimize the amount of memory + needed to store the value of string variables. If the value + assumed by a variable is used in more than one place, only one + copy of the value itself is kept, and the associated reference + count is increased when the same value is used by an additional + variable, and decresed when the related variable is no longer in + use. When the reference count goes to zero, the memory space used + to store the value of the variable is freed. + Regexp See "Regular Expression." @@ -30069,6 +30336,15 @@ Regular Expression Constant when you write the `awk' program and cannot be changed during its execution. (*Note Regexp Usage::.) +Regular Expression Operators + See "Metacharacters." + +Rounding + Rounding the result of an arithmetic operation can be tricky. + More than one way of rounding exists, and in `gawk' it is possible + to choose which method should be used in a program. *Note Setting + the rounding mode::. + Rule A segment of an `awk' program that specifies how to process single input records. A rule consists of a "pattern" and an "action". @@ -30130,6 +30406,11 @@ Special File handed directly to the underlying operating system--for example, `/dev/stderr'. (*Note Special Files::.) +Statement + An expression inside an `awk' program in the action part of a + pattern-action rule, or inside an `awk' function. 
A statement can + be a variable assignment, an array operation, a loop, etc. + Stream Editor A program that reads records from an input stream and processes them one or more at a time. This is in contrast with batch @@ -30172,10 +30453,15 @@ UTC reference time for day and date calculations. See also "Epoch" and "GMT." +Variable + A name for a value. In `awk', variables may be either scalars or + arrays. + Whitespace A sequence of space, TAB, or newline characters occurring inside an input record or a string. + File: gawk.info, Node: Copying, Next: GNU Free Documentation License, Prev: Glossary, Up: Top @@ -31408,7 +31694,7 @@ Index * ! (exclamation point), !~ operator <5>: Case-sensitivity. (line 26) * ! (exclamation point), !~ operator <6>: Computed Regexps. (line 6) * ! (exclamation point), !~ operator: Regexp Usage. (line 19) -* " (double quote), in regexp constants: Computed Regexps. (line 29) +* " (double quote), in regexp constants: Computed Regexps. (line 30) * " (double quote), in shell commands: Quoting. (line 54) * # (number sign), #! (executable scripts): Executable Scripts. (line 6) @@ -31437,7 +31723,7 @@ Index * * (asterisk), * operator, as regexp operator: Regexp Operators. (line 89) * * (asterisk), * operator, null strings, matching: String Functions. - (line 536) + (line 537) * * (asterisk), ** operator <1>: Precedence. (line 49) * * (asterisk), ** operator: Arithmetic Ops. (line 81) * * (asterisk), **= operator <1>: Precedence. (line 95) @@ -31468,7 +31754,7 @@ Index * --disable-lint configuration option: Additional Configuration Options. (line 15) * --disable-nls configuration option: Additional Configuration Options. - (line 30) + (line 32) * --dump-variables option: Options. (line 93) * --dump-variables option, using for library functions: Library Names. (line 45) @@ -31496,7 +31782,7 @@ Index * --re-interval option: Options. (line 279) * --sandbox option: Options. (line 286) * --sandbox option, disabling system() function: I/O Functions. 
- (line 128) + (line 129) * --sandbox option, input redirection with getline: Getline. (line 19) * --sandbox option, output redirection with print, printf: Redirection. (line 6) @@ -31506,7 +31792,7 @@ Index * --use-lc-numeric option: Options. (line 220) * --version option: Options. (line 300) * --with-whiny-user-strftime configuration option: Additional Configuration Options. - (line 35) + (line 37) * -b option: Options. (line 68) * -C option: Options. (line 88) * -c option: Options. (line 81) @@ -31542,7 +31828,7 @@ Index * -W option: Options. (line 46) * . (period), regexp operator: Regexp Operators. (line 44) * .gmo files: Explaining gettext. (line 42) -* .gmo files, specifying directory of <1>: Programmer i18n. (line 47) +* .gmo files, specifying directory of <1>: Programmer i18n. (line 48) * .gmo files, specifying directory of: Explaining gettext. (line 54) * .mo files, converting from .po: I18N Example. (line 64) * .po files <1>: Translator i18n. (line 6) @@ -31643,7 +31929,7 @@ Index * \ (backslash), in escape sequences: Escape Sequences. (line 6) * \ (backslash), in escape sequences, POSIX and: Escape Sequences. (line 108) -* \ (backslash), in regexp constants: Computed Regexps. (line 29) +* \ (backslash), in regexp constants: Computed Regexps. (line 30) * \ (backslash), in shell commands: Quoting. (line 48) * \ (backslash), regexp operator: Regexp Operators. (line 18) * ^ (caret), ^ operator: Precedence. (line 49) @@ -31737,7 +32023,7 @@ Index * arrays: Arrays. (line 6) * arrays of arrays: Arrays of Arrays. (line 6) * arrays, an example of using: Array Example. (line 6) -* arrays, and IGNORECASE variable: Array Intro. (line 94) +* arrays, and IGNORECASE variable: Array Intro. (line 100) * arrays, as parameters to functions: Pass By Value/Reference. (line 44) * arrays, associative: Array Intro. (line 50) @@ -31764,14 +32050,14 @@ Index (line 6) * arrays, sorting, and IGNORECASE variable: Array Sorting Functions. (line 83) -* arrays, sparse: Array Intro. 
(line 72) +* arrays, sparse: Array Intro. (line 76) * arrays, subscripts, uninitialized variables as: Uninitialized Subscripts. (line 6) * arrays, unassigned elements: Reference to Elements. (line 18) * artificial intelligence, gawk and: Distribution contents. (line 52) -* ASCII <1>: Glossary. (line 133) +* ASCII <1>: Glossary. (line 197) * ASCII: Ordinal Functions. (line 45) * asort <1>: Array Sorting Functions. (line 6) @@ -31798,7 +32084,7 @@ Index * asterisk (*), * operator, as regexp operator: Regexp Operators. (line 89) * asterisk (*), * operator, null strings, matching: String Functions. - (line 536) + (line 537) * asterisk (*), ** operator <1>: Precedence. (line 49) * asterisk (*), ** operator: Arithmetic Ops. (line 81) * asterisk (*), **= operator <1>: Precedence. (line 95) @@ -31912,7 +32198,7 @@ Index * backslash (\), in escape sequences: Escape Sequences. (line 6) * backslash (\), in escape sequences, POSIX and: Escape Sequences. (line 108) -* backslash (\), in regexp constants: Computed Regexps. (line 29) +* backslash (\), in regexp constants: Computed Regexps. (line 30) * backslash (\), in shell commands: Quoting. (line 48) * backslash (\), regexp operator: Regexp Operators. (line 18) * backtrace debugger command: Execution Stack. (line 13) @@ -31924,7 +32210,7 @@ Index * BEGIN pattern, and profiling: Profiling. (line 62) * BEGIN pattern, assert() user-defined function and: Assert Function. (line 83) -* BEGIN pattern, Boolean patterns and: Expression Patterns. (line 69) +* BEGIN pattern, Boolean patterns and: Expression Patterns. (line 70) * BEGIN pattern, exit statement and: Exit Statement. (line 12) * BEGIN pattern, getline and: Getline Notes. (line 19) * BEGIN pattern, headings, adding: Print Examples. (line 43) @@ -31941,14 +32227,14 @@ Index * BEGIN pattern, TEXTDOMAIN variable and: Programmer i18n. (line 60) * BEGINFILE pattern: BEGINFILE/ENDFILE. (line 6) * BEGINFILE pattern, Boolean patterns and: Expression Patterns. 
- (line 69) -* beginfile() user-defined function: Filetrans Function. (line 61) -* Bentley, Jon: Glossary. (line 143) + (line 70) +* beginfile() user-defined function: Filetrans Function. (line 62) +* Bentley, Jon: Glossary. (line 207) * Benzinger, Michael: Contributors. (line 97) * Berry, Karl <1>: Ranges and Locales. (line 74) * Berry, Karl: Acknowledgments. (line 33) * binary input/output: User-modified. (line 15) -* bindtextdomain <1>: Programmer i18n. (line 47) +* bindtextdomain <1>: Programmer i18n. (line 48) * bindtextdomain: I18N Functions. (line 12) * bindtextdomain() function (C library): Explaining gettext. (line 50) * bindtextdomain() function (gawk), portability and: I18N Portability. @@ -31967,7 +32253,7 @@ Index * body, in actions: Statements. (line 10) * body, in loops: While Statement. (line 14) * Boolean expressions: Boolean Ops. (line 6) -* Boolean expressions, as patterns: Expression Patterns. (line 38) +* Boolean expressions, as patterns: Expression Patterns. (line 39) * Boolean operators, See Boolean expressions: Boolean Ops. (line 6) * Bourne shell, quoting rules for: Quoting. (line 18) * braces ({}): Profiling. (line 142) @@ -32005,7 +32291,7 @@ Index * Brennan, Michael: Foreword3. (line 84) * Brian Kernighan's awk <1>: I/O Functions. (line 43) * Brian Kernighan's awk <2>: Gory Details. (line 19) -* Brian Kernighan's awk <3>: String Functions. (line 492) +* Brian Kernighan's awk <3>: String Functions. (line 493) * Brian Kernighan's awk <4>: Delete. (line 51) * Brian Kernighan's awk <5>: Nextfile Statement. (line 47) * Brian Kernighan's awk <6>: Continue Statement. (line 44) @@ -32025,14 +32311,14 @@ Index * Brink, Jeroen: DOS Quoting. (line 10) * Broder, Alan J.: Contributors. (line 88) * Brown, Martin: Contributors. (line 82) -* BSD-based operating systems: Glossary. (line 611) +* BSD-based operating systems: Glossary. (line 753) * bt debugger command (alias for backtrace): Execution Stack. (line 13) -* Buening, Andreas <1>: Bugs. 
(line 70) +* Buening, Andreas <1>: Bugs. (line 71) * Buening, Andreas <2>: Contributors. (line 92) * Buening, Andreas: Acknowledgments. (line 60) * buffering, input/output <1>: Two-way I/O. (line 52) -* buffering, input/output: I/O Functions. (line 140) -* buffering, interactive vs. noninteractive: I/O Functions. (line 75) +* buffering, input/output: I/O Functions. (line 141) +* buffering, interactive vs. noninteractive: I/O Functions. (line 76) * buffers, flushing: I/O Functions. (line 32) * buffers, operators for: GNU Regexp Operators. (line 48) @@ -32040,7 +32326,7 @@ Index * bug-gawk@gnu.org bug reporting address: Bugs. (line 30) * built-in functions: Functions. (line 6) * built-in functions, evaluation order: Calling Built-in. (line 30) -* Busybox Awk: Other Versions. (line 92) +* BusyBox Awk: Other Versions. (line 92) * c.e., See common extensions: Conventions. (line 51) * call by reference: Pass By Value/Reference. (line 44) @@ -32057,8 +32343,8 @@ Index * case keyword: Switch Statement. (line 6) * case sensitivity, and regexps: User-modified. (line 76) * case sensitivity, and string comparisons: User-modified. (line 76) -* case sensitivity, array indices and: Array Intro. (line 94) -* case sensitivity, converting case: String Functions. (line 522) +* case sensitivity, array indices and: Array Intro. (line 100) +* case sensitivity, converting case: String Functions. (line 523) * case sensitivity, example programs: Library Functions. (line 53) * case sensitivity, gawk: Case-sensitivity. (line 26) * case sensitivity, regexps and: Case-sensitivity. (line 6) @@ -32067,7 +32353,7 @@ Index (line 56) * character lists in regular expression: Bracket Expressions. (line 6) * character lists, See bracket expressions: Regexp Operators. (line 56) -* character sets (machine character encodings) <1>: Glossary. (line 133) +* character sets (machine character encodings) <1>: Glossary. (line 197) * character sets (machine character encodings): Ordinal Functions. 
(line 45) * character sets, See Also bracket expressions: Regexp Operators. @@ -32078,7 +32364,7 @@ Index * Chassell, Robert J.: Acknowledgments. (line 33) * chdir() extension function: Extension Sample File Functions. (line 12) -* chem utility: Glossary. (line 143) +* chem utility: Glossary. (line 207) * chr() extension function: Extension Sample Ord. (line 15) * chr() user-defined function: Ordinal Functions. (line 16) @@ -32136,7 +32422,7 @@ Index * common extensions, \x escape sequence: Escape Sequences. (line 61) * common extensions, BINMODE variable: PC Using. (line 33) * common extensions, delete to delete entire arrays: Delete. (line 39) -* common extensions, func keyword: Definition Syntax. (line 93) +* common extensions, func keyword: Definition Syntax. (line 98) * common extensions, length() applied to an array: String Functions. (line 201) * common extensions, RS as a regexp: gawk split records. (line 6) @@ -32155,7 +32441,7 @@ Index * compatibility mode (gawk), octal numbers: Nondecimal-numbers. (line 60) * compatibility mode (gawk), specifying: Options. (line 81) -* compiled programs <1>: Glossary. (line 155) +* compiled programs <1>: Glossary. (line 219) * compiled programs: Basic High Level. (line 15) * compiling gawk for Cygwin: Cygwin. (line 6) * compiling gawk for MS-DOS and MS-Windows: PC Compiling. (line 13) @@ -32172,9 +32458,9 @@ Index * configuration option, --disable-lint: Additional Configuration Options. (line 15) * configuration option, --disable-nls: Additional Configuration Options. - (line 30) + (line 32) * configuration option, --with-whiny-user-strftime: Additional Configuration Options. - (line 35) + (line 37) * configuration options, gawk: Additional Configuration Options. (line 6) * constant regexps: Regexp Usage. (line 57) @@ -32187,9 +32473,9 @@ Index * control statements: Statements. (line 6) * controlling array scanning order: Controlling Scanning. (line 14) -* convert string to lower case: String Functions. 
(line 523) -* convert string to number: String Functions. (line 390) -* convert string to upper case: String Functions. (line 529) +* convert string to lower case: String Functions. (line 524) +* convert string to number: String Functions. (line 391) +* convert string to upper case: String Functions. (line 530) * converting integer array subscripts: Numeric Array Subscripts. (line 31) * converting, dates to timestamps: Time Functions. (line 76) @@ -32201,7 +32487,7 @@ Index * CONVFMT variable: Strings And Numbers. (line 29) * CONVFMT variable, and array subscripts: Numeric Array Subscripts. (line 6) -* cookie: Glossary. (line 177) +* cookie: Glossary. (line 258) * coprocesses <1>: Two-way I/O. (line 25) * coprocesses: Redirection. (line 96) * coprocesses, closing: Close Files And Pipes. @@ -32225,7 +32511,7 @@ Index * cut.awk program: Cut Program. (line 45) * d debugger command (alias for delete): Breakpoint Control. (line 64) * d.c., See dark corner: Conventions. (line 42) -* dark corner <1>: Glossary. (line 188) +* dark corner <1>: Glossary. (line 269) * dark corner: Conventions. (line 42) * dark corner, "0" is actually true: Truth Values. (line 24) * dark corner, /= operator vs. /=.../ regexp constant: Assignment Ops. @@ -32267,7 +32553,7 @@ Index (line 148) * dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps. (line 43) -* dark corner, split() function: String Functions. (line 361) +* dark corner, split() function: String Functions. (line 362) * dark corner, strings, storing: gawk split records. (line 83) * dark corner, value of ARGV[0]: Auto-set. (line 39) * data, fixed-width: Constant Size. (line 6) @@ -32282,11 +32568,11 @@ Index * Davies, Stephen <1>: Contributors. (line 74) * Davies, Stephen: Acknowledgments. (line 60) * Day, Robert P.J.: Acknowledgments. (line 78) -* dcgettext <1>: Programmer i18n. (line 19) +* dcgettext <1>: Programmer i18n. (line 20) * dcgettext: I18N Functions. 
(line 22) * dcgettext() function (gawk), portability and: I18N Portability. (line 33) -* dcngettext <1>: Programmer i18n. (line 36) +* dcngettext <1>: Programmer i18n. (line 37) * dcngettext: I18N Functions. (line 28) * dcngettext() function (gawk), portability and: I18N Portability. (line 33) @@ -32373,7 +32659,7 @@ Index * debugger commands, t (tbreak): Breakpoint Control. (line 90) * debugger commands, tbreak: Breakpoint Control. (line 90) * debugger commands, trace: Miscellaneous Debugger Commands. - (line 108) + (line 107) * debugger commands, u (until): Debugger Execution Control. (line 83) * debugger commands, undisplay: Viewing And Changing Data. @@ -32389,18 +32675,18 @@ Index (line 67) * debugger commands, where (backtrace): Execution Stack. (line 13) * debugger default list amount: Debugger Info. (line 69) -* debugger history file: Debugger Info. (line 80) +* debugger history file: Debugger Info. (line 81) * debugger history size: Debugger Info. (line 65) * debugger options: Debugger Info. (line 57) -* debugger prompt: Debugger Info. (line 77) +* debugger prompt: Debugger Info. (line 78) * debugger, how to start: Debugger Invocation. (line 6) -* debugger, read commands from a file: Debugger Info. (line 96) +* debugger, read commands from a file: Debugger Info. (line 97) * debugging awk programs: Debugger. (line 6) * debugging gawk, bug reports: Bugs. (line 9) * decimal point character, locale specific: Options. (line 270) * decrement operators: Increment Ops. (line 35) * default keyword: Switch Statement. (line 6) -* Deifik, Scott <1>: Bugs. (line 70) +* Deifik, Scott <1>: Bugs. (line 71) * Deifik, Scott <2>: Contributors. (line 53) * Deifik, Scott: Acknowledgments. (line 60) * delete ARRAY: Delete. (line 39) @@ -32459,7 +32745,7 @@ Index (line 6) * differences in awk and gawk, line continuations: Conditional Exp. (line 34) -* differences in awk and gawk, LINT variable: User-modified. 
(line 88) +* differences in awk and gawk, LINT variable: User-modified. (line 87) * differences in awk and gawk, match() function: String Functions. (line 263) * differences in awk and gawk, print/printf statements: Format Modifiers. @@ -32510,7 +32796,7 @@ Index * dollar sign ($), incrementing fields and arrays: Increment Ops. (line 30) * dollar sign ($), regexp operator: Regexp Operators. (line 35) -* double quote ("), in regexp constants: Computed Regexps. (line 29) +* double quote ("), in regexp constants: Computed Regexps. (line 30) * double quote ("), in shell commands: Quoting. (line 54) * down debugger command: Execution Stack. (line 23) * Drepper, Ulrich: Acknowledgments. (line 52) @@ -32552,7 +32838,7 @@ Index * END pattern, and profiling: Profiling. (line 62) * END pattern, assert() user-defined function and: Assert Function. (line 75) -* END pattern, Boolean patterns and: Expression Patterns. (line 69) +* END pattern, Boolean patterns and: Expression Patterns. (line 70) * END pattern, exit statement and: Exit Statement. (line 12) * END pattern, next/nextfile statements and <1>: Next Statement. (line 44) @@ -32561,10 +32847,10 @@ Index * END pattern, operators and: Using BEGIN/END. (line 17) * END pattern, print statement and: I/O And BEGIN/END. (line 16) * ENDFILE pattern: BEGINFILE/ENDFILE. (line 6) -* ENDFILE pattern, Boolean patterns and: Expression Patterns. (line 69) -* endfile() user-defined function: Filetrans Function. (line 61) -* endgrent() function (C library): Group Functions. (line 211) -* endgrent() user-defined function: Group Functions. (line 214) +* ENDFILE pattern, Boolean patterns and: Expression Patterns. (line 70) +* endfile() user-defined function: Filetrans Function. (line 62) +* endgrent() function (C library): Group Functions. (line 212) +* endgrent() user-defined function: Group Functions. (line 215) * endpwent() function (C library): Passwd Functions. (line 207) * endpwent() user-defined function: Passwd Functions. 
(line 210) * English, Steve: Advanced Features. (line 6) @@ -32572,7 +32858,7 @@ Index * environment variables used by gawk: Environment Variables. (line 6) * environment variables, in ENVIRON array: Auto-set. (line 60) -* epoch, definition of: Glossary. (line 234) +* epoch, definition of: Glossary. (line 315) * equals sign (=), = operator: Assignment Ops. (line 6) * equals sign (=), == operator <1>: Precedence. (line 65) * equals sign (=), == operator: Comparison Operators. @@ -32620,7 +32906,7 @@ Index (line 99) * exp: Numeric Functions. (line 33) * expand utility: Very Simple. (line 73) -* Expat XML parser library: gawkextlib. (line 33) +* Expat XML parser library: gawkextlib. (line 35) * exponent: Numeric Functions. (line 33) * expressions: Expressions. (line 6) * expressions, as patterns: Expression Patterns. (line 6) @@ -32658,7 +32944,7 @@ Index * extensions, common, BINMODE variable: PC Using. (line 33) * extensions, common, delete to delete entire arrays: Delete. (line 39) * extensions, common, fflush() function: I/O Functions. (line 43) -* extensions, common, func keyword: Definition Syntax. (line 93) +* extensions, common, func keyword: Definition Syntax. (line 98) * extensions, common, length() applied to an array: String Functions. (line 201) * extensions, common, RS as a regexp: gawk split records. (line 6) @@ -32725,7 +33011,7 @@ Index * FILENAME variable, getline, setting with: Getline Notes. (line 19) * filenames, assignments as: Ignoring Assigns. (line 6) * files, .gmo: Explaining gettext. (line 42) -* files, .gmo, specifying directory of <1>: Programmer i18n. (line 47) +* files, .gmo, specifying directory of <1>: Programmer i18n. (line 48) * files, .gmo, specifying directory of: Explaining gettext. (line 54) * files, .mo, converting from .po: I18N Example. (line 64) * files, .po <1>: Translator i18n. (line 6) @@ -32752,7 +33038,7 @@ Index * files, message object, converting from portable object files: I18N Example. 
(line 64) * files, message object, specifying directory of <1>: Programmer i18n. - (line 47) + (line 48) * files, message object, specifying directory of: Explaining gettext. (line 54) * files, multiple passes over: Other Arguments. (line 56) @@ -32804,7 +33090,7 @@ Index * format time string: Time Functions. (line 48) * formats, numeric output: OFMT. (line 6) * formatting output: Printf. (line 6) -* formatting strings: String Functions. (line 383) +* formatting strings: String Functions. (line 384) * forward slash (/) to enclose regular expressions: Regexp. (line 10) * forward slash (/), / operator: Precedence. (line 55) * forward slash (/), /= operator <1>: Precedence. (line 95) @@ -32818,10 +33104,10 @@ Index * frame debugger command: Execution Stack. (line 27) * Free Documentation License (FDL): GNU Free Documentation License. (line 7) -* Free Software Foundation (FSF) <1>: Glossary. (line 288) +* Free Software Foundation (FSF) <1>: Glossary. (line 375) * Free Software Foundation (FSF) <2>: Getting. (line 10) * Free Software Foundation (FSF): Manual History. (line 6) -* FreeBSD: Glossary. (line 611) +* FreeBSD: Glossary. (line 753) * FS variable <1>: User-modified. (line 50) * FS variable: Field Separators. (line 15) * FS variable, --field-separator option and: Options. (line 21) @@ -32835,7 +33121,7 @@ Index * FS, containing ^: Regexp Field Splitting. (line 59) * FS, in multiline records: Multiple Line. (line 41) -* FSF (Free Software Foundation) <1>: Glossary. (line 288) +* FSF (Free Software Foundation) <1>: Glossary. (line 375) * FSF (Free Software Foundation) <2>: Getting. (line 10) * FSF (Free Software Foundation): Manual History. (line 6) * fts() extension function: Extension Sample File Functions. @@ -32875,7 +33161,7 @@ Index * functions, library, user database, reading: Passwd Functions. (line 6) * functions, names of: Definition Syntax. (line 23) -* functions, recursive: Definition Syntax. (line 83) +* functions, recursive: Definition Syntax. 
(line 88) * functions, string-translation: I18N Functions. (line 6) * functions, undefined: Pass By Value/Reference. (line 68) @@ -32896,7 +33182,7 @@ Index * gawk, awk and: Preface. (line 21) * gawk, bitwise operations in: Bitwise Functions. (line 40) * gawk, break statement in: Break Statement. (line 51) -* gawk, character classes and: Bracket Expressions. (line 100) +* gawk, character classes and: Bracket Expressions. (line 101) * gawk, coding style in: Adding Code. (line 38) * gawk, command-line options, and regular expressions: GNU Regexp Operators. (line 70) @@ -32931,7 +33217,7 @@ Index * gawk, IGNORECASE variable in <1>: Array Sorting Functions. (line 83) * gawk, IGNORECASE variable in <2>: String Functions. (line 58) -* gawk, IGNORECASE variable in <3>: Array Intro. (line 94) +* gawk, IGNORECASE variable in <3>: Array Intro. (line 100) * gawk, IGNORECASE variable in <4>: User-modified. (line 76) * gawk, IGNORECASE variable in: Case-sensitivity. (line 26) * gawk, implementation issues: Notes. (line 6) @@ -32947,7 +33233,7 @@ Index (line 6) * gawk, interval expressions and: Regexp Operators. (line 139) * gawk, line continuation in: Conditional Exp. (line 34) -* gawk, LINT variable in: User-modified. (line 88) +* gawk, LINT variable in: User-modified. (line 87) * gawk, list of contributors to: Contributors. (line 6) * gawk, MS-DOS version of: PC Using. (line 10) * gawk, MS-Windows version of: PC Using. (line 10) @@ -32988,7 +33274,7 @@ Index * gawkpath_append shell function: Shell Startup Files. (line 19) * gawkpath_default shell function: Shell Startup Files. (line 12) * gawkpath_prepend shell function: Shell Startup Files. (line 15) -* General Public License (GPL): Glossary. (line 305) +* General Public License (GPL): Glossary. (line 399) * General Public License, See GPL: Manual History. (line 11) * generate time values: Time Functions. (line 25) * gensub <1>: String Functions. 
(line 90) @@ -32998,12 +33284,12 @@ Index * getaddrinfo() function (C library): TCP/IP Networking. (line 38) * getgrent() function (C library): Group Functions. (line 6) * getgrent() user-defined function: Group Functions. (line 6) -* getgrgid() function (C library): Group Functions. (line 182) -* getgrgid() user-defined function: Group Functions. (line 185) -* getgrnam() function (C library): Group Functions. (line 171) -* getgrnam() user-defined function: Group Functions. (line 176) -* getgruser() function (C library): Group Functions. (line 191) -* getgruser() function, user-defined: Group Functions. (line 194) +* getgrgid() function (C library): Group Functions. (line 183) +* getgrgid() user-defined function: Group Functions. (line 186) +* getgrnam() function (C library): Group Functions. (line 172) +* getgrnam() user-defined function: Group Functions. (line 177) +* getgruser() function (C library): Group Functions. (line 192) +* getgruser() function, user-defined: Group Functions. (line 195) * getline command: Reading Files. (line 20) * getline command, _gr_init() user-defined function: Group Functions. (line 83) @@ -33020,7 +33306,7 @@ Index * getline from a file: Getline/File. (line 6) * getline into a variable: Getline/Variable. (line 6) * getline statement, BEGINFILE/ENDFILE patterns and: BEGINFILE/ENDFILE. - (line 54) + (line 53) * getlocaltime() user-defined function: Getlocaltime Function. (line 16) * getopt() function (C library): Getopt Function. (line 15) @@ -33040,24 +33326,24 @@ Index * git utility <2>: Accessing The Source. (line 10) * git utility <3>: Other Versions. (line 29) -* git utility: gawkextlib. (line 27) +* git utility: gawkextlib. (line 29) * Git, use of for gawk source code: Derived Files. (line 6) * GNITS mailing list: Acknowledgments. (line 52) * GNU awk, See gawk: Preface. (line 51) * GNU Free Documentation License: GNU Free Documentation License. (line 7) -* GNU General Public License: Glossary. 
(line 305) -* GNU Lesser General Public License: Glossary. (line 396) +* GNU General Public License: Glossary. (line 399) +* GNU Lesser General Public License: Glossary. (line 496) * GNU long options <1>: Options. (line 6) * GNU long options: Command Line. (line 13) * GNU long options, printing list of: Options. (line 154) -* GNU Project <1>: Glossary. (line 314) +* GNU Project <1>: Glossary. (line 408) * GNU Project: Manual History. (line 11) -* GNU/Linux <1>: Glossary. (line 611) +* GNU/Linux <1>: Glossary. (line 753) * GNU/Linux <2>: I18N Example. (line 55) * GNU/Linux: Manual History. (line 28) * Gordon, Assaf: Contributors. (line 105) -* GPL (General Public License) <1>: Glossary. (line 305) +* GPL (General Public License) <1>: Glossary. (line 399) * GPL (General Public License): Manual History. (line 11) * GPL (General Public License), printing: Options. (line 88) * grcat program: Group Functions. (line 16) @@ -33069,7 +33355,7 @@ Index * gsub <1>: String Functions. (line 140) * gsub: Using Constant Regexps. (line 43) -* gsub() function, arguments of: String Functions. (line 462) +* gsub() function, arguments of: String Functions. (line 463) * gsub() function, escape processing: Gory Details. (line 6) * h debugger command (alias for help): Miscellaneous Debugger Commands. (line 66) @@ -33096,7 +33382,7 @@ Index * hyphen (-), in bracket expressions: Bracket Expressions. (line 17) * i debugger command (alias for info): Debugger Info. (line 13) * id utility: Id Program. (line 6) -* id.awk program: Id Program. (line 30) +* id.awk program: Id Program. (line 31) * if statement: If Statement. (line 6) * if statement, actions, changing: Ranges. (line 25) * if statement, use of regexps in: Regexp Usage. (line 19) @@ -33104,7 +33390,7 @@ Index * ignore breakpoint: Breakpoint Control. (line 87) * ignore debugger command: Breakpoint Control. (line 87) * IGNORECASE variable: User-modified. (line 76) -* IGNORECASE variable, and array indices: Array Intro. 
(line 94) +* IGNORECASE variable, and array indices: Array Intro. (line 100) * IGNORECASE variable, and array sorting functions: Array Sorting Functions. (line 83) * IGNORECASE variable, in example programs: Library Functions. @@ -33164,7 +33450,7 @@ Index * insomnia, cure for: Alarm Program. (line 6) * installation, VMS: VMS Installation. (line 6) * installing gawk: Installation. (line 6) -* instruction tracing, in debugger: Debugger Info. (line 89) +* instruction tracing, in debugger: Debugger Info. (line 90) * int: Numeric Functions. (line 38) * INT signal (MS-Windows): Profiling. (line 213) * integer array indices: Numeric Array Subscripts. @@ -33172,37 +33458,37 @@ Index * integers, arbitrary precision: Arbitrary Precision Integers. (line 6) * integers, unsigned: Computer Arithmetic. (line 41) -* interacting with other programs: I/O Functions. (line 106) +* interacting with other programs: I/O Functions. (line 107) * internationalization <1>: I18N and L10N. (line 6) * internationalization: I18N Functions. (line 6) * internationalization, localization <1>: Internationalization. (line 13) * internationalization, localization: User-modified. (line 151) * internationalization, localization, character classes: Bracket Expressions. - (line 100) + (line 101) * internationalization, localization, gawk and: Internationalization. (line 13) * internationalization, localization, locale categories: Explaining gettext. (line 81) * internationalization, localization, marked strings: Programmer i18n. - (line 14) + (line 13) * internationalization, localization, portability and: I18N Portability. (line 6) * internationalizing a program: Explaining gettext. (line 6) -* interpreted programs <1>: Glossary. (line 356) +* interpreted programs <1>: Glossary. (line 450) * interpreted programs: Basic High Level. (line 15) * interval expressions, regexp operator: Regexp Operators. (line 116) * inventory-shipped file: Sample Data Files. (line 32) -* invoke shell command: I/O Functions. 
(line 106) +* invoke shell command: I/O Functions. (line 107) * isarray: Type Functions. (line 11) -* ISO: Glossary. (line 367) -* ISO 8859-1: Glossary. (line 133) -* ISO Latin-1: Glossary. (line 133) +* ISO: Glossary. (line 461) +* ISO 8859-1: Glossary. (line 197) +* ISO Latin-1: Glossary. (line 197) * Jacobs, Andrew: Passwd Functions. (line 90) * Jaegermann, Michal <1>: Contributors. (line 45) * Jaegermann, Michal: Acknowledgments. (line 60) * Java implementation of awk: Other Versions. (line 117) -* Java programming language: Glossary. (line 379) +* Java programming language: Glossary. (line 473) * jawk: Other Versions. (line 117) * Jedi knights: Undocumented. (line 6) * Johansen, Chris: Signature Program. (line 25) @@ -33211,7 +33497,7 @@ Index * Kahrs, Ju"rgen: Acknowledgments. (line 60) * Kasal, Stepan: Acknowledgments. (line 60) * Kenobi, Obi-Wan: Undocumented. (line 6) -* Kernighan, Brian <1>: Glossary. (line 143) +* Kernighan, Brian <1>: Glossary. (line 207) * Kernighan, Brian <2>: Basic Data Typing. (line 54) * Kernighan, Brian <3>: Other Versions. (line 13) * Kernighan, Brian <4>: Contributors. (line 11) @@ -33252,8 +33538,8 @@ Index * length: String Functions. (line 171) * length of input record: String Functions. (line 178) * length of string: String Functions. (line 171) -* Lesser General Public License (LGPL): Glossary. (line 396) -* LGPL (Lesser General Public License): Glossary. (line 396) +* Lesser General Public License (LGPL): Glossary. (line 496) +* LGPL (Lesser General Public License): Glossary. (line 496) * libmawk: Other Versions. (line 125) * libraries of awk functions: Library Functions. (line 6) * libraries of awk functions, assertions: Assert Function. (line 6) @@ -33287,7 +33573,7 @@ Index * lines, duplicate, removing: History Sorting. (line 6) * lines, matching ranges of: Ranges. (line 6) * lines, skipping between markers: Ranges. (line 43) -* lint checking: User-modified. (line 88) +* lint checking: User-modified. 
(line 87) * lint checking, array elements: Delete. (line 34) * lint checking, array subscripts: Uninitialized Subscripts. (line 43) @@ -33297,8 +33583,8 @@ Index (line 339) * lint checking, undefined functions: Pass By Value/Reference. (line 85) -* LINT variable: User-modified. (line 88) -* Linux <1>: Glossary. (line 611) +* LINT variable: User-modified. (line 87) +* Linux <1>: Glossary. (line 753) * Linux <2>: I18N Example. (line 55) * Linux: Manual History. (line 28) * list all global variables, in debugger: Debugger Info. (line 48) @@ -33338,7 +33624,7 @@ Index * mail-list file: Sample Data Files. (line 6) * mailing labels, printing: Labels Program. (line 6) * mailing list, GNITS: Acknowledgments. (line 52) -* Malmberg, John <1>: Bugs. (line 70) +* Malmberg, John <1>: Bugs. (line 71) * Malmberg, John: Acknowledgments. (line 60) * Malmberg, John E.: Contributors. (line 137) * mark parity: Ordinal Functions. (line 45) @@ -33353,20 +33639,20 @@ Index * matching, expressions, See comparison expressions: Typing and Comparison. (line 9) * matching, leftmost longest: Multiple Line. (line 26) -* matching, null strings: String Functions. (line 536) +* matching, null strings: String Functions. (line 537) * mawk utility <1>: Other Versions. (line 48) * mawk utility <2>: Nextfile Statement. (line 47) * mawk utility <3>: Concatenation. (line 36) * mawk utility <4>: Getline/Pipe. (line 62) * mawk utility: Escape Sequences. (line 120) * maximum precision supported by MPFR library: Auto-set. (line 235) -* McIlroy, Doug: Glossary. (line 177) +* McIlroy, Doug: Glossary. (line 258) * McPhee, Patrick: Contributors. (line 100) * message object files: Explaining gettext. (line 42) * message object files, converting from portable object files: I18N Example. (line 64) * message object files, specifying directory of <1>: Programmer i18n. - (line 47) + (line 48) * message object files, specifying directory of: Explaining gettext. (line 54) * messages from extensions: Printing Messages. 
(line 6) @@ -33388,7 +33674,7 @@ Index * names, functions: Definition Syntax. (line 23) * namespace issues: Library Names. (line 6) * namespace issues, functions: Definition Syntax. (line 23) -* NetBSD: Glossary. (line 611) +* NetBSD: Glossary. (line 753) * networks, programming: TCP/IP Networking. (line 6) * networks, support for: Special Network. (line 6) * newlines <1>: Boolean Ops. (line 69) @@ -33397,8 +33683,8 @@ Index * newlines, as field separators: Default Field Splitting. (line 6) * newlines, as record separators: awk split records. (line 12) -* newlines, in dynamic regexps: Computed Regexps. (line 59) -* newlines, in regexp constants: Computed Regexps. (line 69) +* newlines, in dynamic regexps: Computed Regexps. (line 60) +* newlines, in regexp constants: Computed Regexps. (line 70) * newlines, printing: Print Examples. (line 12) * newlines, separating statements in actions <1>: Statements. (line 10) * newlines, separating statements in actions: Action Overview. @@ -33444,7 +33730,7 @@ Index (line 43) * null strings, converting numbers to strings: Strings And Numbers. (line 21) -* null strings, matching: String Functions. (line 536) +* null strings, matching: String Functions. (line 537) * number as string of bits: Bitwise Functions. (line 110) * number of array elements: String Functions. (line 201) * number sign (#), #! (executable scripts): Executable Scripts. @@ -33469,14 +33755,14 @@ Index * obsolete features: Obsolete. (line 6) * octal numbers: Nondecimal-numbers. (line 6) * octal values, enabling interpretation of: Options. (line 211) -* OFMT variable <1>: User-modified. (line 105) +* OFMT variable <1>: User-modified. (line 104) * OFMT variable <2>: Strings And Numbers. (line 57) * OFMT variable: OFMT. (line 15) * OFMT variable, POSIX awk and: OFMT. (line 27) * OFS variable <1>: User-modified. (line 113) * OFS variable <2>: Output Separators. (line 6) * OFS variable: Changing Fields. (line 64) -* OpenBSD: Glossary. (line 611) +* OpenBSD: Glossary. 
(line 753) * OpenSolaris: Other Versions. (line 100) * operating systems, BSD-based: Manual History. (line 28) * operating systems, PC, gawk on: PC Using. (line 6) @@ -33577,7 +33863,7 @@ Index (line 6) * pipe, input: Getline/Pipe. (line 9) * pipe, output: Redirection. (line 57) -* Pitts, Dave <1>: Bugs. (line 70) +* Pitts, Dave <1>: Bugs. (line 71) * Pitts, Dave: Acknowledgments. (line 60) * Plauger, P.J.: Library Functions. (line 12) * plug-in: Extension Intro. (line 6) @@ -33602,7 +33888,7 @@ Index (line 65) * portability, deleting array elements: Delete. (line 56) * portability, example programs: Library Functions. (line 42) -* portability, functions, defining: Definition Syntax. (line 109) +* portability, functions, defining: Definition Syntax. (line 114) * portability, gawk: New Ports. (line 6) * portability, gettext library and: Explaining gettext. (line 11) * portability, internationalization and: I18N Portability. (line 6) @@ -33614,7 +33900,7 @@ Index * portability, operators: Increment Ops. (line 60) * portability, operators, not in POSIX awk: Precedence. (line 98) * portability, POSIXLY_CORRECT environment variable: Options. (line 359) -* portability, substr() function: String Functions. (line 512) +* portability, substr() function: String Functions. (line 513) * portable object files <1>: Translator i18n. (line 6) * portable object files: Explaining gettext. (line 37) * portable object files, converting to message object files: I18N Example. @@ -33647,7 +33933,7 @@ Index * POSIX awk, field separators and <1>: Full Line Fields. (line 16) * POSIX awk, field separators and: Fields. (line 6) * POSIX awk, FS variable and: User-modified. (line 60) -* POSIX awk, function keyword in: Definition Syntax. (line 93) +* POSIX awk, function keyword in: Definition Syntax. (line 98) * POSIX awk, functions and, gsub()/sub(): Gory Details. (line 90) * POSIX awk, functions and, length(): String Functions. (line 180) * POSIX awk, GNU long options and: Options. 
(line 15) @@ -33740,7 +34026,7 @@ Index * programming conventions, functions, calling: Calling Built-in. (line 10) * programming conventions, functions, writing: Definition Syntax. - (line 65) + (line 70) * programming conventions, gawk extensions: Internal File Ops. (line 45) * programming conventions, private variable names: Library Names. @@ -33749,13 +34035,13 @@ Index * programming languages, Ada: Glossary. (line 11) * programming languages, data-driven vs. procedural: Getting Started. (line 12) -* programming languages, Java: Glossary. (line 379) +* programming languages, Java: Glossary. (line 473) * programming, basic steps: Basic High Level. (line 20) * programming, concepts: Basic Concepts. (line 6) * pwcat program: Passwd Functions. (line 23) * q debugger command (alias for quit): Miscellaneous Debugger Commands. (line 99) -* QSE Awk: Other Versions. (line 135) +* QSE awk: Other Versions. (line 135) * Quanstrom, Erik: Alarm Program. (line 8) * question mark (?), ?: operator: Precedence. (line 92) * question mark (?), regexp operator <1>: GNU Regexp Operators. @@ -33809,8 +34095,8 @@ Index * records, splitting input into: Records. (line 6) * records, terminating: awk split records. (line 125) * records, treating files as: gawk split records. (line 93) -* recursive functions: Definition Syntax. (line 83) -* redirect gawk output, in debugger: Debugger Info. (line 72) +* recursive functions: Definition Syntax. (line 88) +* redirect gawk output, in debugger: Debugger Info. (line 73) * redirection of input: Getline/File. (line 6) * redirection of output: Redirection. (line 6) * reference counting, sorting arrays: Array Sorting Functions. @@ -33824,8 +34110,8 @@ Index * regexp constants, as patterns: Expression Patterns. (line 34) * regexp constants, in gawk: Using Constant Regexps. (line 28) -* regexp constants, slashes vs. quotes: Computed Regexps. (line 29) -* regexp constants, vs. string constants: Computed Regexps. (line 39) +* regexp constants, slashes vs. 
quotes: Computed Regexps. (line 30) +* regexp constants, vs. string constants: Computed Regexps. (line 40) * register extension: Registration Functions. (line 6) * regular expressions: Regexp. (line 6) @@ -33844,7 +34130,7 @@ Index (line 57) * regular expressions, dynamic: Computed Regexps. (line 6) * regular expressions, dynamic, with embedded newlines: Computed Regexps. - (line 59) + (line 60) * regular expressions, gawk, command-line options: GNU Regexp Operators. (line 70) * regular expressions, interval expressions and: Options. (line 279) @@ -33863,7 +34149,7 @@ Index * regular expressions, searching for: Egrep Program. (line 6) * relational operators, See comparison operators: Typing and Comparison. (line 9) -* replace in string: String Functions. (line 408) +* replace in string: String Functions. (line 409) * return debugger command: Debugger Execution Control. (line 54) * return statement, user-defined functions: Return Statement. (line 6) @@ -33874,7 +34160,7 @@ Index (line 11) * revtwoway extension: Extension Sample Rev2way. (line 12) -* rewind() user-defined function: Rewind Function. (line 16) +* rewind() user-defined function: Rewind Function. (line 15) * right angle bracket (>), > operator <1>: Precedence. (line 65) * right angle bracket (>), > operator: Comparison Operators. (line 11) @@ -33890,7 +34176,7 @@ Index * RLENGTH variable: Auto-set. (line 266) * RLENGTH variable, match() function and: String Functions. (line 228) * Robbins, Arnold <1>: Future Extensions. (line 6) -* Robbins, Arnold <2>: Bugs. (line 70) +* Robbins, Arnold <2>: Bugs. (line 71) * Robbins, Arnold <3>: Contributors. (line 144) * Robbins, Arnold <4>: General Data Types. (line 6) * Robbins, Arnold <5>: Alarm Program. (line 6) @@ -33929,7 +34215,7 @@ Index * sample debugging session: Sample Debugging Session. (line 6) * sandbox mode: Options. (line 286) -* save debugger options: Debugger Info. (line 84) +* save debugger options: Debugger Info. 
(line 85) * scalar or array: Type Functions. (line 11) * scalar values: Basic Data Typing. (line 13) * scanning arrays: Scanning an Array. (line 6) @@ -33978,7 +34264,7 @@ Index * set directory of message catalogs: I18N Functions. (line 12) * set watchpoint: Viewing And Changing Data. (line 67) -* shadowing of variable values: Definition Syntax. (line 71) +* shadowing of variable values: Definition Syntax. (line 76) * shell quoting, rules for: Quoting. (line 6) * shells, piping commands into: Redirection. (line 136) * shells, quoting: Using Shell Variables. @@ -34020,14 +34306,14 @@ Index (line 14) * sidebar, Changing NR and FNR: Auto-set. (line 326) * sidebar, Controlling Output Buffering with system(): I/O Functions. - (line 138) + (line 139) * sidebar, Escape Sequences for Metacharacters: Escape Sequences. (line 137) * sidebar, FS and IGNORECASE: Field Splitting Summary. (line 38) * sidebar, Interactive Versus Noninteractive Buffering: I/O Functions. - (line 73) -* sidebar, Matching the Null String: String Functions. (line 534) + (line 74) +* sidebar, Matching the Null String: String Functions. (line 535) * sidebar, Operator Evaluation Order: Increment Ops. (line 58) * sidebar, Piping into sh: Redirection. (line 134) * sidebar, Pre-POSIX awk Used OFMT for String Conversion: Strings And Numbers. @@ -34035,13 +34321,13 @@ Index * sidebar, Recipe for a Programming Language: History. (line 6) * sidebar, RS = "\0" Is Not Portable: gawk split records. (line 63) * sidebar, So Why Does gawk Have BEGINFILE and ENDFILE?: Filetrans Function. - (line 82) + (line 83) * sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops. (line 146) * sidebar, Understanding #!: Executable Scripts. (line 31) * sidebar, Understanding $0: Changing Fields. (line 134) * sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps. - (line 57) + (line 58) * sidebar, Using close()'s Return Value: Close Files And Pipes. 
(line 131) * SIGHUP signal, for dynamic profiling: Profiling. (line 210) @@ -34081,7 +34367,7 @@ Index (line 94) * source code, awka: Other Versions. (line 68) * source code, Brian Kernighan's awk: Other Versions. (line 13) -* source code, Busybox Awk: Other Versions. (line 92) +* source code, BusyBox Awk: Other Versions. (line 92) * source code, gawk: Gawk Distribution. (line 6) * source code, Illumos awk: Other Versions. (line 109) * source code, jawk: Other Versions. (line 117) @@ -34090,18 +34376,18 @@ Index * source code, mixing: Options. (line 117) * source code, pawk: Other Versions. (line 82) * source code, pawk (Python version): Other Versions. (line 129) -* source code, QSE Awk: Other Versions. (line 135) +* source code, QSE awk: Other Versions. (line 135) * source code, QuikTrim Awk: Other Versions. (line 139) * source code, Solaris awk: Other Versions. (line 100) * source files, search path for: Programs Exercises. (line 70) -* sparse arrays: Array Intro. (line 72) +* sparse arrays: Array Intro. (line 76) * Spencer, Henry: Glossary. (line 16) * split: String Functions. (line 316) * split string into array: String Functions. (line 297) * split utility: Split Program. (line 6) * split() function, array elements, deleting: Delete. (line 61) * split.awk program: Split Program. (line 30) -* sprintf <1>: String Functions. (line 383) +* sprintf <1>: String Functions. (line 384) * sprintf: OFMT. (line 15) * sprintf() function, OFMT variable and: User-modified. (line 113) * sprintf() function, print/printf statements and: Round Function. @@ -34111,7 +34397,7 @@ Index * square root: Numeric Functions. (line 92) * srand: Numeric Functions. (line 96) * stack frame: Debugging Terms. (line 10) -* Stallman, Richard <1>: Glossary. (line 288) +* Stallman, Richard <1>: Glossary. (line 375) * Stallman, Richard <2>: Contributors. (line 23) * Stallman, Richard <3>: Acknowledgments. (line 18) * Stallman, Richard: Manual History. 
(line 6) @@ -34135,7 +34421,7 @@ Index * stream editors: Full Line Fields. (line 22) * strftime: Time Functions. (line 48) * string constants: Scalar Constants. (line 15) -* string constants, vs. regexp constants: Computed Regexps. (line 39) +* string constants, vs. regexp constants: Computed Regexps. (line 40) * string extraction (internationalization): String Extraction. (line 6) * string length: String Functions. (line 171) @@ -34147,23 +34433,23 @@ Index * strings splitting, example: String Functions. (line 335) * strings, converting <1>: Bitwise Functions. (line 110) * strings, converting: Strings And Numbers. (line 6) -* strings, converting letter case: String Functions. (line 522) +* strings, converting letter case: String Functions. (line 523) * strings, converting, numbers to: User-modified. (line 30) * strings, empty, See null strings: awk split records. (line 115) * strings, extracting: String Extraction. (line 6) -* strings, for localization: Programmer i18n. (line 14) +* strings, for localization: Programmer i18n. (line 13) * strings, length limitations: Scalar Constants. (line 20) * strings, merging arrays into: Join Function. (line 6) * strings, null: Regexp Field Splitting. (line 43) * strings, numeric: Variable Typing. (line 6) -* strtonum: String Functions. (line 390) +* strtonum: String Functions. (line 391) * strtonum() function (gawk), --non-decimal-data option and: Nondecimal Data. (line 35) -* sub <1>: String Functions. (line 408) +* sub <1>: String Functions. (line 409) * sub: Using Constant Regexps. (line 43) -* sub() function, arguments of: String Functions. (line 462) +* sub() function, arguments of: String Functions. (line 463) * sub() function, escape processing: Gory Details. (line 6) * subscript separators: User-modified. (line 145) * subscripts in arrays, multidimensional: Multidimensional. (line 10) @@ -34177,15 +34463,15 @@ Index * SUBSEP variable, and multidimensional arrays: Multidimensional. 
(line 16) * substitute in string: String Functions. (line 90) -* substr: String Functions. (line 481) -* substring: String Functions. (line 481) +* substr: String Functions. (line 482) +* substring: String Functions. (line 482) * Sumner, Andrew: Other Versions. (line 68) * supplementary groups of gawk process: Auto-set. (line 251) * switch statement: Switch Statement. (line 6) * SYMTAB array: Auto-set. (line 283) * syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148) -* system: I/O Functions. (line 106) +* system: I/O Functions. (line 107) * systime: Time Functions. (line 66) * t debugger command (alias for tbreak): Breakpoint Control. (line 90) * tbreak debugger command: Breakpoint Control. (line 90) @@ -34211,7 +34497,7 @@ Index (line 6) * text, printing: Print. (line 22) * text, printing, unduplicated lines of: Uniq Program. (line 6) -* TEXTDOMAIN variable <1>: Programmer i18n. (line 9) +* TEXTDOMAIN variable <1>: Programmer i18n. (line 8) * TEXTDOMAIN variable: User-modified. (line 151) * TEXTDOMAIN variable, BEGIN pattern and: Programmer i18n. (line 60) * TEXTDOMAIN variable, portability and: I18N Portability. (line 20) @@ -34235,11 +34521,11 @@ Index * timestamps, converting dates to: Time Functions. (line 76) * timestamps, formatted: Getlocaltime Function. (line 6) -* tolower: String Functions. (line 523) -* toupper: String Functions. (line 529) +* tolower: String Functions. (line 524) +* toupper: String Functions. (line 530) * tr utility: Translate Program. (line 6) * trace debugger command: Miscellaneous Debugger Commands. - (line 108) + (line 107) * traceback, display in debugger: Execution Stack. (line 13) * translate string: I18N Functions. (line 22) * translate.awk program: Translate Program. (line 55) @@ -34255,14 +34541,14 @@ Index (line 22) * troubleshooting, fatal errors, printf format strings: Format Modifiers. (line 158) -* troubleshooting, fflush() function: I/O Functions. 
(line 62) +* troubleshooting, fflush() function: I/O Functions. (line 63) * troubleshooting, function call syntax: Function Calls. (line 30) * troubleshooting, gawk: Compatibility Mode. (line 6) * troubleshooting, gawk, bug reports: Bugs. (line 9) * troubleshooting, gawk, fatal errors, function arguments: Calling Built-in. (line 16) * troubleshooting, getline function: File Checking. (line 25) -* troubleshooting, gsub()/sub() functions: String Functions. (line 472) +* troubleshooting, gsub()/sub() functions: String Functions. (line 473) * troubleshooting, match() function: String Functions. (line 292) * troubleshooting, print statement, omitting commas: Print Examples. (line 31) @@ -34270,10 +34556,10 @@ Index * troubleshooting, quotes with file names: Special FD. (line 62) * troubleshooting, readable data files: File Checking. (line 6) * troubleshooting, regexp constants vs. string constants: Computed Regexps. - (line 39) + (line 40) * troubleshooting, string concatenation: Concatenation. (line 26) -* troubleshooting, substr() function: String Functions. (line 499) -* troubleshooting, system() function: I/O Functions. (line 128) +* troubleshooting, substr() function: String Functions. (line 500) +* troubleshooting, system() function: I/O Functions. (line 129) * troubleshooting, typographical errors, global variables: Options. (line 98) * true, logical: Truth Values. (line 6) @@ -34296,14 +34582,14 @@ Index * undisplay debugger command: Viewing And Changing Data. (line 80) * undocumented features: Undocumented. (line 6) -* Unicode <1>: Glossary. (line 133) +* Unicode <1>: Glossary. (line 197) * Unicode <2>: Ranges and Locales. (line 61) * Unicode: Ordinal Functions. (line 45) * uninitialized variables, as array subscripts: Uninitialized Subscripts. (line 6) * uniq utility: Uniq Program. (line 6) * uniq.awk program: Uniq Program. (line 65) -* Unix: Glossary. (line 611) +* Unix: Glossary. (line 753) * Unix awk, backslashes in escape sequences: Escape Sequences. 
(line 120) * Unix awk, close() function and: Close Files And Pipes. @@ -34352,7 +34638,7 @@ Index * variables, predefined conveying information: Auto-set. (line 6) * variables, private: Library Names. (line 11) * variables, setting: Options. (line 32) -* variables, shadowing: Definition Syntax. (line 71) +* variables, shadowing: Definition Syntax. (line 76) * variables, types of: Assignment Ops. (line 40) * variables, types of, comparison expressions and: Typing and Comparison. (line 9) @@ -34419,7 +34705,7 @@ Index * xor: Bitwise Functions. (line 56) * XOR bitwise operation: Bitwise Functions. (line 6) * Yawitz, Efraim: Contributors. (line 131) -* Zaretskii, Eli <1>: Bugs. (line 70) +* Zaretskii, Eli <1>: Bugs. (line 71) * Zaretskii, Eli <2>: Contributors. (line 55) * Zaretskii, Eli: Acknowledgments. (line 60) * zerofile.awk program: Empty Files. (line 21) @@ -34452,560 +34738,561 @@ Index Tag Table: Node: Top1204 -Node: Foreword342225 -Node: Foreword446669 -Node: Preface48200 -Ref: Preface-Footnote-151071 -Ref: Preface-Footnote-251178 -Ref: Preface-Footnote-351411 -Node: History51553 -Node: Names53904 -Ref: Names-Footnote-154997 -Node: This Manual55143 -Ref: This Manual-Footnote-161643 -Node: Conventions61743 -Node: Manual History64080 -Ref: Manual History-Footnote-167073 -Ref: Manual History-Footnote-267114 -Node: How To Contribute67188 -Node: Acknowledgments68317 -Node: Getting Started73134 -Node: Running gawk75573 -Node: One-shot76763 -Node: Read Terminal78027 -Node: Long80058 -Node: Executable Scripts81571 -Ref: Executable Scripts-Footnote-184360 -Node: Comments84463 -Node: Quoting86945 -Node: DOS Quoting92463 -Node: Sample Data Files93138 -Node: Very Simple95733 -Node: Two Rules100632 -Node: More Complex102518 -Node: Statements/Lines105380 -Ref: Statements/Lines-Footnote-1109835 -Node: Other Features110100 -Node: When111031 -Ref: When-Footnote-1112785 -Node: Intro Summary112850 -Node: Invoking Gawk113733 -Node: Command Line115247 -Node: Options116045 -Ref: 
Options-Footnote-1131849 -Ref: Options-Footnote-2132078 -Node: Other Arguments132103 -Node: Naming Standard Input135051 -Node: Environment Variables136144 -Node: AWKPATH Variable136702 -Ref: AWKPATH Variable-Footnote-1140115 -Ref: AWKPATH Variable-Footnote-2140160 -Node: AWKLIBPATH Variable140420 -Node: Other Environment Variables141676 -Node: Exit Status145164 -Node: Include Files145840 -Node: Loading Shared Libraries149437 -Node: Obsolete150864 -Node: Undocumented151561 -Node: Invoking Summary151828 -Node: Regexp153492 -Node: Regexp Usage154946 -Node: Escape Sequences156983 -Node: Regexp Operators163224 -Ref: Regexp Operators-Footnote-1170650 -Ref: Regexp Operators-Footnote-2170797 -Node: Bracket Expressions170895 -Ref: table-char-classes172910 -Node: Leftmost Longest175834 -Node: Computed Regexps177136 -Node: GNU Regexp Operators180533 -Node: Case-sensitivity184206 -Ref: Case-sensitivity-Footnote-1187091 -Ref: Case-sensitivity-Footnote-2187326 -Node: Regexp Summary187434 -Node: Reading Files188901 -Node: Records190995 -Node: awk split records191728 -Node: gawk split records196643 -Ref: gawk split records-Footnote-1201187 -Node: Fields201224 -Ref: Fields-Footnote-1204000 -Node: Nonconstant Fields204086 -Ref: Nonconstant Fields-Footnote-1206329 -Node: Changing Fields206533 -Node: Field Separators212462 -Node: Default Field Splitting215167 -Node: Regexp Field Splitting216284 -Node: Single Character Fields219634 -Node: Command Line Field Separator220693 -Node: Full Line Fields223905 -Ref: Full Line Fields-Footnote-1225422 -Ref: Full Line Fields-Footnote-2225468 -Node: Field Splitting Summary225569 -Node: Constant Size227643 -Node: Splitting By Content232232 -Ref: Splitting By Content-Footnote-1236226 -Node: Multiple Line236389 -Ref: Multiple Line-Footnote-1242275 -Node: Getline242454 -Node: Plain Getline244666 -Node: Getline/Variable247306 -Node: Getline/File248454 -Node: Getline/Variable/File249838 -Ref: Getline/Variable/File-Footnote-1251441 -Node: 
Getline/Pipe251528 -Node: Getline/Variable/Pipe254211 -Node: Getline/Coprocess255342 -Node: Getline/Variable/Coprocess256594 -Node: Getline Notes257333 -Node: Getline Summary260125 -Ref: table-getline-variants260537 -Node: Read Timeout261366 -Ref: Read Timeout-Footnote-1265190 -Node: Command-line directories265248 -Node: Input Summary266153 -Node: Input Exercises269454 -Node: Printing270182 -Node: Print271959 -Node: Print Examples273416 -Node: Output Separators276195 -Node: OFMT278213 -Node: Printf279567 -Node: Basic Printf280352 -Node: Control Letters281922 -Node: Format Modifiers285905 -Node: Printf Examples291914 -Node: Redirection294400 -Node: Special FD301241 -Ref: Special FD-Footnote-1304401 -Node: Special Files304475 -Node: Other Inherited Files305092 -Node: Special Network306092 -Node: Special Caveats306954 -Node: Close Files And Pipes307905 -Ref: Close Files And Pipes-Footnote-1315087 -Ref: Close Files And Pipes-Footnote-2315235 -Node: Output Summary315385 -Node: Output Exercises316383 -Node: Expressions317063 -Node: Values318248 -Node: Constants318926 -Node: Scalar Constants319617 -Ref: Scalar Constants-Footnote-1320476 -Node: Nondecimal-numbers320726 -Node: Regexp Constants323744 -Node: Using Constant Regexps324269 -Node: Variables327412 -Node: Using Variables328067 -Node: Assignment Options329978 -Node: Conversion331853 -Node: Strings And Numbers332377 -Ref: Strings And Numbers-Footnote-1335442 -Node: Locale influences conversions335551 -Ref: table-locale-affects338298 -Node: All Operators338886 -Node: Arithmetic Ops339516 -Node: Concatenation342021 -Ref: Concatenation-Footnote-1344840 -Node: Assignment Ops344946 -Ref: table-assign-ops349925 -Node: Increment Ops351197 -Node: Truth Values and Conditions354635 -Node: Truth Values355720 -Node: Typing and Comparison356769 -Node: Variable Typing357579 -Node: Comparison Operators361232 -Ref: table-relational-ops361642 -Node: POSIX String Comparison365137 -Ref: POSIX String Comparison-Footnote-1366209 -Node: 
Boolean Ops366347 -Ref: Boolean Ops-Footnote-1370826 -Node: Conditional Exp370917 -Node: Function Calls372644 -Node: Precedence376524 -Node: Locales380185 -Node: Expressions Summary381817 -Node: Patterns and Actions384377 -Node: Pattern Overview385497 -Node: Regexp Patterns387176 -Node: Expression Patterns387719 -Node: Ranges391429 -Node: BEGIN/END394535 -Node: Using BEGIN/END395296 -Ref: Using BEGIN/END-Footnote-1398030 -Node: I/O And BEGIN/END398136 -Node: BEGINFILE/ENDFILE400450 -Node: Empty403351 -Node: Using Shell Variables403668 -Node: Action Overview405941 -Node: Statements408267 -Node: If Statement410115 -Node: While Statement411610 -Node: Do Statement413639 -Node: For Statement414783 -Node: Switch Statement417940 -Node: Break Statement420322 -Node: Continue Statement422363 -Node: Next Statement424190 -Node: Nextfile Statement426571 -Node: Exit Statement429201 -Node: Built-in Variables431604 -Node: User-modified432737 -Ref: User-modified-Footnote-1440418 -Node: Auto-set440480 -Ref: Auto-set-Footnote-1454172 -Ref: Auto-set-Footnote-2454377 -Node: ARGC and ARGV454433 -Node: Pattern Action Summary458651 -Node: Arrays461078 -Node: Array Basics462407 -Node: Array Intro463251 -Ref: figure-array-elements465215 -Ref: Array Intro-Footnote-1467741 -Node: Reference to Elements467869 -Node: Assigning Elements470321 -Node: Array Example470812 -Node: Scanning an Array472570 -Node: Controlling Scanning475586 -Ref: Controlling Scanning-Footnote-1480782 -Node: Numeric Array Subscripts481098 -Node: Uninitialized Subscripts483283 -Node: Delete484900 -Ref: Delete-Footnote-1487643 -Node: Multidimensional487700 -Node: Multiscanning490797 -Node: Arrays of Arrays492386 -Node: Arrays Summary497145 -Node: Functions499237 -Node: Built-in500136 -Node: Calling Built-in501214 -Node: Numeric Functions503205 -Ref: Numeric Functions-Footnote-1508024 -Ref: Numeric Functions-Footnote-2508381 -Ref: Numeric Functions-Footnote-3508429 -Node: String Functions508701 -Ref: String 
Functions-Footnote-1532176 -Ref: String Functions-Footnote-2532305 -Ref: String Functions-Footnote-3532553 -Node: Gory Details532640 -Ref: table-sub-escapes534421 -Ref: table-sub-proposed535941 -Ref: table-posix-sub537305 -Ref: table-gensub-escapes538841 -Ref: Gory Details-Footnote-1539673 -Node: I/O Functions539824 -Ref: I/O Functions-Footnote-1547042 -Node: Time Functions547189 -Ref: Time Functions-Footnote-1557677 -Ref: Time Functions-Footnote-2557745 -Ref: Time Functions-Footnote-3557903 -Ref: Time Functions-Footnote-4558014 -Ref: Time Functions-Footnote-5558126 -Ref: Time Functions-Footnote-6558353 -Node: Bitwise Functions558619 -Ref: table-bitwise-ops559181 -Ref: Bitwise Functions-Footnote-1563490 -Node: Type Functions563659 -Node: I18N Functions564810 -Node: User-defined566455 -Node: Definition Syntax567260 -Ref: Definition Syntax-Footnote-1572667 -Node: Function Example572738 -Ref: Function Example-Footnote-1575657 -Node: Function Caveats575679 -Node: Calling A Function576197 -Node: Variable Scope577155 -Node: Pass By Value/Reference580143 -Node: Return Statement583638 -Node: Dynamic Typing586619 -Node: Indirect Calls587548 -Ref: Indirect Calls-Footnote-1598850 -Node: Functions Summary598978 -Node: Library Functions601680 -Ref: Library Functions-Footnote-1605289 -Ref: Library Functions-Footnote-2605432 -Node: Library Names605603 -Ref: Library Names-Footnote-1609057 -Ref: Library Names-Footnote-2609280 -Node: General Functions609366 -Node: Strtonum Function610469 -Node: Assert Function613491 -Node: Round Function616815 -Node: Cliff Random Function618356 -Node: Ordinal Functions619372 -Ref: Ordinal Functions-Footnote-1622435 -Ref: Ordinal Functions-Footnote-2622687 -Node: Join Function622898 -Ref: Join Function-Footnote-1624667 -Node: Getlocaltime Function624867 -Node: Readfile Function628611 -Node: Shell Quoting630581 -Node: Data File Management631982 -Node: Filetrans Function632614 -Node: Rewind Function636670 -Node: File Checking638057 -Ref: File 
Checking-Footnote-1639389 -Node: Empty Files639590 -Node: Ignoring Assigns641569 -Node: Getopt Function643120 -Ref: Getopt Function-Footnote-1654582 -Node: Passwd Functions654782 -Ref: Passwd Functions-Footnote-1663619 -Node: Group Functions663707 -Ref: Group Functions-Footnote-1671601 -Node: Walking Arrays671814 -Node: Library Functions Summary673417 -Node: Library Exercises674818 -Node: Sample Programs676098 -Node: Running Examples676868 -Node: Clones677596 -Node: Cut Program678820 -Node: Egrep Program688539 -Ref: Egrep Program-Footnote-1696037 -Node: Id Program696147 -Node: Split Program699792 -Ref: Split Program-Footnote-1703240 -Node: Tee Program703368 -Node: Uniq Program706157 -Node: Wc Program713576 -Ref: Wc Program-Footnote-1717826 -Node: Miscellaneous Programs717920 -Node: Dupword Program719133 -Node: Alarm Program721164 -Node: Translate Program725968 -Ref: Translate Program-Footnote-1730533 -Node: Labels Program730803 -Ref: Labels Program-Footnote-1734154 -Node: Word Sorting734238 -Node: History Sorting738309 -Node: Extract Program740145 -Node: Simple Sed747670 -Node: Igawk Program750738 -Ref: Igawk Program-Footnote-1765062 -Ref: Igawk Program-Footnote-2765263 -Ref: Igawk Program-Footnote-3765385 -Node: Anagram Program765500 -Node: Signature Program768557 -Node: Programs Summary769804 -Node: Programs Exercises770997 -Ref: Programs Exercises-Footnote-1775128 -Node: Advanced Features775219 -Node: Nondecimal Data777167 -Node: Array Sorting778757 -Node: Controlling Array Traversal779454 -Ref: Controlling Array Traversal-Footnote-1787787 -Node: Array Sorting Functions787905 -Ref: Array Sorting Functions-Footnote-1791794 -Node: Two-way I/O791990 -Ref: Two-way I/O-Footnote-1796935 -Ref: Two-way I/O-Footnote-2797121 -Node: TCP/IP Networking797203 -Node: Profiling800076 -Node: Advanced Features Summary808353 -Node: Internationalization810286 -Node: I18N and L10N811766 -Node: Explaining gettext812452 -Ref: Explaining gettext-Footnote-1817477 -Ref: Explaining 
gettext-Footnote-2817661 -Node: Programmer i18n817826 -Ref: Programmer i18n-Footnote-1822692 -Node: Translator i18n822741 -Node: String Extraction823535 -Ref: String Extraction-Footnote-1824666 -Node: Printf Ordering824752 -Ref: Printf Ordering-Footnote-1827538 -Node: I18N Portability827602 -Ref: I18N Portability-Footnote-1830057 -Node: I18N Example830120 -Ref: I18N Example-Footnote-1832923 -Node: Gawk I18N832995 -Node: I18N Summary833633 -Node: Debugger834972 -Node: Debugging835994 -Node: Debugging Concepts836435 -Node: Debugging Terms838288 -Node: Awk Debugging840860 -Node: Sample Debugging Session841754 -Node: Debugger Invocation842274 -Node: Finding The Bug843658 -Node: List of Debugger Commands850133 -Node: Breakpoint Control851466 -Node: Debugger Execution Control855162 -Node: Viewing And Changing Data858526 -Node: Execution Stack861904 -Node: Debugger Info863541 -Node: Miscellaneous Debugger Commands867558 -Node: Readline Support872587 -Node: Limitations873479 -Node: Debugging Summary875593 -Node: Arbitrary Precision Arithmetic876761 -Node: Computer Arithmetic878177 -Ref: table-numeric-ranges881775 -Ref: Computer Arithmetic-Footnote-1882634 -Node: Math Definitions882691 -Ref: table-ieee-formats885979 -Ref: Math Definitions-Footnote-1886583 -Node: MPFR features886688 -Node: FP Math Caution888359 -Ref: FP Math Caution-Footnote-1889409 -Node: Inexactness of computations889778 -Node: Inexact representation890737 -Node: Comparing FP Values892094 -Node: Errors accumulate893176 -Node: Getting Accuracy894609 -Node: Try To Round897271 -Node: Setting precision898170 -Ref: table-predefined-precision-strings898854 -Node: Setting the rounding mode900643 -Ref: table-gawk-rounding-modes901007 -Ref: Setting the rounding mode-Footnote-1904462 -Node: Arbitrary Precision Integers904641 -Ref: Arbitrary Precision Integers-Footnote-1909540 -Node: POSIX Floating Point Problems909689 -Ref: POSIX Floating Point Problems-Footnote-1913562 -Node: Floating point summary913600 -Node: 
Dynamic Extensions915794 -Node: Extension Intro917346 -Node: Plugin License918612 -Node: Extension Mechanism Outline919409 -Ref: figure-load-extension919837 -Ref: figure-register-new-function921317 -Ref: figure-call-new-function922321 -Node: Extension API Description924307 -Node: Extension API Functions Introduction925757 -Node: General Data Types930581 -Ref: General Data Types-Footnote-1936320 -Node: Memory Allocation Functions936619 -Ref: Memory Allocation Functions-Footnote-1939458 -Node: Constructor Functions939554 -Node: Registration Functions941288 -Node: Extension Functions941973 -Node: Exit Callback Functions944270 -Node: Extension Version String945518 -Node: Input Parsers946183 -Node: Output Wrappers956062 -Node: Two-way processors960577 -Node: Printing Messages962781 -Ref: Printing Messages-Footnote-1963857 -Node: Updating `ERRNO'964009 -Node: Requesting Values964749 -Ref: table-value-types-returned965477 -Node: Accessing Parameters966434 -Node: Symbol Table Access967665 -Node: Symbol table by name968179 -Node: Symbol table by cookie970160 -Ref: Symbol table by cookie-Footnote-1974304 -Node: Cached values974367 -Ref: Cached values-Footnote-1977866 -Node: Array Manipulation977957 -Ref: Array Manipulation-Footnote-1979055 -Node: Array Data Types979092 -Ref: Array Data Types-Footnote-1981747 -Node: Array Functions981839 -Node: Flattening Arrays985693 -Node: Creating Arrays992585 -Node: Extension API Variables997356 -Node: Extension Versioning997992 -Node: Extension API Informational Variables999893 -Node: Extension API Boilerplate1000958 -Node: Finding Extensions1004767 -Node: Extension Example1005327 -Node: Internal File Description1006099 -Node: Internal File Ops1010166 -Ref: Internal File Ops-Footnote-11021836 -Node: Using Internal File Ops1021976 -Ref: Using Internal File Ops-Footnote-11024359 -Node: Extension Samples1024632 -Node: Extension Sample File Functions1026158 -Node: Extension Sample Fnmatch1033796 -Node: Extension Sample Fork1035287 -Node: 
Extension Sample Inplace1036502 -Node: Extension Sample Ord1038177 -Node: Extension Sample Readdir1039013 -Ref: table-readdir-file-types1039889 -Node: Extension Sample Revout1040700 -Node: Extension Sample Rev2way1041290 -Node: Extension Sample Read write array1042030 -Node: Extension Sample Readfile1043970 -Node: Extension Sample Time1045065 -Node: Extension Sample API Tests1046414 -Node: gawkextlib1046905 -Node: Extension summary1049563 -Node: Extension Exercises1053252 -Node: Language History1053974 -Node: V7/SVR3.11055630 -Node: SVR41057811 -Node: POSIX1059256 -Node: BTL1060645 -Node: POSIX/GNU1061379 -Node: Feature History1067003 -Node: Common Extensions1080101 -Node: Ranges and Locales1081425 -Ref: Ranges and Locales-Footnote-11086043 -Ref: Ranges and Locales-Footnote-21086070 -Ref: Ranges and Locales-Footnote-31086304 -Node: Contributors1086525 -Node: History summary1092066 -Node: Installation1093436 -Node: Gawk Distribution1094382 -Node: Getting1094866 -Node: Extracting1095689 -Node: Distribution contents1097324 -Node: Unix Installation1103389 -Node: Quick Installation1104072 -Node: Shell Startup Files1106483 -Node: Additional Configuration Options1107562 -Node: Configuration Philosophy1109301 -Node: Non-Unix Installation1111670 -Node: PC Installation1112128 -Node: PC Binary Installation1113447 -Node: PC Compiling1115295 -Ref: PC Compiling-Footnote-11118316 -Node: PC Testing1118425 -Node: PC Using1119601 -Node: Cygwin1123716 -Node: MSYS1124539 -Node: VMS Installation1125039 -Node: VMS Compilation1125831 -Ref: VMS Compilation-Footnote-11127053 -Node: VMS Dynamic Extensions1127111 -Node: VMS Installation Details1128795 -Node: VMS Running1131047 -Node: VMS GNV1133883 -Node: VMS Old Gawk1134617 -Node: Bugs1135087 -Node: Other Versions1138970 -Node: Installation summary1145398 -Node: Notes1146454 -Node: Compatibility Mode1147319 -Node: Additions1148101 -Node: Accessing The Source1149026 -Node: Adding Code1150462 -Node: New Ports1156627 -Node: Derived 
Files1161109 -Ref: Derived Files-Footnote-11166584 -Ref: Derived Files-Footnote-21166618 -Ref: Derived Files-Footnote-31167214 -Node: Future Extensions1167328 -Node: Implementation Limitations1167934 -Node: Extension Design1169182 -Node: Old Extension Problems1170336 -Ref: Old Extension Problems-Footnote-11171853 -Node: Extension New Mechanism Goals1171910 -Ref: Extension New Mechanism Goals-Footnote-11175270 -Node: Extension Other Design Decisions1175459 -Node: Extension Future Growth1177567 -Node: Old Extension Mechanism1178403 -Node: Notes summary1180165 -Node: Basic Concepts1181351 -Node: Basic High Level1182032 -Ref: figure-general-flow1182304 -Ref: figure-process-flow1182903 -Ref: Basic High Level-Footnote-11186132 -Node: Basic Data Typing1186317 -Node: Glossary1189645 -Node: Copying1214803 -Node: GNU Free Documentation License1252359 -Node: Index1277495 +Node: Foreword342291 +Node: Foreword446735 +Node: Preface48266 +Ref: Preface-Footnote-151137 +Ref: Preface-Footnote-251244 +Ref: Preface-Footnote-351477 +Node: History51619 +Node: Names53970 +Ref: Names-Footnote-155064 +Node: This Manual55210 +Ref: This Manual-Footnote-161710 +Node: Conventions61810 +Node: Manual History64147 +Ref: Manual History-Footnote-167140 +Ref: Manual History-Footnote-267181 +Node: How To Contribute67255 +Node: Acknowledgments68384 +Node: Getting Started73250 +Node: Running gawk75689 +Node: One-shot76879 +Node: Read Terminal78143 +Node: Long80174 +Node: Executable Scripts81687 +Ref: Executable Scripts-Footnote-184476 +Node: Comments84579 +Node: Quoting87061 +Node: DOS Quoting92579 +Node: Sample Data Files93254 +Node: Very Simple95849 +Node: Two Rules100748 +Node: More Complex102634 +Node: Statements/Lines105496 +Ref: Statements/Lines-Footnote-1109951 +Node: Other Features110216 +Node: When111152 +Ref: When-Footnote-1112906 +Node: Intro Summary112971 +Node: Invoking Gawk113855 +Node: Command Line115369 +Node: Options116167 +Ref: Options-Footnote-1131962 +Ref: Options-Footnote-2132191 
+Node: Other Arguments132216 +Node: Naming Standard Input135164 +Node: Environment Variables136257 +Node: AWKPATH Variable136815 +Ref: AWKPATH Variable-Footnote-1140222 +Ref: AWKPATH Variable-Footnote-2140267 +Node: AWKLIBPATH Variable140527 +Node: Other Environment Variables141783 +Node: Exit Status145414 +Node: Include Files146090 +Node: Loading Shared Libraries149679 +Node: Obsolete151106 +Node: Undocumented151798 +Node: Invoking Summary152065 +Node: Regexp153728 +Node: Regexp Usage155182 +Node: Escape Sequences157219 +Node: Regexp Operators163448 +Ref: Regexp Operators-Footnote-1170858 +Ref: Regexp Operators-Footnote-2171005 +Node: Bracket Expressions171103 +Ref: table-char-classes173118 +Node: Leftmost Longest176060 +Node: Computed Regexps177362 +Node: GNU Regexp Operators180791 +Node: Case-sensitivity184463 +Ref: Case-sensitivity-Footnote-1187348 +Ref: Case-sensitivity-Footnote-2187583 +Node: Regexp Summary187691 +Node: Reading Files189158 +Node: Records191251 +Node: awk split records191984 +Node: gawk split records196913 +Ref: gawk split records-Footnote-1201452 +Node: Fields201489 +Ref: Fields-Footnote-1204267 +Node: Nonconstant Fields204353 +Ref: Nonconstant Fields-Footnote-1206591 +Node: Changing Fields206794 +Node: Field Separators212725 +Node: Default Field Splitting215429 +Node: Regexp Field Splitting216546 +Node: Single Character Fields219896 +Node: Command Line Field Separator220955 +Node: Full Line Fields224172 +Ref: Full Line Fields-Footnote-1225693 +Ref: Full Line Fields-Footnote-2225739 +Node: Field Splitting Summary225840 +Node: Constant Size227914 +Node: Splitting By Content232493 +Ref: Splitting By Content-Footnote-1236458 +Node: Multiple Line236621 +Ref: Multiple Line-Footnote-1242502 +Node: Getline242681 +Node: Plain Getline244888 +Node: Getline/Variable247528 +Node: Getline/File248677 +Node: Getline/Variable/File250062 +Ref: Getline/Variable/File-Footnote-1251665 +Node: Getline/Pipe251752 +Node: Getline/Variable/Pipe254430 +Node: 
Getline/Coprocess255561 +Node: Getline/Variable/Coprocess256825 +Node: Getline Notes257564 +Node: Getline Summary260358 +Ref: table-getline-variants260770 +Node: Read Timeout261599 +Ref: Read Timeout-Footnote-1265436 +Node: Command-line directories265494 +Node: Input Summary266399 +Node: Input Exercises269784 +Node: Printing270512 +Node: Print272347 +Node: Print Examples273804 +Node: Output Separators276583 +Node: OFMT278601 +Node: Printf279956 +Node: Basic Printf280741 +Node: Control Letters282313 +Node: Format Modifiers286298 +Node: Printf Examples292304 +Node: Redirection294790 +Node: Special FD301628 +Ref: Special FD-Footnote-1304794 +Node: Special Files304868 +Node: Other Inherited Files305485 +Node: Special Network306485 +Node: Special Caveats307347 +Node: Close Files And Pipes308296 +Ref: Close Files And Pipes-Footnote-1315481 +Ref: Close Files And Pipes-Footnote-2315629 +Node: Nonfatal315779 +Node: Output Summary318104 +Node: Output Exercises319325 +Node: Expressions320005 +Node: Values321194 +Node: Constants321871 +Node: Scalar Constants322562 +Ref: Scalar Constants-Footnote-1323424 +Node: Nondecimal-numbers323674 +Node: Regexp Constants326684 +Node: Using Constant Regexps327210 +Node: Variables330373 +Node: Using Variables331030 +Node: Assignment Options332941 +Node: Conversion334816 +Node: Strings And Numbers335340 +Ref: Strings And Numbers-Footnote-1338405 +Node: Locale influences conversions338514 +Ref: table-locale-affects341260 +Node: All Operators341852 +Node: Arithmetic Ops342481 +Node: Concatenation344986 +Ref: Concatenation-Footnote-1347805 +Node: Assignment Ops347912 +Ref: table-assign-ops352891 +Node: Increment Ops354201 +Node: Truth Values and Conditions357632 +Node: Truth Values358715 +Node: Typing and Comparison359764 +Node: Variable Typing360580 +Node: Comparison Operators364247 +Ref: table-relational-ops364657 +Node: POSIX String Comparison368152 +Ref: POSIX String Comparison-Footnote-1369224 +Node: Boolean Ops369363 +Ref: Boolean 
Ops-Footnote-1373841 +Node: Conditional Exp373932 +Node: Function Calls375670 +Node: Precedence379550 +Node: Locales383210 +Node: Expressions Summary384842 +Node: Patterns and Actions387413 +Node: Pattern Overview388533 +Node: Regexp Patterns390212 +Node: Expression Patterns390755 +Node: Ranges394535 +Node: BEGIN/END397642 +Node: Using BEGIN/END398403 +Ref: Using BEGIN/END-Footnote-1401139 +Node: I/O And BEGIN/END401245 +Node: BEGINFILE/ENDFILE403560 +Node: Empty406457 +Node: Using Shell Variables406774 +Node: Action Overview409047 +Node: Statements411373 +Node: If Statement413221 +Node: While Statement414716 +Node: Do Statement416744 +Node: For Statement417892 +Node: Switch Statement421050 +Node: Break Statement423432 +Node: Continue Statement425525 +Node: Next Statement427352 +Node: Nextfile Statement429733 +Node: Exit Statement432361 +Node: Built-in Variables434772 +Node: User-modified435905 +Ref: User-modified-Footnote-1443539 +Node: Auto-set443601 +Ref: Auto-set-Footnote-1457310 +Ref: Auto-set-Footnote-2457515 +Node: ARGC and ARGV457571 +Node: Pattern Action Summary461789 +Node: Arrays464222 +Node: Array Basics465551 +Node: Array Intro466395 +Ref: figure-array-elements468332 +Ref: Array Intro-Footnote-1470955 +Node: Reference to Elements471083 +Node: Assigning Elements473545 +Node: Array Example474036 +Node: Scanning an Array475795 +Node: Controlling Scanning478818 +Ref: Controlling Scanning-Footnote-1484212 +Node: Numeric Array Subscripts484528 +Node: Uninitialized Subscripts486713 +Node: Delete488330 +Ref: Delete-Footnote-1491079 +Node: Multidimensional491136 +Node: Multiscanning494233 +Node: Arrays of Arrays495822 +Node: Arrays Summary500576 +Node: Functions502667 +Node: Built-in503706 +Node: Calling Built-in504784 +Node: Numeric Functions506779 +Ref: Numeric Functions-Footnote-1511597 +Ref: Numeric Functions-Footnote-2511954 +Ref: Numeric Functions-Footnote-3512002 +Node: String Functions512274 +Ref: String Functions-Footnote-1535775 +Ref: String 
Functions-Footnote-2535904 +Ref: String Functions-Footnote-3536152 +Node: Gory Details536239 +Ref: table-sub-escapes538020 +Ref: table-sub-proposed539535 +Ref: table-posix-sub540897 +Ref: table-gensub-escapes542434 +Ref: Gory Details-Footnote-1543267 +Node: I/O Functions543418 +Ref: I/O Functions-Footnote-1550654 +Node: Time Functions550801 +Ref: Time Functions-Footnote-1561310 +Ref: Time Functions-Footnote-2561378 +Ref: Time Functions-Footnote-3561536 +Ref: Time Functions-Footnote-4561647 +Ref: Time Functions-Footnote-5561759 +Ref: Time Functions-Footnote-6561986 +Node: Bitwise Functions562252 +Ref: table-bitwise-ops562814 +Ref: Bitwise Functions-Footnote-1567142 +Node: Type Functions567314 +Node: I18N Functions568466 +Node: User-defined570113 +Node: Definition Syntax570918 +Ref: Definition Syntax-Footnote-1576577 +Node: Function Example576648 +Ref: Function Example-Footnote-1579569 +Node: Function Caveats579591 +Node: Calling A Function580109 +Node: Variable Scope581067 +Node: Pass By Value/Reference584060 +Node: Return Statement587557 +Node: Dynamic Typing590536 +Node: Indirect Calls591465 +Ref: Indirect Calls-Footnote-1601330 +Node: Functions Summary601458 +Node: Library Functions604160 +Ref: Library Functions-Footnote-1607768 +Ref: Library Functions-Footnote-2607911 +Node: Library Names608082 +Ref: Library Names-Footnote-1611540 +Ref: Library Names-Footnote-2611763 +Node: General Functions611849 +Node: Strtonum Function612952 +Node: Assert Function615974 +Node: Round Function619298 +Node: Cliff Random Function620839 +Node: Ordinal Functions621855 +Ref: Ordinal Functions-Footnote-1624918 +Ref: Ordinal Functions-Footnote-2625170 +Node: Join Function625381 +Ref: Join Function-Footnote-1627151 +Node: Getlocaltime Function627351 +Node: Readfile Function631095 +Node: Shell Quoting633067 +Node: Data File Management634468 +Node: Filetrans Function635100 +Node: Rewind Function639196 +Node: File Checking640582 +Ref: File Checking-Footnote-1641915 +Node: Empty 
Files642116 +Node: Ignoring Assigns644095 +Node: Getopt Function645645 +Ref: Getopt Function-Footnote-1657109 +Node: Passwd Functions657309 +Ref: Passwd Functions-Footnote-1666149 +Node: Group Functions666237 +Ref: Group Functions-Footnote-1674134 +Node: Walking Arrays674339 +Node: Library Functions Summary677345 +Node: Library Exercises678747 +Node: Sample Programs680027 +Node: Running Examples680797 +Node: Clones681525 +Node: Cut Program682749 +Node: Egrep Program692469 +Ref: Egrep Program-Footnote-1699972 +Node: Id Program700082 +Node: Split Program703758 +Ref: Split Program-Footnote-1707212 +Node: Tee Program707340 +Node: Uniq Program710129 +Node: Wc Program717548 +Ref: Wc Program-Footnote-1721798 +Node: Miscellaneous Programs721892 +Node: Dupword Program723105 +Node: Alarm Program725136 +Node: Translate Program729941 +Ref: Translate Program-Footnote-1734504 +Node: Labels Program734774 +Ref: Labels Program-Footnote-1738125 +Node: Word Sorting738209 +Node: History Sorting742279 +Node: Extract Program744114 +Node: Simple Sed751638 +Node: Igawk Program754708 +Ref: Igawk Program-Footnote-1769034 +Ref: Igawk Program-Footnote-2769235 +Ref: Igawk Program-Footnote-3769357 +Node: Anagram Program769472 +Node: Signature Program772533 +Node: Programs Summary773780 +Node: Programs Exercises775001 +Ref: Programs Exercises-Footnote-1779132 +Node: Advanced Features779223 +Node: Nondecimal Data781205 +Node: Array Sorting782795 +Node: Controlling Array Traversal783495 +Ref: Controlling Array Traversal-Footnote-1791861 +Node: Array Sorting Functions791979 +Ref: Array Sorting Functions-Footnote-1795865 +Node: Two-way I/O796061 +Ref: Two-way I/O-Footnote-1801006 +Ref: Two-way I/O-Footnote-2801192 +Node: TCP/IP Networking801274 +Node: Profiling804146 +Node: Advanced Features Summary812417 +Node: Internationalization814350 +Node: I18N and L10N815830 +Node: Explaining gettext816516 +Ref: Explaining gettext-Footnote-1821541 +Ref: Explaining gettext-Footnote-2821725 +Node: Programmer 
i18n821890 +Ref: Programmer i18n-Footnote-1826766 +Node: Translator i18n826815 +Node: String Extraction827609 +Ref: String Extraction-Footnote-1828740 +Node: Printf Ordering828826 +Ref: Printf Ordering-Footnote-1831612 +Node: I18N Portability831676 +Ref: I18N Portability-Footnote-1834132 +Node: I18N Example834195 +Ref: I18N Example-Footnote-1836998 +Node: Gawk I18N837070 +Node: I18N Summary837714 +Node: Debugger839054 +Node: Debugging840076 +Node: Debugging Concepts840517 +Node: Debugging Terms842327 +Node: Awk Debugging844899 +Node: Sample Debugging Session845805 +Node: Debugger Invocation846339 +Node: Finding The Bug847724 +Node: List of Debugger Commands854203 +Node: Breakpoint Control855535 +Node: Debugger Execution Control859212 +Node: Viewing And Changing Data862571 +Node: Execution Stack865947 +Node: Debugger Info867582 +Node: Miscellaneous Debugger Commands871627 +Node: Readline Support876628 +Node: Limitations877522 +Node: Debugging Summary879637 +Node: Arbitrary Precision Arithmetic880811 +Node: Computer Arithmetic882227 +Ref: table-numeric-ranges885804 +Ref: Computer Arithmetic-Footnote-1886328 +Node: Math Definitions886385 +Ref: table-ieee-formats889680 +Ref: Math Definitions-Footnote-1890284 +Node: MPFR features890389 +Node: FP Math Caution892060 +Ref: FP Math Caution-Footnote-1893110 +Node: Inexactness of computations893479 +Node: Inexact representation894438 +Node: Comparing FP Values895796 +Node: Errors accumulate896878 +Node: Getting Accuracy898310 +Node: Try To Round901014 +Node: Setting precision901913 +Ref: table-predefined-precision-strings902597 +Node: Setting the rounding mode904426 +Ref: table-gawk-rounding-modes904790 +Ref: Setting the rounding mode-Footnote-1908242 +Node: Arbitrary Precision Integers908421 +Ref: Arbitrary Precision Integers-Footnote-1913319 +Node: POSIX Floating Point Problems913468 +Ref: POSIX Floating Point Problems-Footnote-1917347 +Node: Floating point summary917385 +Node: Dynamic Extensions919572 +Node: Extension 
Intro921124 +Node: Plugin License922389 +Node: Extension Mechanism Outline923186 +Ref: figure-load-extension923614 +Ref: figure-register-new-function925094 +Ref: figure-call-new-function926098 +Node: Extension API Description928085 +Node: Extension API Functions Introduction929535 +Node: General Data Types934356 +Ref: General Data Types-Footnote-1940256 +Node: Memory Allocation Functions940555 +Ref: Memory Allocation Functions-Footnote-1943394 +Node: Constructor Functions943493 +Node: Registration Functions945232 +Node: Extension Functions945917 +Node: Exit Callback Functions948214 +Node: Extension Version String949462 +Node: Input Parsers950125 +Node: Output Wrappers960000 +Node: Two-way processors964513 +Node: Printing Messages966776 +Ref: Printing Messages-Footnote-1967852 +Node: Updating `ERRNO'968004 +Node: Requesting Values968744 +Ref: table-value-types-returned969471 +Node: Accessing Parameters970428 +Node: Symbol Table Access971662 +Node: Symbol table by name972176 +Node: Symbol table by cookie974196 +Ref: Symbol table by cookie-Footnote-1978341 +Node: Cached values978404 +Ref: Cached values-Footnote-1981900 +Node: Array Manipulation981991 +Ref: Array Manipulation-Footnote-1983089 +Node: Array Data Types983126 +Ref: Array Data Types-Footnote-1985781 +Node: Array Functions985873 +Node: Flattening Arrays989732 +Node: Creating Arrays996634 +Node: Extension API Variables1001405 +Node: Extension Versioning1002041 +Node: Extension API Informational Variables1003932 +Node: Extension API Boilerplate1004997 +Node: Finding Extensions1008806 +Node: Extension Example1009366 +Node: Internal File Description1010138 +Node: Internal File Ops1014205 +Ref: Internal File Ops-Footnote-11025956 +Node: Using Internal File Ops1026096 +Ref: Using Internal File Ops-Footnote-11028479 +Node: Extension Samples1028752 +Node: Extension Sample File Functions1030280 +Node: Extension Sample Fnmatch1037961 +Node: Extension Sample Fork1039449 +Node: Extension Sample Inplace1040664 +Node: 
Extension Sample Ord1042340 +Node: Extension Sample Readdir1043176 +Ref: table-readdir-file-types1044053 +Node: Extension Sample Revout1044864 +Node: Extension Sample Rev2way1045453 +Node: Extension Sample Read write array1046193 +Node: Extension Sample Readfile1048133 +Node: Extension Sample Time1049228 +Node: Extension Sample API Tests1050576 +Node: gawkextlib1051067 +Node: Extension summary1053745 +Node: Extension Exercises1057434 +Node: Language History1058156 +Node: V7/SVR3.11059812 +Node: SVR41061965 +Node: POSIX1063399 +Node: BTL1064780 +Node: POSIX/GNU1065511 +Node: Feature History1071347 +Node: Common Extensions1085141 +Node: Ranges and Locales1086513 +Ref: Ranges and Locales-Footnote-11091132 +Ref: Ranges and Locales-Footnote-21091159 +Ref: Ranges and Locales-Footnote-31091394 +Node: Contributors1091615 +Node: History summary1097155 +Node: Installation1098534 +Node: Gawk Distribution1099480 +Node: Getting1099964 +Node: Extracting1100787 +Node: Distribution contents1102424 +Node: Unix Installation1108526 +Node: Quick Installation1109209 +Node: Shell Startup Files1111620 +Node: Additional Configuration Options1112699 +Node: Configuration Philosophy1114503 +Node: Non-Unix Installation1116872 +Node: PC Installation1117330 +Node: PC Binary Installation1118650 +Node: PC Compiling1120498 +Ref: PC Compiling-Footnote-11123519 +Node: PC Testing1123628 +Node: PC Using1124804 +Node: Cygwin1128919 +Node: MSYS1129689 +Node: VMS Installation1130190 +Node: VMS Compilation1130982 +Ref: VMS Compilation-Footnote-11132211 +Node: VMS Dynamic Extensions1132269 +Node: VMS Installation Details1133953 +Node: VMS Running1136204 +Node: VMS GNV1139044 +Node: VMS Old Gawk1139779 +Node: Bugs1140249 +Node: Other Versions1144138 +Node: Installation summary1150572 +Node: Notes1151631 +Node: Compatibility Mode1152496 +Node: Additions1153278 +Node: Accessing The Source1154203 +Node: Adding Code1155638 +Node: New Ports1161795 +Node: Derived Files1166277 +Ref: Derived Files-Footnote-11171752 
+Ref: Derived Files-Footnote-21171786 +Ref: Derived Files-Footnote-31172382 +Node: Future Extensions1172496 +Node: Implementation Limitations1173102 +Node: Extension Design1174350 +Node: Old Extension Problems1175504 +Ref: Old Extension Problems-Footnote-11177021 +Node: Extension New Mechanism Goals1177078 +Ref: Extension New Mechanism Goals-Footnote-11180438 +Node: Extension Other Design Decisions1180627 +Node: Extension Future Growth1182735 +Node: Old Extension Mechanism1183571 +Node: Notes summary1185333 +Node: Basic Concepts1186519 +Node: Basic High Level1187200 +Ref: figure-general-flow1187472 +Ref: figure-process-flow1188071 +Ref: Basic High Level-Footnote-11191300 +Node: Basic Data Typing1191485 +Node: Glossary1194813 +Node: Copying1226742 +Node: GNU Free Documentation License1264298 +Node: Index1289434 End Tag Table