Diffstat (limited to 'gawk-info-3')
-rw-r--r-- | gawk-info-3 | 1385 |
1 files changed, 0 insertions, 1385 deletions
diff --git a/gawk-info-3 b/gawk-info-3 deleted file mode 100644 index b333f57c..00000000 --- a/gawk-info-3 +++ /dev/null @@ -1,1385 +0,0 @@ -Info file gawk-info, produced by Makeinfo, -*- Text -*- from input -file gawk.texinfo. - -This file documents `awk', a program that you can use to select -particular records in a file and perform operations upon them. - -Copyright (C) 1989 Free Software Foundation, Inc. - -Permission is granted to make and distribute verbatim copies of this -manual provided the copyright notice and this permission notice are -preserved on all copies. - -Permission is granted to copy and distribute modified versions of -this manual under the conditions for verbatim copying, provided that -the entire resulting derived work is distributed under the terms of a -permission notice identical to this one. - -Permission is granted to copy and distribute translations of this -manual into another language, under the above conditions for modified -versions, except that this permission notice may be stated in a -translation approved by the Foundation. - - - -File: gawk-info, Node: Patterns, Next: Actions, Prev: One-liners, Up: Top - -Patterns -******** - -Patterns control the execution of rules: a rule is executed when its -pattern matches the input record. The `awk' language provides -several special patterns that are described in the sections that -follow. Patterns include: - -NULL - The empty pattern, which matches every input record. (*Note The - Empty Pattern: Empty.) - -/REGULAR EXPRESSION/ - A regular expression as a pattern. It matches when the text of - the input record fits the regular expression. (*Note Regular - Expressions as Patterns: Regexp.) - -CONDEXP - A single comparison expression. It matches when it is true. - (*Note Comparison Expressions as Patterns: Comparison Patterns.) - -`BEGIN' -`END' - Special patterns to supply start--up or clean--up information to - `awk'. (*Note Specifying Record Ranges With Patterns: BEGIN/END.) 
- -PAT1, PAT2 - A pair of patterns separated by a comma, specifying a range of - records. (*Note Specifying Record Ranges With Patterns: Ranges.) - -CONDEXP1 BOOLEAN CONDEXP2 - A "compound" pattern, which combines expressions with the - operators `and', `&&', and `or', `||'. (*Note Boolean - Operators and Patterns: Boolean.) - -! CONDEXP - The pattern CONDEXP is evaluated. Then the `!' performs a - boolean ``not'' or logical negation operation; if the input line - matches the pattern in CONDEXP then the associated action is - *not* executed. If the input line did not match that pattern, - then the action *is* executed. (*Note Boolean Operators and - Patterns: Boolean.) - -(EXPR) - Parentheses may be used to control how operators nest. - -PAT1 ? PAT2 : PAT3 - The first pattern is evaluated. If it is true, the input line - is tested against the second pattern, otherwise it is tested - against the third. (*Note Conditional Patterns: Conditional - Patterns.) - -* Menu: - -The following subsections describe these forms in detail: - -* Empty:: The empty pattern, which matches every record. - -* Regexp:: Regular expressions such as `/foo/'. - -* Comparison Patterns:: Comparison expressions such as `$1 > 10'. - -* Boolean:: Combining comparison expressions. - -* Ranges:: Using pairs of patterns to specify record ranges. - -* BEGIN/END:: Specifying initialization and cleanup rules. - -* Conditional Patterns:: Patterns such as `pat1 ? pat2 : pat3'. - - - -File: gawk-info, Node: Empty, Next: Regexp, Up: Patterns - -The Empty Pattern -================= - -An empty pattern is considered to match *every* input record. For -example, the program: - - awk '{ print $1 }' BBS-list - -prints just the first field of every record. - - - -File: gawk-info, Node: Regexp, Next: Comparison Patterns, Prev: Empty, Up: Patterns - -Regular Expressions as Patterns -=============================== - -A "regular expression", or "regexp", is a way of describing classes -of strings. 
When enclosed in slashes (`/'), it makes an `awk' -pattern that matches every input record that contains a match for the -regexp. - -The simplest regular expression is a sequence of letters, numbers, or -both. Such a regexp matches any string that contains that sequence. -Thus, the regexp `foo' matches any string containing `foo'. (More -complicated regexps let you specify classes of similar strings.) - -* Menu: - -* Usage: Regexp Usage. How regexps are used in patterns. -* Operators: Regexp Operators. How to write a regexp. - - - -File: gawk-info, Node: Regexp Usage, Next: Regexp Operators, Up: Regexp - -How to use Regular Expressions ------------------------------- - -When you enclose `foo' in slashes, you get a pattern that matches a -record that contains `foo'. For example, this prints the second -field of each record that contains `foo' anywhere: - - awk '/foo/ { print $2 }' BBS-list - -Regular expressions can also be used in comparison expressions. Then -you can specify the string to match against; it need not be the -entire current input record. These comparison expressions can be -used as patterns or in `if' and `while' statements. - -`EXP ~ /REGEXP/' - This is true if the expression EXP (taken as a character string) - is matched by REGEXP. The following example matches, or - selects, all input records with the letter `J' in the first field: - - awk '$1 ~ /J/' inventory-shipped - - So does this: - - awk '{ if ($1 ~ /J/) print }' inventory-shipped - -`EXP !~ /REGEXP/' - This is true if the expression EXP (taken as a character string) - is *not* matched by REGEXP. The following example matches, or - selects, all input records whose first field *does not* contain - the letter `J': - - awk '$1 !~ /J/' inventory-shipped - -The right hand side of a `~' or `!~' operator need not be a constant -regexp (i.e. a string of characters between `/'s). It can also be -"computed", or "dynamic". 
For example: - - identifier = "[A-Za-z_][A-Za-z_0-9]+" - $0 ~ identifier - -sets `identifier' to a regexp that describes `awk' variable names, -and tests if the input record matches this regexp. - -A dynamic regexp may actually be any expression. The expression is -evaluated, and the result is treated as a string that describes a -regular expression. - - - -File: gawk-info, Node: Regexp Operators, Prev: Regexp Usage, Up: Regexp - -Regular Expression Operators ----------------------------- - -You can combine regular expressions with the following characters, -called "regular expression operators", or "metacharacters", to -increase the power and versatility of regular expressions. This is a -table of metacharacters: - -`\' - This is used to suppress the special meaning of a character when - matching. For example: - - \$ - - matches the character `$'. - -`^' - This matches the beginning of the string or the beginning of a - line within the string. For example: - - ^@chapter - - matches the `@chapter' at the beginning of a string, and can be - used to identify chapter beginnings in Texinfo source files. - -`$' - This is similar to `^', but it matches only at the end of a - string or the end of a line within the string. For example: - - /p$/ - - as a pattern matches a record that ends with a `p'. - -`.' - This matches any single character except a newline. For example: - - .P - - matches any single character followed by a `P' in a string. - Using concatenation we can make regular expressions like `U.A', - which matches any three--character string that begins with `U' - and ends with `A'. - -`[...]' - This is called a "character set". It matches any one of a group - of characters that are enclosed in the square brackets. For - example: - - [MVX] - - matches any of the characters `M', `V', or `X' in a string. - - Ranges of characters are indicated by using a hyphen between the - beginning and ending characters, and enclosing the whole thing - in brackets. 
For example: - - [0-9] - - matches any string that contains a digit. - - Note that special patterns have to be followed to match the - characters, `]', `-', and `^' when they are enclosed in the - square brackets. To match a `]', make it the first character in - the set. For example: - - []d] - - matches either `]', or `d'. - - To match `-', write it as `--', which is a range containing only - `-'. You may also make the `-' be the first or last character - in the set. To match `^', make it any character except the - first one of a set. - -`[^ ...]' - This is the "complemented character set". The first character - after the `[' *must* be a `^'. This matches any characters - *except* those in the square brackets. For example: - - [^0-9] - - matches any characters that are not digits. - -`|' - This is the "alternation operator" and it is used to specify - alternatives. For example: - - ^P|[0-9] - - matches any string that matches either `^P' or `[0-9]'. This - means it matches any string that contains a digit or starts with - `P'. - -`(...)' - Parentheses are used for grouping in regular expressions as in - arithmetic. They can be used to concatenate regular expressions - containing the alternation operator, `|'. - -`*' - This symbol means that the preceding regular expression is to be - repeated as many times as possible to find a match. For example: - - ph* - - applies the `*' symbol to the preceding `h' and looks for - matches to one `p' followed by any number of `h''s. This will - also match just `p' if no `h''s are present. - - The `*' means repeat the *smallest* possible preceding - expression in order to find a match. The `awk' language - processes a `*' by matching as many repetitions as can be found. - For example: - - awk '/\(c[ad][ad]*r x\)/ { print }' sample - - matches every record in the input containing a string of the - form `(car x)', `(cdr x)', `(cadr x)', and so on. 
- -`+' - This symbol is similar to `*', but the preceding expression must - be matched at least once. This means that: - - wh+y - - would match `why' and `whhy' but not `wy', whereas `wh*y' would - match all three of these strings. And this is a simpler way of - writing the last `*' example: - - awk '/\(c[ad]+r x\)/ { print }' sample - -`?' - This symbol is similar to `*', but the preceding expression can - be matched once or not at all. For example: - - fe?d - - will match `fed' or `fd', but nothing else. - -In regular expressions, the `*', `+', and `?' operators have the -highest precedence, followed by concatenation, and finally by `|'. -As in arithmetic, parentheses can change how operators are grouped. - -Any other character stands for itself. However, it is important to -note that case in regular expressions *is* significant, both when -matching ordinary (i.e. non--metacharacter) characters, and inside -character sets. Thus a `w' in a regular expression matches only a -lower case `w' and not either an uppercase or lowercase `w'. When -you want to do a case--independent match, you have to use a character -set: `[Ww]'. - - - -File: gawk-info, Node: Comparison Patterns, Next: Ranges, Prev: Regexp, Up: Patterns - -Comparison Expressions as Patterns -================================== - -"Comparison patterns" use "relational operators" to compare strings -or numbers. The relational operators are the same as in C. Here is -a table of them: - -`X < Y' - True if X is less than Y. - -`X <= Y' - True if X is less than or equal to Y. - -`X > Y' - True if X is greater than Y. - -`X >= Y' - True if X is greater than or equal to Y. - -`X == Y' - True if X is equal to Y. - -`X != Y' - True if X is not equal to Y. - -Comparison expressions can be used as patterns to control whether a -rule is executed. The expression is evaluated for each input record -read, and the pattern is considered matched if the condition is "true". 
- -The operands of a relational operator are compared as numbers if they -are both numbers. Otherwise they are converted to, and compared as, -strings (*note Conversion::.). Strings are compared by comparing the -first character of each, then the second character of each, and so on. -Thus, `"10"' is less than `"9"'. - -The following example prints the second field of each input record -whose first field is precisely `foo'. - - awk '$1 == "foo" { print $2 }' BBS-list - -Contrast this with the following regular expression match, which -would accept any record with a first field that contains `foo': - - awk '$1 ~ "foo" { print $2 }' BBS-list - - - -File: gawk-info, Node: Ranges, Next: BEGIN/END, Prev: Comparison Patterns, Up: Patterns - -Specifying Record Ranges With Patterns -====================================== - -A "range pattern" is made of two patterns separated by a comma: -`BEGPAT, ENDPAT'. It matches ranges of consecutive input records. -The first pattern BEGPAT controls where the range begins, and the -second one ENDPAT controls where it ends. - -They work as follows: BEGPAT is matched against every input record; -when a record matches BEGPAT, the range pattern becomes "turned on". -The range pattern matches this record. As long as it stays turned -on, it automatically matches every input record read. But meanwhile, -ENDPAT is matched against every input record, and when it matches, -the range pattern is turned off again for the following record. Now -we go back to checking BEGPAT against each record. For example: - - awk '$1 == "on", $1 == "off"' - -prints every record between on/off pairs, inclusive. - -The record that turns on the range pattern and the one that turns it -off both match the range pattern. If you don't want to operate on -these records, you can write `if' statements in the rule's action to -distinguish them. - -It is possible for a pattern to be turned both on and off by the same -record, if both conditions are satisfied by that record. 
Then the -action is executed for just that record. - - - -File: gawk-info, Node: BEGIN/END, Next: Boolean, Prev: Ranges, Up: Patterns - -`BEGIN' and `END' Special Patterns -================================== - -`BEGIN' and `END' are special patterns. They are not used to match -input records. Rather, they are used for supplying start--up or -clean--up information to your `awk' script. A `BEGIN' rule is -executed, once, before the first input record has been read. An -`END' rule is executed, once, after all the input has been read. For -example: - - awk 'BEGIN { print "Analysis of ``foo'' program" } - /foo/ { ++foobar } - END { print "``foo'' appears " foobar " times." }' BBS-list - -This program finds out how many times the string `foo' appears in the -input file `BBS-list'. The `BEGIN' pattern prints out a title for -the report. There is no need to use the `BEGIN' pattern to -initialize the counter `foobar' to zero, as `awk' does this for us -automatically (*note Variables::.). The second rule increments the -variable `foobar' every time a record containing the pattern `foo' is -read. The last rule prints out the value of `foobar' at the end of -the run. - -The special patterns `BEGIN' and `END' do not combine with other -kinds of patterns. - -An `awk' program may have multiple `BEGIN' and/or `END' rules. The -contents of multiple `BEGIN' or `END' rules are treated as if they -had been enclosed in a single rule, in the order that the rules are -encountered in the `awk' program. (This feature was introduced with -the new version of `awk'.) - -Multiple `BEGIN' and `END' sections are also useful for writing -library functions that need to do initialization and/or cleanup of -their own. Note that the order in which library functions are named -on the command line will affect the order in which their `BEGIN' and -`END' rules will be executed. Therefore you have to be careful how -you write your library functions. 
(*Note Command Line::, for more -information on using library functions.) - -If an `awk' program only has a `BEGIN' rule, and no other rules, then -the program will exit after the `BEGIN' rule has been run. Older -versions of `awk' used to read their input until end of file was -seen. However, if an `END' rule exists as well, then the input will -be read, even if there are no other rules in the program. - -`BEGIN' and `END' rules must have actions; there is no default action -for these rules since there is no current record when they run. - - - -File: gawk-info, Node: Boolean, Next: Conditional Patterns, Prev: BEGIN/END, Up: Patterns - -Boolean Operators and Patterns -============================== - -A boolean pattern is a combination of other patterns using the -boolean operators ``or'' (`||'), ``and'' (`&&'), and ``not'' (`!'), -along with parentheses to control nesting. Whether the boolean -pattern matches an input record is computed from whether its -subpatterns match. - -The subpatterns of a boolean pattern can be regular expressions, -matching expressions, comparisons, or other boolean combinations of -such. Range patterns cannot appear inside boolean operators, since -they don't make sense for classifying a single record, and neither -can the special patterns `BEGIN' and `END', which never match any -input record. - -Here are descriptions of the three boolean operators. - -`PAT1 && PAT2' - Matches if both PAT1 and PAT2 match by themselves. For example, - the following command prints all records in the input file - `BBS-list' that contain both `2400' and `foo'. - - awk '/2400/ && /foo/' BBS-list - - Whether PAT2 matches is tested only if PAT1 succeeds. This can - make a difference when PAT2 contains expressions that have side - effects: in the case of `/foo/ && ($2 == bar++)', the variable - `bar' is not incremented if there is no `foo' in the record. - -`PAT1 || PAT2' - Matches if at least one of PAT1 and PAT2 matches the current - input record. 
For example, the following command prints all - records in the input file `BBS-list' that contain *either* - `2400' or `foo', or both. - - awk '/2400/ || /foo/' BBS-list - - Whether PAT2 matches is tested only if PAT1 fails to match. - This can make a difference when PAT2 contains expressions that - have side effects. - -`!PAT' - Matches if PAT does not match. For example, the following - command prints all records in the input file `BBS-list' that do - *not* contain the string `foo'. - - awk '! /foo/' BBS-list - -Note that boolean patterns are built from other patterns just as -boolean expressions are built from other expressions (*note Boolean -Ops::.). Any boolean expression is also a valid boolean pattern. -But the converse is not true: simple regular expression patterns such -as `/foo/' are not allowed in boolean expressions. Regular -expressions can appear in boolean expressions only in conjunction -with the matching operators, `~' and `!~'. - - - -File: gawk-info, Node: Conditional Patterns, Prev: Boolean, Up: Patterns - -Conditional Patterns -==================== - -Patterns may use a "conditional expression" much like the conditional -expression of the C language. This takes the form: - - PAT1 ? PAT2 : PAT3 - -The first pattern is evaluated. If it evaluates to TRUE, then the -input record is tested against PAT2. Otherwise it is tested against -PAT3. The conditional pattern matches if PAT2 or PAT3 (whichever one -is selected) matches. - - - -File: gawk-info, Node: Actions, Next: Expressions, Prev: Patterns, Up: Top - -Actions: The Basics -******************* - -The "action" part of an `awk' rule tells `awk' what to do once a -match for the pattern is found. An action consists of one or more -`awk' "statements", enclosed in curly braces (`{' and `}'). The -curly braces must be used even if the action contains only one -statement, or even if it contains no statements at all. Action -statements are separated by newlines or semicolons. 
- -Besides the print statements already covered (*note Printing::.), -there are four kinds of action statements: expressions, control -statements, compound statements, and function definitions. - - * "Expressions" include assignments, arithmetic, function calls, - and more (*note Expressions::.). - - * "Control statements" specify the control flow of `awk' programs. - The `awk' language gives you C--like constructs (`if', `for', - `while', and so on) as well as a few special ones (*note - Statements::.). - - * A "compound statement" is just one or more `awk' statements - enclosed in curly braces. This way you can group several - statements to form the body of an `if' or similar statement. - - * You can define "user--defined functions" for use elsewhere in - the `awk' program (*note User-defined::.). - - - -File: gawk-info, Node: Expressions, Next: Statements, Prev: Actions, Up: Top - -Actions: Expressions -******************** - -Expressions are the basic building block of `awk' actions. An -expression evaluates to a value, which you can print, test, store in -a variable or pass to a function. - -But, beyond that, an expression can assign a new value to a variable -or a field, with an assignment operator. - -An expression can serve as a statement on its own. Most other action -statements are made up of various combinations of expressions. As in -other languages, expressions in `awk' include variables, array -references, constants, and function calls, as well as combinations of -these with various operators. - -* Menu: - -* Constants:: String and numeric constants. -* Variables:: Variables give names to values for future use. -* Fields:: Field references such as `$1' are also expressions. -* Arrays:: Array element references are expressions. - -* Arithmetic Ops:: Arithmetic operations (`+', `-', etc.) -* Concatenation:: Concatenating strings. -* Comparison Ops:: Comparison of numbers and strings with `<', etc. 
-* Boolean Ops:: Combining comparison expressions using boolean operators - `||' (``or''), `&&' (``and'') and `!' (``not''). - -* Assignment Ops:: Changing the value of a variable or a field. -* Increment Ops:: Incrementing the numeric value of a variable. - -* Conversion:: The conversion of strings to numbers and vice versa. -* Conditional Exp:: Conditional expressions select between two subexpressions - under control of a third subexpression. -* Function Calls:: A function call is an expression. - - - -File: gawk-info, Node: Constants, Next: Variables, Up: Expressions - -Constant Expressions -==================== - -There are two types of constants: numeric constants and string -constants. - -The "numeric constant" is a number. This number can be an integer, a -decimal fraction, or a number in scientific (exponential) notation. -Note that all numeric values are represented within `awk' in -double--precision floating point. Here are some examples of numeric -constants, which all have the same value: - - 105 - 1.05e+2 - 1050e-1 - -A string constant consists of a sequence of characters enclosed in -double--quote marks. For example: - - "parrot" - -represents the string constant `parrot'. Strings in `gawk' can be of -any length and they can contain all the possible 8--bit ASCII -characters including ASCII NUL. Other `awk' implementations may have -difficulty with some character codes. - -Some characters cannot be included literally in a string. You -represent them instead with "escape sequences", which are character -sequences beginning with a backslash (`\'). - -One use of the backslash is to include double--quote characters in a -string. Since a plain double--quote would end the string, you must -use `\"'. Backslash itself is another character that can't be -included normally; you write `\\' to put one backslash in the string. - -Another use of backslash is to represent unprintable characters such -as newline. 
While there is nothing to stop you from writing these -characters directly in an `awk' program, they may look ugly. - -`\b' - Represents a backspaced, H'. - -`\f' - Represents a formfeed, L'. - -`\n' - Represents a newline, J'. - -`\r' - Represents a carriage return, M'. - -`\t' - Represents a horizontal tab, I'. - -`\v' - Represents a vertical tab, K'. - -`\NNN' - Represents the octal value NNN, where NNN is one to three digits - between 0 and 7. For example, the code for the ASCII ESC - (escape) character is `\033'. - - - -File: gawk-info, Node: Variables, Next: Arithmetic Ops, Prev: Constants, Up: Expressions - -Variables -========= - -Variables let you give names to values and refer to them later. You -have already seen variables in many of the examples. The name of a -variable must be a sequence of letters, digits and underscores, but -it may not begin with a digit. Case is significant in variable -names; `a' and `A' are distinct variables. - -A variable name is a valid expression by itself; it represents the -variable's current value. Variables are given new values with -"assignment operators" and "increment operators". *Note Assignment -Ops::. - -A few variables have special built--in meanings, such as `FS', the -field separator, and `NF', the number of fields in the current input -record. *Note Special::, for a list of them. Special variables can -be used and assigned just like all other variables, but their values -are also used or changed automatically by `awk'. Each special -variable's name is made entirely of upper case letters. - -Variables in `awk' can be assigned either numeric values or string -values. By default, variables are initialized to the null string, -which has the numeric value zero. So there is no need to -``initialize'' each variable explicitly in `awk', the way you would -need to do in C or most other traditional programming languages. 
- - - -File: gawk-info, Node: Arithmetic Ops, Next: Concatenation, Prev: Variables, Up: Expressions - -Arithmetic Operators -==================== - -The `awk' language uses the common arithmetic operators when -evaluating expressions. All of these arithmetic operators follow -normal precedence rules, and work as you would expect them to. This -example divides field 3 by field 4, adds field 2, stores the result -into field 1, and prints the results: - - awk '{ $1 = $2 + $3 / $4; print }' inventory-shipped - -The arithmetic operators in `awk' are: - -`X + Y' - Addition. - -`X - Y' - Subtraction. - -`- X' - Negation. - -`X / Y' - Division. Since all numbers in `awk' are double--precision - floating point, the result is not rounded to an integer: `3 / 4' - has the value 0.75. - -`X * Y' - Multiplication. - -`X % Y' - Remainder. The quotient is rounded toward zero to an integer, - multiplied by Y and this result is subtracted from X. This - operation is sometimes known as ``trunc--mod''. The following - relation always holds: - - `b * int(a / b) + (a % b) == a' - - One undesirable effect of this definition of remainder is that X - % Y is negative if X is negative. Thus, - - -17 % 8 = -1 - -`X ^ Y' -`X ** Y' - Exponentiation: X raised to the Y power. `2 ^ 3' has the value - 8. The character sequence `**' is equivalent to `^'. - - - -File: gawk-info, Node: Concatenation, Next: Comparison Ops, Prev: Arithmetic Ops, Up: Expressions - -String Concatenation -==================== - -There is only one string operation: concatenation. It does not have -a specific operator to represent it. Instead, concatenation is -performed by writing expressions next to one another, with no -operator. For example: - - awk '{ print "Field number one: " $1 }' BBS-list - -produces, for the first record in `BBS-list': - - Field number one: aardvark - -If you hadn't put the space after the `:', the line would have run -together. 
For example: - - awk '{ print "Field number one:" $1 }' BBS-list - -produces, for the first record in `BBS-list': - - Field number one:aardvark - - - -File: gawk-info, Node: Comparison Ops, Next: Boolean Ops, Prev: Concatenation, Up: Expressions - -Comparison Expressions -====================== - -"Comparison expressions" use "relational operators" to compare -strings or numbers. The relational operators are the same as in C. -Here is a table of them: - -`X < Y' - True if X is less than Y. - -`X <= Y' - True if X is less than or equal to Y. - -`X > Y' - True if X is greater than Y. - -`X >= Y' - True if X is greater than or equal to Y. - -`X == Y' - True if X is equal to Y. - -`X != Y' - True if X is not equal to Y. - -`X ~ REGEXP' - True if regexp REGEXP matches the string X. - -`X !~ REGEXP' - True if regexp REGEXP does not match the string X. - -`SUBSCRIPT in ARRAY' - True if array ARRAY has an element with the subscript SUBSCRIPT. - -Comparison expressions have the value 1 if true and 0 if false. - -The operands of a relational operator are compared as numbers if they -are both numbers. Otherwise they are converted to, and compared as, -strings (*note Conversion::.). Strings are compared by comparing the -first character of each, then the second character of each, and so on. -Thus, `"10"' is less than `"9"'. - -For example, - - $1 == "foo" - -has the value of 1, or is true, if the first field of the current -input record is precisely `foo'. By contrast, - - $1 ~ /foo/ - -has the value 1 if the first field contains `foo'. - - - -File: gawk-info, Node: Boolean Ops, Next: Assignment Ops, Prev: Comparison Ops, Up: Expressions - -Boolean Operators -================= - -A boolean expression is combination of comparison expressions or -matching expressions, using the boolean operators ``or'' (`||'), -``and'' (`&&'), and ``not'' (`!'), along with parentheses to control -nesting. 
The truth of the boolean expression is computed by -combining the truth values of the component expressions. - -Boolean expressions can be used wherever comparison and matching -expressions can be used. They can be used in `if' and `while' -statements. They have numeric values (1 if true, 0 if false). - -In addition, every boolean expression is also a valid boolean -pattern, so you can use it as a pattern to control the execution of -rules. - -Here are descriptions of the three boolean operators, with an example -of each. It may be instructive to compare these examples with the -analogous examples of boolean patterns (*note Boolean::.), which use -the same boolean operators in patterns instead of expressions. - -`BOOLEAN1 && BOOLEAN2' - True if both BOOLEAN1 and BOOLEAN2 are true. For example, the - following statement prints the current input record if it - contains both `2400' and `foo'. - - if ($0 ~ /2400/ && $0 ~ /foo/) print - - The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is - true. This can make a difference when BOOLEAN2 contains - expressions that have side effects: in the case of `$0 ~ /foo/ - && ($2 == bar++)', the variable `bar' is not incremented if - there is no `foo' in the record. - -`BOOLEAN1 || BOOLEAN2' - True if at least one of BOOLEAN1 and BOOLEAN2 is true. For - example, the following command prints all records in the input - file `BBS-list' that contain *either* `2400' or `foo', or both. - - awk '{ if ($0 ~ /2400/ || $0 ~ /foo/) print }' BBS-list - - The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is - true. This can make a difference when BOOLEAN2 contains - expressions that have side effects. - -`!BOOLEAN' - True if BOOLEAN is false. For example, the following program - prints all records in the input file `BBS-list' that do *not* - contain the string `foo'. - - awk '{ if (! 
($0 ~ /foo/)) print }' BBS-list - - - -File: gawk-info, Node: Assignment Ops, Next: Increment Ops, Prev: Boolean Ops, Up: Expressions - -Assignment Operators -==================== - -An "assignment" is an expression that stores a new value into a -variable. For example, let's assign the value 1 to the variable `z': - - z = 1 - -After this expression is executed, the variable `z' has the value 1. -Whatever old value `z' had before the assignment is forgotten. - -The `=' sign is called an "assignment operator". It is the simplest -assignment operator because the value of the right--hand operand is -stored unchanged. - -The left--hand operand of an assignment can be a variable (*note -Variables::.), a field (*note Changing Fields::.) or an array element -(*note Arrays::.). These are all called "lvalues", which means they -can appear on the left side of an assignment operator. The -right--hand operand may be any expression; it produces the new value -which the assignment stores in the specified variable, field or array -element. - -Assignments can store string values also. For example, this would -store the value `"this food is good"' in the variable `message': - - thing = "food" - predicate = "good" - message = "this " thing " is " predicate - -(This also illustrates concatenation of strings.) - -It is important to note that variables do *not* have permanent types. -The type of a variable is simply the type of whatever value it -happens to hold at the moment. In the following program fragment, -the variable `foo' has a numeric value at first, and a string value -later on: - - foo = 1 - print foo - foo = "bar" - print foo - -When the second assignment gives `foo' a string value, the fact that -it previously had a numeric value is forgotten. - -An assignment is an expression, so it has a value: the same value -that is assigned. Thus, `z = 1' as an expression has the value 1. 
-One consequence of this is that you can write multiple assignments -together: - - x = y = z = 0 - -stores the value 0 in all three variables. It does this because the -value of `z = 0', which is 0, is stored into `y', and then the value -of `y = z = 0', which is 0, is stored into `x'. - -You can use an assignment anywhere an expression is called for. For -example, it is valid to write `x != (y = 1)' to set `y' to 1 and then -test whether `x' equals 1. But this style tends to make programs -hard to read; except in a one--shot program, you should rewrite it to -get rid of such nesting of assignments. This is never very hard. - -Aside from `=', there are several other assignment operators that do -arithmetic with the old value of the variable. For example, the -operator `+=' computes a new value by adding the right--hand value to -the old value of the variable. Thus, the following assignment adds 5 -to the value of `foo': - - foo += 5 - -This is precisely equivalent to the following: - - foo = foo + 5 - -Use whichever one makes the meaning of your program clearer. - -Here is a table of the arithmetic assignment operators. In each -case, the right--hand operand is an expression whose value is -converted to a number. - -`LVALUE += INCREMENT' - Adds INCREMENT to the value of LVALUE to make the new value of - LVALUE. - -`LVALUE -= DECREMENT' - Subtracts DECREMENT from the value of LVALUE. - -`LVALUE *= COEFFICIENT' - Multiplies the value of LVALUE by COEFFICIENT. - -`LVALUE /= QUOTIENT' - Divides the value of LVALUE by QUOTIENT. - -`LVALUE %= MODULUS' - Sets LVALUE to its remainder by MODULUS. - -`LVALUE ^= POWER' -`LVALUE **= POWER' - Raises LVALUE to the power POWER. - - - -File: gawk-info, Node: Increment Ops, Next: Conversion, Prev: Assignment Ops, Up: Expressions - -Increment Operators -=================== - -"Increment operators" increase or decrease the value of a variable by -1. 
You could do the same thing with an assignment operator, so the -increment operators add no power to the `awk' language; but they are -convenient abbreviations for something very common. - -The operator to add 1 is written `++'. There are two ways to use -this operator: pre--incrementation and post--incrementation. - -To pre--increment a variable V, write `++V'. This adds 1 to the -value of V and that new value is also the value of this expression. -The assignment expression `V += 1' is completely equivalent. - -Writing the `++' after the variable specifies post--increment. This -increments the variable value just the same; the difference is that -the value of the increment expression itself is the variable's *old* -value. Thus, if `foo' has value 4, then the expression `foo++' has -the value 4, but it changes the value of `foo' to 5. - -The post--increment `foo++' is nearly equivalent to writing `(foo += -1) - 1'. It is not perfectly equivalent because all numbers in `awk' -are floating point: in floating point, `foo + 1 - 1' does not -necessarily equal `foo'. But the difference will be minute as long -as you stick to numbers that are fairly small (less than a trillion). - -Any lvalue can be incremented. Fields and array elements are -incremented just like variables. - -The decrement operator `--' works just like `++' except that it -subtracts 1 instead of adding. Like `++', it can be used before the -lvalue to pre--decrement or after it to post--decrement. - -Here is a summary of increment and decrement expressions. - -`++LVALUE' - This expression increments LVALUE and the new value becomes the - value of this expression. - -`LVALUE++' - This expression causes the contents of LVALUE to be incremented. - The value of the expression is the *old* value of LVALUE. - -`--LVALUE' - Like `++LVALUE', but instead of adding, it subtracts. It - decrements LVALUE and delivers the value that results. - -`LVALUE--' - Like `LVALUE++', but instead of adding, it subtracts. 
It - decrements LVALUE. The value of the expression is the *old* - value of LVALUE. - - - -File: gawk-info, Node: Conversion, Next: Conditional Exp, Prev: Increment Ops, Up: Expressions - -Conversion of Strings and Numbers -================================= - -Strings are converted to numbers, and numbers to strings, if the -context of your `awk' statement demands it. For example, if the -values of `foo' or `bar' in the expression `foo + bar' happen to be -strings, they are converted to numbers before the addition is -performed. If numeric values appear in string concatenation, they -are converted to strings. Consider this: - - two = 2; three = 3 - print (two three) + 4 - -This eventually prints the (numeric) value `27'. The numeric -variables `two' and `three' are converted to strings and concatenated -together, and the resulting string is converted back to a number -before adding `4'. The resulting numeric value `27' is printed. - -If, for some reason, you need to force a number to be converted to a -string, concatenate the null string with that number. To force a -string to be converted to a number, add zero to that string. Strings -that can't be interpreted as valid numbers are given the numeric -value zero. - -The exact manner in which numbers are converted into strings is -controlled by the `awk' special variable `OFMT' (*note Special::.). -Numbers are converted using a special version of the `sprintf' -function (*note Built-in::.) with `OFMT' as the format specifier. - -`OFMT''s default value is `"%.6g"', which prints a value with at -most six significant digits. You might want to change it to specify -more precision, if your version of `awk' uses double precision -arithmetic. Double precision on most modern machines gives you 16 or -17 decimal digits of precision. - -Strange results can happen if you set `OFMT' to a string that doesn't -tell `sprintf' how to format floating point numbers in a useful way. 
-For example, if you forget the `%' in the format, all numbers will be -converted to the same constant string. - - - -File: gawk-info, Node: Conditional Exp, Next: Function Calls, Prev: Conversion, Up: Expressions - -Conditional Expressions -======================= - -A "conditional expression" is a special kind of expression with three -operands. It allows you to use one expression's value to select one -of two other expressions. - -The conditional expression looks the same as in the C language: - - SELECTOR ? IF-TRUE-EXP : IF-FALSE-EXP - -There are three subexpressions. The first, SELECTOR, is always -computed first. If it is ``true'' (not zero) then IF-TRUE-EXP is -computed next and its value becomes the value of the whole expression. -Otherwise, IF-FALSE-EXP is computed next and its value becomes the -value of the whole expression. - -For example, this expression produces the absolute value of `x': - - x > 0 ? x : -x - -Each time the conditional expression is computed, exactly one of -IF-TRUE-EXP and IF-FALSE-EXP is computed; the other is ignored. This -is important when the expressions contain side effects. For example, -this conditional expression examines element `i' of either array `a' -or array `b', and increments `i'. - - x == y ? a[i++] : b[i++] - -This is guaranteed to increment `i' exactly once, because each time -one or the other of the two increment expressions will be executed -and the other will not be. - - - -File: gawk-info, Node: Function Calls, Prev: Conditional Exp, Up: Expressions - -Function Calls -============== - -A "function" is a name for a particular calculation. Because it has -a name, you can ask for it by name at any point in the program. For -example, the function `sqrt' computes the square root of a number. - -A fixed set of functions are "built in", which means they are -available in every `awk' program. The `sqrt' function is one of -these. *Note Built-in::, for a list of built--in functions and their -descriptions. 
In addition, you can define your own functions in the -program for use elsewhere in the same program. *Note User-defined::, -for how to do this. - -The way to use a function is with a "function call" expression, which -consists of the function name followed by a list of "arguments" in -parentheses. The arguments are expressions which give the raw -materials for the calculation that the function will do. When there -is more than one argument, they are separated by commas. If there -are no arguments, write just `()' after the function name. - -*Do not put any space between the function name and the -open--parenthesis!* A user--defined function name looks just like -the name of a variable, and space would make the expression look like -concatenation of a variable with an expression inside parentheses. -Space before the parenthesis is harmless with built--in functions, -but it is best not to get into the habit of using space, lest you do -likewise for a user--defined function one day by mistake. - -Each function needs a particular number of arguments. For example, -the `sqrt' function must be called with a single argument, like this: - - sqrt(ARGUMENT) - -The argument is the number to take the square root of. - -Some of the built--in functions allow you to omit the final argument. -If you do so, they will use a reasonable default. *Note Built-in::, -for full details. If arguments are omitted in calls to user--defined -functions, then those arguments are treated as local variables, -initialized to the null string (*note User-defined::.). - -Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of `sqrt(ARGUMENT)' is the square root of the -argument. A function can also have side effects, such as assigning -the values of certain variables or doing I/O. 
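The calling rules above can be sketched as follows. This is a minimal illustration, not part of the original manual: the shell function name `function_call_demo' and the `awk' function name `greet' are hypothetical, and the sketch assumes an `awk' that supports user-defined functions.

```shell
# Sketch only: `function_call_demo' and `greet' are hypothetical names.
function_call_demo() {
    echo "abcdef" | awk '
        # suffix is omitted by the caller below, so it starts out
        # as the null string.
        function greet(name, suffix) {
            return "hello " name suffix
        }
        {
            print substr($0, 3)      # final argument omitted: rest of the string
            print substr($0, 3, 2)   # final argument given: two characters
            print sqrt(16)           # exactly one argument
            print greet("awk")       # second argument omitted
        }'
}
function_call_demo
```

Note that no space separates any function name from its open parenthesis, as the warning above recommends.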
- -Here is a command to read numbers, one number per line, and print the -square root of each one: - - awk '{ print "The square root of", $1, "is", sqrt($1) }' - - - -File: gawk-info, Node: Statements, Next: Arrays, Prev: Expressions, Up: Top - -Actions: Statements -******************* - -"Control statements" such as `if', `while', and so on control the -flow of execution in `awk' programs. Most of the control statements -in `awk' are patterned on similar statements in C. - -The simplest kind of statement is an expression. The other kinds of -statements start with special keywords such as `if' and `while', to -distinguish them from simple expressions. - -In all the examples in this chapter, BODY can be either a single -statement or a group of statements. Groups of statements are -enclosed in braces, and separated by newlines or semicolons. - -* Menu: - -* Expressions:: One kind of statement simply computes an expression. - -* If:: Conditionally execute some `awk' statements. - -* While:: Loop until some condition is satisfied. - -* Do:: Do specified action while looping until some - condition is satisfied. - -* For:: Another looping statement, that provides - initialization and increment clauses. - -* Break:: Immediately exit the innermost enclosing loop. - -* Continue:: Skip to the end of the innermost enclosing loop. - -* Next:: Stop processing the current input record. - -* Exit:: Stop execution of `awk'. - - - -File: gawk-info, Node: If, Next: While, Up: Statements - -The `if' Statement -================== - -The `if'-`else' statement is `awk''s decision--making statement. The -`else' part of the statement is optional. - - `if (CONDITION) BODY1 else BODY2' - -Here CONDITION is an expression that controls what the rest of the -statement will do. If CONDITION is true, BODY1 is executed; -otherwise, BODY2 is executed (assuming that the `else' clause is -present). The condition is considered true if it is nonzero or -nonnull. 
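The truth rule just stated (nonzero or nonnull means true) can be checked with a small sketch. It is an illustration under stated assumptions, not taken from the manual: the names `if_truth_demo' and `truth' are hypothetical, and only constants are tested so that the type of each value is unambiguous.

```shell
# Sketch only: `if_truth_demo' and `truth' are hypothetical names.
if_truth_demo() {
    awk '
        # Report how an if statement judges the value v.
        function truth(v) {
            if (v)
                return "true"
            else
                return "false"
        }
        BEGIN {
            # numeric zero, nonzero number, null string, nonnull string
            print truth(0), truth(3), truth(""), truth("foo")
        }'
}
if_truth_demo
```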
- -Here is an example: - - awk '{ if (x % 2 == 0) - print "x is even" - else - print "x is odd" }' - -In this example, if the condition `x % 2 == 0' is true -(that is, `x' is divisible by 2), then the first `print' statement is -executed; otherwise the second `print' statement is executed. - -If the `else' appears on the same line as BODY1, and BODY1 is a -single statement, then a semicolon must separate BODY1 from `else'. -To illustrate this, let's rewrite the previous example: - - awk '{ if (x % 2 == 0) print "x is even"; else - print "x is odd" }' - -If you forget the `;', `awk' won't be able to parse it, and you will -get a syntax error. - -We would not actually write this example this way, because a human -reader might fail to see the `else' if it were not the first thing on -its line. - - - -File: gawk-info, Node: While, Next: Do, Prev: If, Up: Statements - -The `while' Statement -===================== - -In programming, a loop means a part of a program that is (or at least -can be) executed two or more times in succession. - -The `while' statement is the simplest looping statement in `awk'. It -repeatedly executes a statement as long as a condition is true. It -looks like this: - - while (CONDITION) - BODY - -Here BODY is a statement that we call the "body" of the loop, and -CONDITION is an expression that controls how long the loop keeps -running. - -The first thing the `while' statement does is test CONDITION. If -CONDITION is true, it executes the statement BODY. After BODY has -been executed, CONDITION is tested again and this process is repeated -until CONDITION is no longer true. If CONDITION is initially false, -the body of the loop is never executed. - - awk '{ i = 1 - while (i <= 3) { - print $i - i++ - } - }' - -This example prints the first three input fields, one per line. - -The loop works like this: first, the value of `i' is set to 1. Then, -the `while' tests whether `i' is less than or equal to three. 
This -is the case when `i' equals one, so the `i'-th field is printed. -Then the `i++' increments the value of `i' and the loop repeats. - -When `i' reaches 4, the loop exits. Here BODY is a compound -statement enclosed in braces. As you can see, a newline is not -required between the condition and the body; but using one makes the -program clearer unless the body is a compound statement or is very -simple. - - - -File: gawk-info, Node: Do, Next: For, Prev: While, Up: Statements - -The `do'--`while' Statement -=========================== - -The `do' loop is a variation of the `while' looping statement. The -`do' loop executes the BODY once, then repeats BODY as long as -CONDITION is true. It looks like this: - - do - BODY - while (CONDITION) - -Even if CONDITION is false at the start, BODY is executed at least -once (and only once, unless executing BODY makes CONDITION true). -Contrast this with the corresponding `while' statement: - - while (CONDITION) - BODY - -This statement will not execute BODY even once if CONDITION is false -to begin with. - -Here is an example of a `do' statement: - - awk '{ i = 1 - do { - print $0 - i++ - } while (i <= 10) - }' - -prints each input record ten times. It isn't a very realistic -example, since in this case an ordinary `while' would do just as -well. But this is normal; there is only occasionally a real use for -a `do' statement. - -
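The difference between the two loops shows up only when CONDITION is false from the start. The following sketch is an illustration rather than part of the manual; the name `do_vs_while_demo' is hypothetical, and the starting value 5 is chosen only so that the condition `i < 3' fails at once.

```shell
# Sketch only: `do_vs_while_demo' is a hypothetical name.
do_vs_while_demo() {
    awk 'BEGIN {
        i = 5
        while (i < 3) {          # condition tested first: body never runs
            print "while body ran, i =", i
            i++
        }
        i = 5
        do {                     # body runs once before the first test
            print "do body ran, i =", i
            i++
        } while (i < 3)
    }'
}
do_vs_while_demo
```

Only the `do' loop prints a line, because its body runs before the condition is ever tested.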