Diffstat (limited to 'perl.man.1')
-rw-r--r-- | perl.man.1 | 1592 |
1 files changed, 0 insertions, 1592 deletions
diff --git a/perl.man.1 b/perl.man.1 deleted file mode 100644 index fdc606c215..0000000000 --- a/perl.man.1 +++ /dev/null @@ -1,1592 +0,0 @@ -.rn '' }` -''' $Header: perl_man.1,v 3.0.1.11 91/01/11 18:15:46 lwall Locked $ -''' -''' $Log: perl.man.1,v $ -''' Revision 3.0.1.11 91/01/11 18:15:46 lwall -''' patch42: added -0 option -''' -''' Revision 3.0.1.10 90/11/10 01:45:16 lwall -''' patch38: random cleanup -''' -''' Revision 3.0.1.9 90/10/20 02:14:24 lwall -''' patch37: fixed various typos in man page -''' -''' Revision 3.0.1.8 90/10/15 18:16:19 lwall -''' patch29: added DATA filehandle to read stuff after __END__ -''' patch29: added cmp and <=> -''' patch29: added -M, -A and -C -''' -''' Revision 3.0.1.7 90/08/09 04:24:03 lwall -''' patch19: added -x switch to extract script from input trash -''' patch19: Added -c switch to do compilation only -''' patch19: bare identifiers are now strings if no other interpretation possible -''' patch19: -s now returns size of file -''' patch19: Added __LINE__ and __FILE__ tokens -''' patch19: Added __END__ token -''' -''' Revision 3.0.1.6 90/08/03 11:14:44 lwall -''' patch19: Intermediate diffs for Randal -''' -''' Revision 3.0.1.5 90/03/27 16:14:37 lwall -''' patch16: .. 
now works using magical string increment -''' -''' Revision 3.0.1.4 90/03/12 16:44:33 lwall -''' patch13: (LIST,) now legal -''' patch13: improved LIST documentation -''' patch13: example of if-elsif switch was wrong -''' -''' Revision 3.0.1.3 90/02/28 17:54:32 lwall -''' patch9: @array in scalar context now returns length of array -''' patch9: in manual, example of open and ?: was backwards -''' -''' Revision 3.0.1.2 89/11/17 15:30:03 lwall -''' patch5: fixed some manual typos and indent problems -''' -''' Revision 3.0.1.1 89/11/11 04:41:22 lwall -''' patch2: explained about sh and ${1+"$@"} -''' patch2: documented that space must separate word and '' string -''' -''' Revision 3.0 89/10/18 15:21:29 lwall -''' 3.0 baseline -''' -''' -.de Sh -.br -.ne 5 -.PP -\fB\\$1\fR -.PP -.. -.de Sp -.if t .sp .5v -.if n .sp -.. -.de Ip -.br -.ie \\n(.$>=3 .ne \\$3 -.el .ne 3 -.IP "\\$1" \\$2 -.. -''' -''' Set up \*(-- to give an unbreakable dash; -''' string Tr holds user defined translation string. -''' Bell System Logo is used as a dummy character. -''' -.tr \(*W-|\(bv\*(Tr -.ie n \{\ -.ds -- \(*W- -.if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch -.if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch -.ds L" "" -.ds R" "" -.ds L' ' -.ds R' ' -'br\} -.el\{\ -.ds -- \(em\| -.tr \*(Tr -.ds L" `` -.ds R" '' -.ds L' ` -.ds R' ' -'br\} -.TH PERL 1 "\*(RP" -.UC -.SH NAME -perl \- Practical Extraction and Report Language -.SH SYNOPSIS -.B perl -[options] filename args -.SH DESCRIPTION -.I Perl -is an interpreted language optimized for scanning arbitrary text files, -extracting information from those text files, and printing reports based -on that information. -It's also a good language for many system management tasks. -The language is intended to be practical (easy to use, efficient, complete) -rather than beautiful (tiny, elegant, minimal). 
-It combines (in the author's opinion, anyway) some of the best features of C, -\fIsed\fR, \fIawk\fR, and \fIsh\fR, -so people familiar with those languages should have little difficulty with it. -(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and -even BASIC-PLUS.) -Expression syntax corresponds quite closely to C expression syntax. -Unlike most Unix utilities, -.I perl -does not arbitrarily limit the size of your data\*(--if you've got -the memory, -.I perl -can slurp in your whole file as a single string. -Recursion is of unlimited depth. -And the hash tables used by associative arrays grow as necessary to prevent -degraded performance. -.I Perl -uses sophisticated pattern matching techniques to scan large amounts of -data very quickly. -Although optimized for scanning text, -.I perl -can also deal with binary data, and can make dbm files look like associative -arrays (where dbm is available). -Setuid -.I perl -scripts are safer than C programs -through a dataflow tracing mechanism which prevents many stupid security holes. -If you have a problem that would ordinarily use \fIsed\fR -or \fIawk\fR or \fIsh\fR, but it -exceeds their capabilities or must run a little faster, -and you don't want to write the silly thing in C, then -.I perl -may be for you. -There are also translators to turn your -.I sed -and -.I awk -scripts into -.I perl -scripts. -OK, enough hype. -.PP -Upon startup, -.I perl -looks for your script in one of the following places: -.Ip 1. 4 2 -Specified line by line via -.B \-e -switches on the command line. -.Ip 2. 4 2 -Contained in the file specified by the first filename on the command line. -(Note that systems supporting the #! notation invoke interpreters this way.) -.Ip 3. 4 2 -Passed in implicitly via standard input. -This only works if there are no filename arguments\*(--to pass -arguments to a -.I stdin -script you must explicitly specify a \- for the script name. 
-.PP -After locating your script, -.I perl -compiles it to an internal form. -If the script is syntactically correct, it is executed. -.Sh "Options" -Note: on first reading this section may not make much sense to you. It's here -at the front for easy reference. -.PP -A single-character option may be combined with the following option, if any. -This is particularly useful when invoking a script using the #! construct which -only allows one argument. Example: -.nf - -.ne 2 - #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak - .\|.\|. - -.fi -Options include: -.TP 5 -.BI \-0 digits -specifies the record separator ($/) as an octal number. -If there are no digits, the null character is the separator. -Other switches may precede or follow the digits. -For example, if you have a version of -.I find -which can print filenames terminated by the null character, you can say this: -.nf - - find . \-name '*.bak' \-print0 | perl \-n0e unlink - -.fi -The special value 00 will cause Perl to slurp files in paragraph mode. -The value 0777 will cause Perl to slurp files whole since there is no -legal character with that value. -.TP 5 -.B \-a -turns on autosplit mode when used with a -.B \-n -or -.BR \-p . -An implicit split command to the @F array -is done as the first thing inside the implicit while loop produced by -the -.B \-n -or -.BR \-p . -.nf - - perl \-ane \'print pop(@F), "\en";\' - -is equivalent to - - while (<>) { - @F = split(\' \'); - print pop(@F), "\en"; - } - -.fi -.TP 5 -.B \-c -causes -.I perl -to check the syntax of the script and then exit without executing it. -.TP 5 -.BI \-d -runs the script under the perl debugger. -See the section on Debugging. -.TP 5 -.BI \-D number -sets debugging flags. -To watch how it executes your script, use -.BR \-D14 . -(This only works if debugging is compiled into your -.IR perl .) -Another nice value is \-D1024, which lists your compiled syntax tree. -And \-D512 displays compiled regular expressions. 
-.TP 5 -.BI \-e " commandline" -may be used to enter one line of script. -Multiple -.B \-e -commands may be given to build up a multi-line script. -If -.B \-e -is given, -.I perl -will not look for a script filename in the argument list. -.TP 5 -.BI \-i extension -specifies that files processed by the <> construct are to be edited -in-place. -It does this by renaming the input file, opening the output file by the -same name, and selecting that output file as the default for print statements. -The extension, if supplied, is added to the name of the -old file to make a backup copy. -If no extension is supplied, no backup is made. -Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using -the script: -.nf - -.ne 2 - #!/usr/bin/perl \-pi.bak - s/foo/bar/; - -which is equivalent to - -.ne 14 - #!/usr/bin/perl - while (<>) { - if ($ARGV ne $oldargv) { - rename($ARGV, $ARGV . \'.bak\'); - open(ARGVOUT, ">$ARGV"); - select(ARGVOUT); - $oldargv = $ARGV; - } - s/foo/bar/; - } - continue { - print; # this prints to original filename - } - select(STDOUT); - -.fi -except that the -.B \-i -form doesn't need to compare $ARGV to $oldargv to know when -the filename has changed. -It does, however, use ARGVOUT for the selected filehandle. -Note that -.I STDOUT -is restored as the default output filehandle after the loop. -.Sp -You can use eof to locate the end of each input file, in case you want -to append to each file, or reset line numbering (see example under eof). -.TP 5 -.BI \-I directory -may be used in conjunction with -.B \-P -to tell the C preprocessor where to look for include files. -By default /usr/include and /usr/lib/perl are searched. -.TP 5 -.B \-n -causes -.I perl -to assume the following loop around your script, which makes it iterate -over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR: -.nf - -.ne 3 - while (<>) { - .\|.\|. # your script goes here - } - -.fi -Note that the lines are not printed by default. 
-See -.B \-p -to have lines printed. -Here is an efficient way to delete all files older than a week: -.nf - - find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\' - -.fi -This is faster than using the \-exec switch of find because you don't have to -start a process on every filename found. -.TP 5 -.B \-p -causes -.I perl -to assume the following loop around your script, which makes it iterate -over filename arguments somewhat like \fIsed\fR: -.nf - -.ne 5 - while (<>) { - .\|.\|. # your script goes here - } continue { - print; - } - -.fi -Note that the lines are printed automatically. -To suppress printing use the -.B \-n -switch. -A -.B \-p -overrides a -.B \-n -switch. -.TP 5 -.B \-P -causes your script to be run through the C preprocessor before -compilation by -.IR perl . -(Since both comments and cpp directives begin with the # character, -you should avoid starting comments with any words recognized -by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".) -.TP 5 -.B \-s -enables some rudimentary switch parsing for switches on the command line -after the script name but before any filename arguments (or before a \-\|\-). -Any switch found there is removed from @ARGV and sets the corresponding variable in the -.I perl -script. -The following script prints \*(L"true\*(R" if and only if the script is -invoked with a \-xyz switch. -.nf - -.ne 2 - #!/usr/bin/perl \-s - if ($xyz) { print "true\en"; } - -.fi -.TP 5 -.B \-S -makes -.I perl -use the PATH environment variable to search for the script -(unless the name of the script starts with a slash). -Typically this is used to emulate #! startup on machines that don't -support #!, in the following manner: -.nf - - #!/usr/bin/perl - eval "exec /usr/bin/perl \-S $0 $*" - if $running_under_some_shell; - -.fi -The system ignores the first line and feeds the script to /bin/sh, -which proceeds to try to execute the -.I perl -script as a shell script. 
-The shell executes the second line as a normal shell command, and thus -starts up the -.I perl -interpreter. -On some systems $0 doesn't always contain the full pathname, -so the -.B \-S -tells -.I perl -to search for the script if necessary. -After -.I perl -locates the script, it parses the lines and ignores them because -the variable $running_under_some_shell is never true. -A better construct than $* would be ${1+"$@"}, which handles embedded spaces -and such in the filenames, but doesn't work if the script is being interpreted -by csh. -In order to start up sh rather than csh, some systems may have to replace the -#! line with a line containing just -a colon, which will be politely ignored by perl. -Other systems can't control that, and need a totally devious construct that -will work under any of csh, sh or perl, such as the following: -.nf - -.ne 3 - eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}' - & eval 'exec /usr/bin/perl -S $0 $argv:q' - if 0; - -.fi -.TP 5 -.B \-u -causes -.I perl -to dump core after compiling your script. -You can then take this core dump and turn it into an executable file -by using the undump program (not supplied). -This speeds startup at the expense of some disk space (which you can -minimize by stripping the executable). -(Still, a "hello world" executable comes out to about 200K on my machine.) -If you are going to run your executable as a set-id program then you -should probably compile it using taintperl rather than normal perl. -If you want to execute a portion of your script before dumping, use the -dump operator instead. -Note: availability of undump is platform specific and may not be available -for a specific port of perl. -.TP 5 -.B \-U -allows -.I perl -to do unsafe operations. -Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while -running as superuser. -.TP 5 -.B \-v -prints the version and patchlevel of your -.I perl -executable. 
-.TP 5 -.B \-w -prints warnings about identifiers that are mentioned only once, and scalar -variables that are used before being set. -Also warns about redefined subroutines, and references to undefined -filehandles or filehandles opened readonly that you are attempting to -write on. -Also warns you if you use == on values that don't look like numbers, and if -your subroutines recurse more than 100 deep. -.TP 5 -.BI \-x directory -tells -.I perl -that the script is embedded in a message. -Leading garbage will be discarded until the first line that starts -with #! and contains the string "perl". -Any meaningful switches on that line will be applied (but only one -group of switches, as with normal #! processing). -If a directory name is specified, Perl will switch to that directory -before running the script. -The -.B \-x -switch only controls the the disposal of leading garbage. -The script must be terminated with __END__ if there is trailing garbage -to be ignored (the script can process any or all of the trailing garbage -via the DATA filehandle if desired). -.Sh "Data Types and Objects" -.PP -.I Perl -has three data types: scalars, arrays of scalars, and -associative arrays of scalars. -Normal arrays are indexed by number, and associative arrays by string. -.PP -The interpretation of operations and values in perl sometimes -depends on the requirements -of the context around the operation or value. -There are three major contexts: string, numeric and array. -Certain operations return array values -in contexts wanting an array, and scalar values otherwise. -(If this is true of an operation it will be mentioned in the documentation -for that operation.) -Operations which return scalars don't care whether the context is looking -for a string or a number, but -scalar variables and values are interpreted as strings or numbers -as appropriate to the context. -A scalar is interpreted as TRUE in the boolean sense if it is not the null -string or 0. 
-Booleans returned by operators are 1 for true and 0 or \'\' (the null -string) for false. -.PP -There are actually two varieties of null string: defined and undefined. -Undefined null strings are returned when there is no real value for something, -such as when there was an error, or at end of file, or when you refer -to an uninitialized variable or element of an array. -An undefined null string may become defined the first time you access it, but -prior to that you can use the defined() operator to determine whether the -value is defined or not. -.PP -References to scalar variables always begin with \*(L'$\*(R', even when referring -to a scalar that is part of an array. -Thus: -.nf - -.ne 3 - $days \h'|2i'# a simple scalar variable - $days[28] \h'|2i'# 29th element of array @days - $days{\'Feb\'}\h'|2i'# one value from an associative array - $#days \h'|2i'# last index of array @days - -but entire arrays or array slices are denoted by \*(L'@\*(R': - - @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n]) - @days[3,4,5]\h'|2i'# same as @days[3.\|.5] - @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'}) - -and entire associative arrays are denoted by \*(L'%\*(R': - - %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.) -.fi -.PP -Any of these eight constructs may serve as an lvalue, -that is, may be assigned to. -(It also turns out that an assignment is itself an lvalue in -certain contexts\*(--see examples under s, tr and chop.) -Assignment to a scalar evaluates the righthand side in a scalar context, -while assignment to an array or array slice evaluates the righthand side -in an array context. -.PP -You may find the length of array @days by evaluating -\*(L"$#days\*(R", as in -.IR csh . -(Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.) -Assigning to $#days changes the length of the array. -Shortening an array by this method does not actually destroy any values. 
-Lengthening an array that was previously shortened recovers the values that -were in those elements. -You can also gain some measure of efficiency by preextending an array that -is going to get big. -(You can also extend an array by assigning to an element that is off the -end of the array. -This differs from assigning to $#whatever in that intervening values -are set to null rather than recovered.) -You can truncate an array down to nothing by assigning the null list () to -it. -The following are exactly equivalent -.nf - - @whatever = (); - $#whatever = $[ \- 1; - -.fi -.PP -If you evaluate an array in a scalar context, it returns the length of -the array. -The following is always true: -.nf - - @whatever == $#whatever \- $[ + 1; - -.fi -.PP -Multi-dimensional arrays are not directly supported, but see the discussion -of the $; variable later for a means of emulating multiple subscripts with -an associative array. -You could also write a subroutine to turn multiple subscripts into a single -subscript. -.PP -Every data type has its own namespace. -You can, without fear of conflict, use the same name for a scalar variable, -an array, an associative array, a filehandle, a subroutine name, and/or -a label. -Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R', -or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved -with respect to variable names. -(They ARE reserved with respect to labels and filehandles, however, which -don't have an initial special character. -Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\'). -Using uppercase filehandles also improves readability and protects you -from conflict with future reserved words.) -Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all -different names. -Names which start with a letter may also contain digits and underscores. -Names which do not start with a letter are limited to one character, -e.g. \*(L"$%\*(R" or \*(L"$$\*(R". 
-(Most of the one character names have a predefined significance to -.IR perl . -More later.) -.PP -Numeric literals are specified in any of the usual floating point or -integer formats: -.nf - -.ne 5 - 12345 - 12345.67 - .23E-10 - 0xffff # hex - 0377 # octal - -.fi -String literals are delimited by either single or double quotes. -They work much like shell quotes: -double-quoted string literals are subject to backslash and variable -substitution; single-quoted strings are not (except for \e\' and \e\e). -The usual backslash rules apply for making characters such as newline, tab, etc. -You can also embed newlines directly in your strings, i.e. they can end on -a different line than they begin. -This is nice, but if you forget your trailing quote, the error will not be -reported until -.I perl -finds another line containing the quote character, which -may be much further on in the script. -Variable substitution inside strings is limited to scalar variables, normal -array values, and array slices. -(In other words, identifiers beginning with $ or @, followed by an optional -bracketed expression as a subscript.) -The following code segment prints out \*(L"The price is $100.\*(R" -.nf - -.ne 2 - $Price = \'$100\';\h'|3.5i'# not interpreted - print "The price is $Price.\e\|n";\h'|3.5i'# interpreted - -.fi -Note that you can put curly brackets around the identifier to delimit it -from following alphanumerics. -Also note that a single quoted string must be separated from a preceding -word by a space, since single quote is a valid character in an identifier -(see Packages). -.PP -Two special literals are __LINE__ and __FILE__, which represent the current -line number and filename at that point in your program. -They may only be used as separate tokens; they will not be interpolated -into strings. -In addition, the token __END__ may be used to indicate the logical end of the -script before the actual end of file. 
-Any following text is ignored (but may be read via the DATA filehandle). -The two control characters ^D and ^Z are synonyms for __END__. -.PP -A word that doesn't have any other interpretation in the grammar will be -treated as if it had single quotes around it. -For this purpose, a word consists only of alphanumeric characters and underline, -and must start with an alphabetic character. -As with filehandles and labels, a bare word that consists entirely of -lowercase letters risks conflict with future reserved words, and if you -use the -.B \-w -switch, Perl will warn you about any such words. -.PP -Array values are interpolated into double-quoted strings by joining all the -elements of the array with the delimiter specified in the $" variable, -space by default. -(Since in versions of perl prior to 3.0 the @ character was not a metacharacter -in double-quoted strings, the interpolation of @array, $array[EXPR], -@array[LIST], $array{EXPR}, or @array{LIST} only happens if array is -referenced elsewhere in the program or is predefined.) -The following are equivalent: -.nf - -.ne 4 - $temp = join($",@ARGV); - system "echo $temp"; - - system "echo @ARGV"; - -.fi -Within search patterns (which also undergo double-quotish substitution) -there is a bad ambiguity: Is /$foo[bar]/ to be -interpreted as /${foo}[bar]/ (where [bar] is a character class for the -regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to -array @foo)? -If @foo doesn't otherwise exist, then it's obviously a character class. -If @foo exists, perl takes a good guess about [bar], and is almost always right. -If it does guess wrong, or if you're just plain paranoid, -you can force the correct interpretation with curly brackets as above. -.PP -A line-oriented form of quoting is based on the shell here-is syntax. -Following a << you specify a string to terminate the quoted material, and all lines -following the current line down to the terminating string are the value -of the item. 
-The terminating string may be either an identifier (a word), or some -quoted text. -If quoted, the type of quotes you use determines the treatment of the text, -just as in regular quoting. -An unquoted identifier works like double quotes. -There must be no space between the << and the identifier. -(If you put a space it will be treated as a null identifier, which is -valid, and matches the first blank line\*(--see Merry Christmas example below.) -The terminating string must appear by itself (unquoted and with no surrounding -whitespace) on the terminating line. -.nf - - print <<EOF; # same as above -The price is $Price. -EOF - - print <<"EOF"; # same as above -The price is $Price. -EOF - - print << x 10; # null identifier is delimiter -Merry Christmas! - - print <<`EOC`; # execute commands -echo hi there -echo lo there -EOC - - print <<foo, <<bar; # you can stack them -I said foo. -foo -I said bar. -bar - -.fi -Array literals are denoted by separating individual values by commas, and -enclosing the list in parentheses: -.nf - - (LIST) - -.fi -In a context not requiring an array value, the value of the array literal -is the value of the final element, as in the C comma operator. -For example, -.nf - -.ne 4 - @foo = (\'cc\', \'\-E\', $bar); - -assigns the entire array value to array foo, but - - $foo = (\'cc\', \'\-E\', $bar); - -.fi -assigns the value of variable bar to variable foo. -Note that the value of an actual array in a scalar context is the length -of the array; the following assigns to $foo the value 3: -.nf - -.ne 2 - @foo = (\'cc\', \'\-E\', $bar); - $foo = @foo; # $foo gets 3 - -.fi -You may have an optional comma before the closing parenthesis of an -array literal, so that you can say: -.nf - - @foo = ( - 1, - 2, - 3, - ); - -.fi -When a LIST is evaluated, each element of the list is evaluated in -an array context, and the resulting array value is interpolated into LIST -just as if each individual element were a member of LIST. 
Thus arrays -lose their identity in a LIST\*(--the list - - (@foo,@bar,&SomeSub) - -contains all the elements of @foo followed by all the elements of @bar, -followed by all the elements returned by the subroutine named SomeSub. -.PP -A list value may also be subscripted like a normal array. -Examples: -.nf - - $time = (stat($file))[8]; # stat returns array value - $digit = ('a','b','c','d','e','f')[$digit-10]; - return (pop(@foo),pop(@foo))[0]; - -.fi -.PP -Array lists may be assigned to if and only if each element of the list -is an lvalue: -.nf - - ($a, $b, $c) = (1, 2, 3); - - ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00); - -The final element may be an array or an associative array: - - ($a, $b, @rest) = split; - local($a, $b, %rest) = @_; - -.fi -You can actually put an array anywhere in the list, but the first array -in the list will soak up all the values, and anything after it will get -a null value. -This may be useful in a local(). -.PP -An associative array literal contains pairs of values to be interpreted -as a key and a value: -.nf - -.ne 2 - # same as map assignment above - %map = ('red',0x00f,'blue',0x0f0,'green',0xf00); - -.fi -Array assignment in a scalar context returns the number of elements -produced by the expression on the right side of the assignment: -.nf - - $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 - -.fi -.PP -There are several other pseudo-literals that you should know about. -If a string is enclosed by backticks (grave accents), it first undergoes -variable substitution just like a double quoted string. -It is then interpreted as a command, and the output of that command -is the value of the pseudo-literal, like in a shell. -In a scalar context, a single string consisting of all the output is -returned. -In an array context, an array of values is returned, one for each line -of output. -(You can set $/ to use a different line terminator.) 
-The command is executed each time the pseudo-literal is evaluated. -The status value of the command is returned in $? (see Predefined Names -for the interpretation of $?). -Unlike in \f2csh\f1, no translation is done on the return -data\*(--newlines remain newlines. -Unlike in any of the shells, single quotes do not hide variable names -in the command from interpretation. -To pass a $ through to the shell you need to hide it with a backslash. -.PP -Evaluating a filehandle in angle brackets yields the next line -from that file (newline included, so it's never false until EOF, at -which time an undefined value is returned). -Ordinarily you must assign that value to a variable, -but there is one situation where an automatic assignment happens. -If (and only if) the input symbol is the only thing inside the conditional of a -.I while -loop, the value is -automatically assigned to the variable \*(L"$_\*(R". -(This may seem like an odd thing to you, but you'll use the construct -in almost every -.I perl -script you write.) -Anyway, the following lines are equivalent to each other: -.nf - -.ne 5 - while ($_ = <STDIN>) { print; } - while (<STDIN>) { print; } - for (\|;\|<STDIN>;\|) { print; } - print while $_ = <STDIN>; - print while <STDIN>; - -.fi -The filehandles -.IR STDIN , -.I STDOUT -and -.I STDERR -are predefined. -(The filehandles -.IR stdin , -.I stdout -and -.I stderr -will also work except in packages, where they would be interpreted as -local identifiers rather than global.) -Additional filehandles may be created with the -.I open -function. -.PP -If a <FILEHANDLE> is used in a context that is looking for an array, an array -consisting of all the input lines is returned, one line per array element. -It's easy to make a LARGE data space this way, so use with care. -.PP -The null filehandle <> is special and can be used to emulate the behavior of -\fIsed\fR and \fIawk\fR. 
-Input from <> comes either from standard input, or from each file listed on -the command line. -Here's how it works: the first time <> is evaluated, the ARGV array is checked, -and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard -input. -The ARGV array is then processed as a list of filenames. -The loop -.nf - -.ne 3 - while (<>) { - .\|.\|. # code for each line - } - -.ne 10 -is equivalent to - - unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[; - while ($ARGV = shift) { - open(ARGV, $ARGV); - while (<ARGV>) { - .\|.\|. # code for each line - } - } - -.fi -except that it isn't as cumbersome to say. -It really does shift array ARGV and put the current filename into -variable ARGV. -It also uses filehandle ARGV internally. -You can modify @ARGV before the first <> as long as you leave the first -filename at the beginning of the array. -Line numbers ($.) continue as if the input was one big happy file. -(But see example under eof for how to reset line numbers on each file.) -.PP -.ne 5 -If you want to set @ARGV to your own list of files, go right ahead. -If you want to pass switches into your script, you can -put a loop on the front like this: -.nf - -.ne 10 - while ($_ = $ARGV[0], /\|^\-/\|) { - shift; - last if /\|^\-\|\-$\|/\|; - /\|^\-D\|(.*\|)/ \|&& \|($debug = $1); - /\|^\-v\|/ \|&& \|$verbose++; - .\|.\|. # other switches - } - while (<>) { - .\|.\|. # code for each line - } - -.fi -The <> symbol will return FALSE only once. -If you call it again after this it will assume you are processing another -@ARGV list, and if you haven't set @ARGV, will input from -.IR STDIN . -.PP -If the string inside the angle brackets is a reference to a scalar variable -(e.g. <$foo>), -then that variable contains the name of the filehandle to input from. 
-.PP -If the string inside angle brackets is not a filehandle, it is interpreted -as a filename pattern to be globbed, and either an array of filenames or the -next filename in the list is returned, depending on context. -One level of $ interpretation is done first, but you can't say <$foo> -because that's an indirect filehandle as explained in the previous -paragraph. -You could insert curly brackets to force interpretation as a -filename glob: <${foo}>. -Example: -.nf - -.ne 3 - while (<*.c>) { - chmod 0644, $_; - } - -is equivalent to - -.ne 5 - open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|"); - while (<foo>) { - chop; - chmod 0644, $_; - } - -.fi -In fact, it's currently implemented that way. -(Which means it will not work on filenames with spaces in them unless -you have /bin/csh on your machine.) -Of course, the shortest way to do the above is: -.nf - - chmod 0644, <*.c>; - -.fi -.Sh "Syntax" -.PP -A -.I perl -script consists of a sequence of declarations and commands. -The only things that need to be declared in -.I perl -are report formats and subroutines. -See the sections below for more information on those declarations. -All uninitialized user-created objects are assumed to -start with a null or 0 value until they -are defined by some explicit operation such as assignment. -The sequence of commands is executed just once, unlike in -.I sed -and -.I awk -scripts, where the sequence of commands is executed for each input line. -While this means that you must explicitly loop over the lines of your input file -(or files), it also means you have much more control over which files and which -lines you look at. -(Actually, I'm lying\*(--it is possible to do an implicit loop with either the -.B \-n -or -.B \-p -switch.) -.PP -A declaration can be put anywhere a command can, but has no effect on the -execution of the primary sequence of commands\(*--declarations all take effect -at compile time. 
-Typically all the declarations are put at the beginning or the end of the script. -.PP -.I Perl -is, for the most part, a free-form language. -(The only exception to this is format declarations, for fairly obvious reasons.) -Comments are indicated by the # character, and extend to the end of the line. -If you attempt to use /* */ C comments, it will be interpreted either as -division or pattern matching, depending on the context. -So don't do that. -.Sh "Compound statements" -In -.IR perl , -a sequence of commands may be treated as one command by enclosing it -in curly brackets. -We will call this a BLOCK. -.PP -The following compound commands may be used to control flow: -.nf - -.ne 4 - if (EXPR) BLOCK - if (EXPR) BLOCK else BLOCK - if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK - LABEL while (EXPR) BLOCK - LABEL while (EXPR) BLOCK continue BLOCK - LABEL for (EXPR; EXPR; EXPR) BLOCK - LABEL foreach VAR (ARRAY) BLOCK - LABEL BLOCK continue BLOCK - -.fi -Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not -statements. -This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed. -If you want to write conditionals without curly brackets there are several -other ways to do it. -The following all do the same thing: -.nf - -.ne 5 - if (!open(foo)) { die "Can't open $foo: $!"; } - die "Can't open $foo: $!" unless open(foo); - open(foo) || die "Can't open $foo: $!"; # foo or bust! - open(foo) ? \'hi mom\' : die "Can't open $foo: $!"; - # a bit exotic, that last one - -.fi -.PP -The -.I if -statement is straightforward. -Since BLOCKs are always bounded by curly brackets, there is never any -ambiguity about which -.I if -an -.I else -goes with. -If you use -.I unless -in place of -.IR if , -the sense of the test is reversed. -.PP -The -.I while -statement executes the block as long as the expression is true -(does not evaluate to the null string or 0). 
-The LABEL is optional, and if present, consists of an identifier followed by -a colon. -The LABEL identifies the loop for the loop control statements -.IR next , -.IR last , -and -.I redo -(see below). -If there is a -.I continue -BLOCK, it is always executed just before -the conditional is about to be evaluated again, similarly to the third part -of a -.I for -loop in C. -Thus it can be used to increment a loop variable, even when the loop has -been continued via the -.I next -statement (similar to the C \*(L"continue\*(R" statement). -.PP -If the word -.I while -is replaced by the word -.IR until , -the sense of the test is reversed, but the conditional is still tested before -the first iteration. -.PP -In either the -.I if -or the -.I while -statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional -is true if the value of the last command in that block is true. -.PP -The -.I for -loop works exactly like the corresponding -.I while -loop: -.nf - -.ne 12 - for ($i = 1; $i < 10; $i++) { - .\|.\|. - } - -is the same as - - $i = 1; - while ($i < 10) { - .\|.\|. - } continue { - $i++; - } -.fi -.PP -The foreach loop iterates over a normal array value and sets the variable -VAR to be each element of the array in turn. -The variable is implicitly local to the loop, and regains its former value -upon exiting the loop. -The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword, -so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity. -If VAR is omitted, $_ is set to each value. -If ARRAY is an actual array (as opposed to an expression returning an array -value), you can modify each element of the array -by modifying VAR inside the loop. 
-Examples: -.nf - -.ne 5 - for (@ary) { s/foo/bar/; } - - foreach $elem (@elements) { - $elem *= 2; - } - -.ne 3 - for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) { - print $_, "\en"; sleep(1); - } - - for (1..15) { print "Merry Christmas\en"; } - -.ne 3 - foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) { - print "Item: $item\en"; - } - -.fi -.PP -The BLOCK by itself (labeled or not) is equivalent to a loop that executes -once. -Thus you can use any of the loop control statements in it to leave or -restart the block. -The -.I continue -block is optional. -This construct is particularly nice for doing case structures. -.nf - -.ne 6 - foo: { - if (/^abc/) { $abc = 1; last foo; } - if (/^def/) { $def = 1; last foo; } - if (/^xyz/) { $xyz = 1; last foo; } - $nothing = 1; - } - -.fi -There is no official switch statement in perl, because there -are already several ways to write the equivalent. -In addition to the above, you could write -.nf - -.ne 6 - foo: { - $abc = 1, last foo if /^abc/; - $def = 1, last foo if /^def/; - $xyz = 1, last foo if /^xyz/; - $nothing = 1; - } - -or - -.ne 6 - foo: { - /^abc/ && do { $abc = 1; last foo; }; - /^def/ && do { $def = 1; last foo; }; - /^xyz/ && do { $xyz = 1; last foo; }; - $nothing = 1; - } - -or - -.ne 6 - foo: { - /^abc/ && ($abc = 1, last foo); - /^def/ && ($def = 1, last foo); - /^xyz/ && ($xyz = 1, last foo); - $nothing = 1; - } - -or even - -.ne 8 - if (/^abc/) - { $abc = 1; } - elsif (/^def/) - { $def = 1; } - elsif (/^xyz/) - { $xyz = 1; } - else - {$nothing = 1;} - -.fi -As it happens, these are all optimized internally to a switch structure, -so perl jumps directly to the desired statement, and you needn't worry -about perl executing a lot of unnecessary statements when you have a string -of 50 elsifs, as long as you are testing the same simple scalar variable -using ==, eq, or pattern matching as above. 
-(If you're curious as to whether the optimizer has done this for a particular -case statement, you can use the \-D1024 switch to list the syntax tree -before execution.) -.Sh "Simple statements" -The only kind of simple statement is an expression evaluated for its side -effects. -Every expression (simple statement) must be terminated with a semicolon. -Note that this is like C, but unlike Pascal (and -.IR awk ). -.PP -Any simple statement may optionally be followed by a -single modifier, just before the terminating semicolon. -The possible modifiers are: -.nf - -.ne 4 - if EXPR - unless EXPR - while EXPR - until EXPR - -.fi -The -.I if -and -.I unless -modifiers have the expected semantics. -The -.I while -and -.I until -modifiers also have the expected semantics (conditional evaluated first), -except when applied to a do-BLOCK command, -in which case the block executes once before the conditional is evaluated. -This is so that you can write loops like: -.nf - -.ne 4 - do { - $_ = <STDIN>; - .\|.\|. - } until $_ \|eq \|".\|\e\|n"; - -.fi -(See the -.I do -operator below. Note also that the loop control commands described later will -NOT work in this construct, since modifiers don't take loop labels. -Sorry.) -.Sh "Expressions" -Since -.I perl -expressions work almost exactly like C expressions, only the differences -will be mentioned here. -.PP -Here's what -.I perl -has that C doesn't: -.Ip ** 8 2 -The exponentiation operator. -.Ip **= 8 -The exponentiation assignment operator. -.Ip (\|) 8 3 -The null list, used to initialize an array to null. -.Ip . 8 -Concatenation of two strings. -.Ip .= 8 -The concatenation assignment operator. -.Ip eq 8 -String equality (== is numeric equality). -For a mnemonic just think of \*(L"eq\*(R" as a string. -(If you are used to the -.I awk -behavior of using == for either string or numeric equality -based on the current form of the comparands, beware! -You must be explicit here.) 
-.Ip ne 8 -String inequality (!= is numeric inequality). -.Ip lt 8 -String less than. -.Ip gt 8 -String greater than. -.Ip le 8 -String less than or equal. -.Ip ge 8 -String greater than or equal. -.Ip cmp 8 -String comparison, returning -1, 0, or 1. -.Ip <=> 8 -Numeric comparison, returning -1, 0, or 1. -.Ip =~ 8 2 -Certain operations search or modify the string \*(L"$_\*(R" by default. -This operator makes that kind of operation work on some other string. -The right argument is a search pattern, substitution, or translation. -The left argument is what is supposed to be searched, substituted, or -translated instead of the default \*(L"$_\*(R". -The return value indicates the success of the operation. -(If the right argument is an expression other than a search pattern, -substitution, or translation, it is interpreted as a search pattern -at run time. -This is less efficient than an explicit search, since the pattern must -be compiled every time the expression is evaluated.) -The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else. -.Ip !~ 8 -Just like =~ except the return value is negated. -.Ip x 8 -The repetition operator. -Returns a string consisting of the left operand repeated the -number of times specified by the right operand. -.nf - - print \'\-\' x 80; # print row of dashes - print \'\-\' x80; # illegal, x80 is identifier - - print "\et" x ($tab/8), \' \' x ($tab%8); # tab over - -.fi -.Ip x= 8 -The repetition assignment operator. -.Ip .\|. 8 -The range operator, which is really two different operators depending -on the context. -In an array context, returns an array of values counting (by ones) -from the left value to the right value. -This is useful for writing \*(L"for (1..10)\*(R" loops and for doing -slice operations on arrays. -.Sp -In a scalar context, .\|. returns a boolean value. -The operator is bistable, like a flip-flop. -Each .\|. operator maintains its own boolean state.
-It is false as long as its left operand is false. -Once the left operand is true, the range operator stays true -until the right operand is true, -AFTER which the range operator becomes false again. -(It doesn't become false till the next time the range operator is evaluated. -It can become false on the same evaluation it became true, but it still returns -true once.) -The right operand is not evaluated while the operator is in the \*(L"false\*(R" state, -and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state. -The scalar .\|. operator is primarily intended for doing line number ranges -after -the fashion of \fIsed\fR or \fIawk\fR. -The precedence is a little lower than || and &&. -The value returned is either the null string for false, or a sequence number -(beginning with 1) for true. -The sequence number is reset for each range encountered. -The final sequence number in a range has the string \'E0\' appended to it, which -doesn't affect its numeric value, but gives you something to search for if you -want to exclude the endpoint. -You can exclude the beginning point by waiting for the sequence number to be -greater than 1. -If either operand of scalar .\|. is static, that operand is implicitly compared -to the $. variable, the current line number. -Examples: -.nf - -.ne 6 -As a scalar operator: - if (101 .\|. 200) { print; } # print 2nd hundred lines - - next line if (1 .\|. /^$/); # skip header lines - - s/^/> / if (/^$/ .\|. eof()); # quote body - -.ne 4 -As an array operator: - for (101 .\|. 200) { print; } # print $_ 100 times - - @foo = @foo[$[ .\|. $#foo]; # an expensive no-op - @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items - -.fi -.Ip \-x 8 -A file test. -This unary operator takes one argument, either a filename or a filehandle, -and tests the associated file to see if something is true about it. -If the argument is omitted, tests $_, except for \-t, which tests -.IR STDIN . 
-It returns 1 for true and \'\' for false, or the undefined value if the -file doesn't exist. -Precedence is higher than logical and relational operators, but lower than -arithmetic operators. -The operator may be any of: -.nf - \-r File is readable by effective uid. - \-w File is writable by effective uid. - \-x File is executable by effective uid. - \-o File is owned by effective uid. - \-R File is readable by real uid. - \-W File is writable by real uid. - \-X File is executable by real uid. - \-O File is owned by real uid. - \-e File exists. - \-z File has zero size. - \-s File has non-zero size (returns size). - \-f File is a plain file. - \-d File is a directory. - \-l File is a symbolic link. - \-p File is a named pipe (FIFO). - \-S File is a socket. - \-b File is a block special file. - \-c File is a character special file. - \-u File has setuid bit set. - \-g File has setgid bit set. - \-k File has sticky bit set. - \-t Filehandle is opened to a tty. - \-T File is a text file. - \-B File is a binary file (opposite of \-T). - \-M Age of file in days when script started. - \-A Same for access time. - \-C Same for inode change time. - -.fi -The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X -is based solely on the mode of the file and the uids and gids of the user. -There may be other reasons you can't actually read, write or execute the file. -Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and -\-x and \-X return 1 if any execute bit is set in the mode. -Scripts run by the superuser may thus need to do a stat() in order to determine -the actual mode of the file, or temporarily set the uid to something else. -.Sp -Example: -.nf -.ne 7 - - while (<>) { - chop; - next unless \-f $_; # ignore specials - .\|.\|. - } - -.fi -Note that \-s/a/b/ does not do a negated substitution. -Saying \-exp($foo) still works as expected, however\*(--only single letters -following a minus are interpreted as file tests. 
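A few of the file tests listed above can be exercised with a throwaway file. This is a minimal sketch assuming Perl 5 and a writable current directory; the temporary file name is invented for the example.

```perl
# Sketch only: the file name below is hypothetical and removed at the end.
$file = "filetest_demo_$$.tmp";
open(OUT, '>', $file) || die "Can't create $file: $!";
print OUT "hello\n";
close(OUT);

$exists = -e $file;     # true: the file exists
$plain  = -f $file;     # true: it is a plain file
$is_dir = -d $file;     # false: it is not a directory
$size   = -s $file;     # true, and also the size in bytes

unlink($file);
$gone   = -e $file;     # false once the file has been removed

print "plain file of $size bytes\n" if $exists && $plain && !$is_dir;
```

The permission tests (\-r, \-w, \-x and friends) are left out on purpose, since their results depend on who runs the script.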
-.Sp -The \-T and \-B switches work as follows. -The first block or so of the file is examined for odd characters such as -strange control codes or metacharacters. -If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file. -Also, any file containing null in the first block is considered a binary file. -If \-T or \-B is used on a filehandle, the current stdio buffer is examined -rather than the first block. -Both \-T and \-B return TRUE on a null file, or a file at EOF when testing -a filehandle. -.PP -If any of the file tests (or either stat operator) are given the special -filehandle consisting of a solitary underline, then the stat structure -of the previous file test (or stat operator) is used, saving a system -call. -(This doesn't work with \-t, and you need to remember that lstat and -l -will leave values in the stat structure for the symbolic link, not the -real file.) -Example: -.nf - - print "Can do.\en" if -r $a || -w _ || -x _; - -.ne 9 - stat($filename); - print "Readable\en" if -r _; - print "Writable\en" if -w _; - print "Executable\en" if -x _; - print "Setuid\en" if -u _; - print "Setgid\en" if -g _; - print "Sticky\en" if -k _; - print "Text\en" if -T _; - print "Binary\en" if -B _; - -.fi -.PP -Here is what C has that -.I perl -doesn't: -.Ip "unary &" 12 -Address-of operator. -.Ip "unary *" 12 -Dereference-address operator. -.Ip "(TYPE)" 12 -Type casting operator. -.PP -Like C, -.I perl -does a certain amount of expression evaluation at compile time, whenever -it determines that all of the arguments to an operator are static and have -no side effects. -In particular, string concatenation happens at compile time between literals that don't do variable substitution. -Backslash interpretation also happens at compile time. -You can say -.nf - -.ne 2 - \'Now is the time for all\' . "\|\e\|n" . - \'good men to come to.\' - -.fi -and this all reduces to one string internally. 
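Of the perl-only operators covered earlier in this section, the scalar form of the range operator is the hardest to picture, so a short sketch may help. It assumes Perl 5.8 or later (in-memory files and chomp); the handle name IN and the BEGIN/END marker lines are made up for the example.

```perl
# Sketch only: IN, $text and the marker lines are hypothetical.
$text = "a\nBEGIN\nb\nc\nEND\nd\n";
open(IN, '<', \$text) || die "Can't open in-memory file: $!";

@seen = ();
while (<IN>) {
    chomp;
    # Scalar .. is false until the left pattern matches, then stays true
    # through the line on which the right pattern matches.
    if ($seq = (/^BEGIN$/ .. /^END$/)) {
        push(@seen, "$seq:$_");     # $seq is the sequence number
    }
}
close(IN);

print join(' ', @seen), "\n";       # prints 1:BEGIN 2:b 3:c 4E0:END
```

The last sequence number carries the 'E0' suffix, so a test such as $seq !~ /E0$/ would exclude the endpoint.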
-.PP -The autoincrement operator has a little extra built-in magic to it. -If you increment a variable that is numeric, or that has ever been used in -a numeric context, you get a normal increment. -If, however, the variable has only been used in string contexts since it -was set, and has a value that is not null and matches the -pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done -as a string, preserving each character within its range, with carry: -.nf - - print ++($foo = \'99\'); # prints \*(L'100\*(R' - print ++($foo = \'a0\'); # prints \*(L'a1\*(R' - print ++($foo = \'Az\'); # prints \*(L'Ba\*(R' - print ++($foo = \'zz\'); # prints \*(L'aaa\*(R' - -.fi -The autodecrement is not magical. -.PP -The range operator (in an array context) makes use of the magical -autoincrement algorithm if the minimum and maximum are strings. -You can say - - @alphabet = (\'A\' .. \'Z\'); - -to get all the letters of the alphabet, or - - $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15]; - -to get a hexadecimal digit, or - - @z2 = (\'01\' .. \'31\'); print @z2[$mday]; - -to get dates with leading zeros. -(If the final value specified is not in the sequence that the magical increment -would produce, the sequence goes until the next value would be longer than -the final value specified.) -.PP -The || and && operators differ from C's in that, rather than returning 0 or 1, -they return the last value evaluated. -Thus, a portable way to find out the home directory might be: -.nf - - $home = $ENV{'HOME'} || $ENV{'LOGDIR'} || - (getpwuid($<))[7] || die "You're homeless!\en"; - -.fi
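The magical increment, the string range, and the value returned by || can all be checked in a few lines. This is a sketch assuming Perl 5; the variable names and the %config table are invented for the example.

```perl
# Sketch only: all names here are hypothetical.
$foo = 'Az'; $foo++;                    # string increment with carry: 'Ba'
$bar = 'zz'; $bar++;                    # carries into a new position: 'aaa'

@hex   = (0 .. 9, 'a' .. 'f');          # numeric range, then string range
$digit = $hex[255 & 15];                # low four bits of 255 select 'f'

@days  = ('01' .. '31');                # leading zeros are preserved

%config = ('editor', '');               # an empty (false) setting
$editor = $config{'editor'} || 'vi';    # || yields the last value evaluated

print "$foo $bar $digit $days[0] $editor\n";   # prints Ba aaa f 01 vi
```

Because || hands back the operand itself rather than 1, the fallback chain in the home directory example above works the same way.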