From a0d0e21ea6ea90a22318550944fe6cb09ae10cda Mon Sep 17 00:00:00 2001 From: Larry Wall Date: Mon, 17 Oct 1994 23:00:00 +0000 Subject: perl 5.000 [editor's note: this commit combines approximate 4 months of furious releases of Andy Dougherty and Larry Wall - see pod/perlhist.pod for details. Andy notes that; Alas neither my "Irwin AccuTrack" nor my DC 600A quarter-inch cartridge backup tapes from that era seem to be readable anymore. I guess 13 years exceeds the shelf life for that backup technology :-(. ] --- pod/perlform.pod | 314 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 314 insertions(+) create mode 100644 pod/perlform.pod (limited to 'pod/perlform.pod') diff --git a/pod/perlform.pod b/pod/perlform.pod new file mode 100644 index 0000000000..38d7153e8b --- /dev/null +++ b/pod/perlform.pod @@ -0,0 +1,314 @@ +=head1 NAME + +perlform - Perl formats + +=head1 DESCRIPTION + +Perl has a mechanism to help you generate simple reports and charts. To +facilitate this, Perl helps you lay out your output page in your code in a +fashion that's close to how it will look when it's printed. It can keep +track of things like how many lines on a page, what page you're, when to +print page headers, etc. The keywords used are borrowed from FORTRAN: +format() to declare and write() to execute; see their entries in +L. Fortunately, the layout is much more legible, more like +BASIC's PRINT USING statement. Think of it as a poor man's nroff(1). + +Formats, like packages and subroutines, are declared rather than executed, +so they may occur at any point in your program. (Usually it's best to +keep them all together though.) They have their own namespace apart from +all the other "types" in Perl. This means that if you have a function +named "Foo", it is not the same thing as having a format named "Foo". +However, the default name for the format associated with a given +filehandle is the same as the name of the filehandle. Thus, the default +format for STDOUT is name "STDOUT", and the default format for filehandle +TEMP is name "TEMP". They just look the same. They aren't. + +Output record formats are declared as follows: + + format NAME = + FORMLIST + . + +If name is omitted, format "STDOUT" is defined. FORMLIST consists of a +sequence of lines, each of which may be of one of three types: + +=over 4 + +=item 1. + +A comment, indicated by putting a '#' in the first column. + +=item 2. + +A "picture" line giving the format for one output line. + +=item 3. + +An argument line supplying values to plug into the previous picture line. + +=back + +Picture lines are printed exactly as they look, except for certain fields +that substitute values into the line. Each field in a picture line starts +with either "@" (at) or "^" (caret). These lines do not undergo any kind +of variable interpolation. The at field (not to be confused with the array +marker @) is the normal kind of field; the other kind, caret fields, are used +to do rudimentary multi-line text block filling. The length of the field +is supplied by padding out the field with multiple "<", ">", or "|" +characters to specify, respectively, left justification, right +justification, or centering. If the variable would exceed the width +specified, it is truncated. + +As an alternate form of right justification, you may also use "#" +characters (with an optional ".") to specify a numeric field. This way +you can line up the decimal points. If any value supplied for these +fields contains a newline, only the text up to the newline is printed. +Finally, the special field "@*" can be used for printing multi-line, +non-truncated values; it should appear by itself on a line. + +The values are specified on the following line in the same order as +the picture fields. The expressions providing the values should be +separated by commas. The expressions are all evaluated in a list context +before the line is processed, so a single list expression could produce +multiple list elements. The expressions may be spread out to more than +one line if enclosed in braces. If so, the opening brace must be the first +token on the first line. + +Picture fields that begin with ^ rather than @ are treated specially. +With a # field, the field is blanked out if the value is undefined. For +other field types, the caret enables a kind of fill mode. Instead of an +arbitrary expression, the value supplied must be a scalar variable name +that contains a text string. Perl puts as much text as it can into the +field, and then chops off the front of the string so that the next time +the variable is referenced, more of the text can be printed. (Yes, this +means that the variable itself is altered during execution of the write() +call, and is not returned.) Normally you would use a sequence of fields +in a vertical stack to print out a block of text. You might wish to end +the final field with the text "...", which will appear in the output if +the text was too long to appear in its entirety. You can change which +characters are legal to break on by changing the variable C<$:> (that's +$FORMAT_LINE_BREAK_CHARACTERS if you're using the English module) to a +list of the desired characters. + +Since use of caret fields can produce variable length records. If the text +to be formatted is short, you can suppress blank lines by putting a +"~" (tilde) character anywhere in the line. The tilde will be translated +to a space upon output. If you put a second tilde contiguous to the +first, the line will be repeated until all the fields on the line are +exhausted. (If you use a field of the at variety, the expression you +supply had better not give the same value every time forever!) + +Top-of-form processing is by default handled by a format with the +same name as the current filehandle with "_TOP" concatenated to it. +It's triggered at the top of each page. See . + +Examples: + + # a report on the /etc/passwd file + format STDOUT_TOP = + Passwd File + Name Login Office Uid Gid Home + ------------------------------------------------------------------ + . + format STDOUT = + @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<< + $name, $login, $office,$uid,$gid, $home + . + + + # a report from a bug report form + format STDOUT_TOP = + Bug Reports + @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>> + $system, $%, $date + ------------------------------------------------------------------ + . + format STDOUT = + Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $subject + Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $index, $description + Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $priority, $date, $description + From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $from, $description + Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $programmer, $description + ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $description + ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $description + ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $description + ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $description + ~ ^<<<<<<<<<<<<<<<<<<<<<<<... + $description + . + +It is possible to intermix print()s with write()s on the same output +channel, but you'll have to handle $- ($FORMAT_LINES_LEFT) +yourself. + +=head2 Format Variables + +The current format name is stored in the variable C<$~> ($FORMAT_NAME), +and the current top of form format name is in C<$^> ($FORMAT_TOP_NAME). +The current output page number is stored in C<$%> ($FORMAT_PAGE_NUMBER), +and the number of lines on the page is in C<$=> ($FORMAT_LINES_PER_PAGE). +Whether to autoflush output on this handle is stored in $<$|> +($OUTPUT_AUTOFLUSH). The string output before each top of page (except +the first) is stored in C<$^L> ($FORMAT_FORMFEED). These variables are +set on a per-filehandle basis, so you'll need to select() into a different +one to affect them: + + select((select(OUTF), + $~ = "My_Other_Format", + $^ = "My_Top_Format" + )[0]); + +Pretty ugly, eh? It's a common idiom though, so don't be too surprised +when you see it. You can at least use a temporary variable to hold +the previous filehandle: (this is a much better approach in general, +because not only does legibility improve, you now have intermediary +stage in the expression to single-step the debugger through): + + $ofh = select(OUTF); + $~ = "My_Other_Format"; + $^ = "My_Top_Format"; + select($ofh); + +If you use the English module, you can even read the variable names: + + use English; + $ofh = select(OUTF); + $FORMAT_NAME = "My_Other_Format"; + $FORMAT_TOP_NAME = "My_Top_Format"; + select($ofh); + +But you still have those funny select()s. So just use the FileHandle +module. Now, you can access these special variables using lower-case +method names instead: + + use FileHandle; + format_name OUTF "My_Other_Format"; + format_top_name OUTF "My_Top_Format"; + +Much better! + +=head1 NOTES + +Since the values line may contain arbitrary expression (for at fields, +not caret fields), you can farm out any more sophisticated processing +to other functions, like sprintf() or one of your own. For example: + + format Ident = + @<<<<<<<<<<<<<<< + &commify($n) + . + +To get a real at or caret into the field, do this: + + format Ident = + I have an @ here. + "@" + . + +To center a whole line of text, do something like this: + + format Ident = + @||||||||||||||||||||||||||||||||||||||||||||||| + "Some text line" + . + +There is no builtin way to say "float this to the right hand side +of the page, however wide it is." You have to specify where it goes. +The truly desperate can generate their own format on the fly, based +on the current number of columns, and then eval() it: + + $format = "format STDOUT = \n"; + . '^' . '<' x $cols . "\n"; + . '$entry' . "\n"; + . "\t^" . "<" x ($cols-8) . "~~\n"; + . '$entry' . "\n"; + . ".\n"; + print $format if $Debugging; + eval $format; + die $@ if $@; + +Which would generate a format looking something like this: + + format STDOUT = + ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< + $entry + ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~ + $entry + . + +Here's a little program that's somewhat like fmt(1): + + format = + ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ~~ + $_ + + . + + $/ = ''; + while (<>) { + s/\s*\n\s*/ /g; + write; + } + +=head2 Footers + +While $FORMAT_TOP_NAME contains the name of the current header format, +there is no corresponding mechanism to automatically do the same thing +for a footer. Not knowing how big a format is going to be until you +evaluate it is one of the major problems. It's on the TODO list. + +Here's one strategy: If you have a fixed-size footer, you can get footers +by checking $FORMAT_LINES_LEFT before each write() and print the footer +yourself if necessary. + +Here's another strategy; open a pipe to yourself, using C +(see L) and always write() to MESELF instead of +STDOUT. Have your child process postprocesses its STDIN to rearrange +headers and footers however you like. Not very convenient, but doable. + +=head2 Accessing Formatting Internals + +For low-level access to the formatting mechanism. you may use formline() +and access C<$^A> (the $ACCUMULATOR variable) directly. + +For example: + + $str = formline <<'END', 1,2,3; + @<<< @||| @>>> + END + + print "Wow, I just stored `$^A' in the accumulator!\n"; + +Or to make an swrite() subroutine which is to write() what sprintf() +is to printf(), do this: + + use English; + use Carp; + sub swrite { + croak "usage: swrite PICTURE ARGS" unless @ARG; + local($ACCUMULATOR); + formline(@ARG); + return $ACCUMULATOR; + } + + $string = swrite(<<'END', 1, 2, 3); + Check me out + @<<< @||| @>>> + END + print $string; + +=head1 WARNING + +During the execution of a format, only global variables are visible, +or dynamically-scoped ones declared with local(). Lexically scoped +variables declared with my() are I available, as they are not +considered to reside in the same lexical scope as the format. -- cgit v1.2.1