Diffstat (limited to 'pod/perlform.pod')
-rw-r--r-- pod/perlform.pod | 314
1 files changed, 314 insertions, 0 deletions
diff --git a/pod/perlform.pod b/pod/perlform.pod
new file mode 100644
index 0000000000..38d7153e8b
--- /dev/null
+++ b/pod/perlform.pod
@@ -0,0 +1,314 @@
+=head1 NAME
+
+perlform - Perl formats
+
+=head1 DESCRIPTION
+
+Perl has a mechanism to help you generate simple reports and charts. To
+facilitate this, Perl helps you lay out your output page in your code in a
+fashion that's close to how it will look when it's printed. It can keep
+track of things like how many lines are on a page, what page you're on, when to
+print page headers, etc. The keywords used are borrowed from FORTRAN:
+format() to declare and write() to execute; see their entries in
+L<perlfunc>. Fortunately, the layout is much more legible, more like
+BASIC's PRINT USING statement. Think of it as a poor man's nroff(1).
+
+Formats, like packages and subroutines, are declared rather than executed,
+so they may occur at any point in your program. (Usually it's best to
+keep them all together though.) They have their own namespace apart from
+all the other "types" in Perl. This means that if you have a function
+named "Foo", it is not the same thing as having a format named "Foo".
+However, the default name for the format associated with a given
+filehandle is the same as the name of the filehandle. Thus, the default
+format for STDOUT is named "STDOUT", and the default format for filehandle
+TEMP is named "TEMP". They just look the same. They aren't.
+
+Output record formats are declared as follows:
+
+ format NAME =
+ FORMLIST
+ .
+
+If name is omitted, format "STDOUT" is defined. FORMLIST consists of a
+sequence of lines, each of which may be of one of three types:
+
+=over 4
+
+=item 1.
+
+A comment, indicated by putting a '#' in the first column.
+
+=item 2.
+
+A "picture" line giving the format for one output line.
+
+=item 3.
+
+An argument line supplying values to plug into the previous picture line.
+
+=back
+
+Picture lines are printed exactly as they look, except for certain fields
+that substitute values into the line. Each field in a picture line starts
+with either "@" (at) or "^" (caret). These lines do not undergo any kind
+of variable interpolation. The at field (not to be confused with the array
+marker @) is the normal kind of field; the other kind, caret fields, are used
+to do rudimentary multi-line text block filling. The length of the field
+is supplied by padding out the field with multiple "<", ">", or "|"
+characters to specify, respectively, left justification, right
+justification, or centering. If the variable would exceed the width
+specified, it is truncated.
+
+As an alternate form of right justification, you may also use "#"
+characters (with an optional ".") to specify a numeric field. This way
+you can line up the decimal points. If any value supplied for these
+fields contains a newline, only the text up to the newline is printed.
+Finally, the special field "@*" can be used for printing multi-line,
+non-truncated values; it should appear by itself on a line.
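+
+As a small illustration (the format name PRICE and the variables $item
+and $cost are made up for this sketch), a numeric field keeps the decimal
+points in one column; both costs below should come out right-justified
+with two decimal places:
+
+    format PRICE =
+    @<<<<<<<<< @###.##
+    $item,     $cost
+    .
+
+    $~ = "PRICE";                           # use PRICE for the current handle
+    ($item, $cost) = ("widget", 3.5);    write;
+    ($item, $cost) = ("gadget", 120.25); write;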
+
+The values are specified on the following line in the same order as
+the picture fields. The expressions providing the values should be
+separated by commas. The expressions are all evaluated in a list context
+before the line is processed, so a single list expression could produce
+multiple list elements. The expressions may be spread out to more than
+one line if enclosed in braces. If so, the opening brace must be the first
+token on the first line.
+
+Picture fields that begin with ^ rather than @ are treated specially.
+With a # field, the field is blanked out if the value is undefined. For
+other field types, the caret enables a kind of fill mode. Instead of an
+arbitrary expression, the value supplied must be a scalar variable name
+that contains a text string. Perl puts as much text as it can into the
+field, and then chops off the front of the string so that the next time
+the variable is referenced, more of the text can be printed. (Yes, this
+means that the variable itself is altered during execution of the write()
+call, and is not restored.) Normally you would use a sequence of fields
+in a vertical stack to print out a block of text. You might wish to end
+the final field with the text "...", which will appear in the output if
+the text was too long to appear in its entirety. You can change which
+characters are legal to break on by changing the variable C<$:> (that's
+$FORMAT_LINE_BREAK_CHARACTERS if you're using the English module) to a
+list of the desired characters.
+
+Using caret fields can produce variable length records. If the text
+to be formatted is short, you can suppress blank lines by putting a
+"~" (tilde) character anywhere in the line. The tilde will be translated
+to a space upon output. If you put a second tilde contiguous to the
+first, the line will be repeated until all the fields on the line are
+exhausted. (If you use a field of the at variety, the expression you
+supply had better not give the same value every time forever!)
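+
+For instance, assuming a global $text that holds one long string (the
+names WRAP and $text are only illustrative), a single repeated caret line
+prints as many lines as the text needs and no blank ones:
+
+    format WRAP =
+    ~~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<
+    $text
+    .
+
+    $~ = "WRAP";
+    $text = "The quick brown fox jumps over the lazy dog, again and again.";
+    write;      # $text is consumed as it is printed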
+
+Top-of-form processing is by default handled by a format with the
+same name as the current filehandle with "_TOP" concatenated to it.
+It's triggered at the top of each page. See L<perlfunc/write()>.
+
+Examples:
+
+ # a report on the /etc/passwd file
+ format STDOUT_TOP =
+                         Passwd File
+ Name                Login    Office   Uid   Gid Home
+ ------------------------------------------------------------------
+ .
+ format STDOUT =
+ @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
+ $name,              $login,  $office,$uid,$gid, $home
+ .
+
+
+ # a report from a bug report form
+ format STDOUT_TOP =
+ Bug Reports
+ @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>>
+ $system, $%, $date
+ ------------------------------------------------------------------
+ .
+ format STDOUT =
+ Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $subject
+ Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $index, $description
+ Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $priority, $date, $description
+ From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $from, $description
+ Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $programmer, $description
+ ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $description
+ ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $description
+ ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $description
+ ~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $description
+ ~ ^<<<<<<<<<<<<<<<<<<<<<<<...
+ $description
+ .
+
+It is possible to intermix print()s with write()s on the same output
+channel, but you'll have to handle $- ($FORMAT_LINES_LEFT)
+yourself.
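+
+A minimal sketch of that bookkeeping, assuming the extra text is exactly
+one line and goes to the currently selected handle:
+
+    print "-- end of section --\n";
+    $- -= 1 if $- > 0;      # one line was used up behind write()'s back
+
+Setting C<$-> to 0 outright makes the next write() start a new page.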
+
+=head2 Format Variables
+
+The current format name is stored in the variable C<$~> ($FORMAT_NAME),
+and the current top of form format name is in C<$^> ($FORMAT_TOP_NAME).
+The current output page number is stored in C<$%> ($FORMAT_PAGE_NUMBER),
+and the number of lines on the page is in C<$=> ($FORMAT_LINES_PER_PAGE).
+Whether to autoflush output on this handle is stored in C<$|>
+($OUTPUT_AUTOFLUSH). The string output before each top of page (except
+the first) is stored in C<$^L> ($FORMAT_FORMFEED). These variables are
+set on a per-filehandle basis, so you'll need to select() into a different
+one to affect them:
+
+ select((select(OUTF),
+ $~ = "My_Other_Format",
+ $^ = "My_Top_Format"
+ )[0]);
+
+Pretty ugly, eh? It's a common idiom though, so don't be too surprised
+when you see it. You can at least use a temporary variable to hold
+the previous filehandle. This is a much better approach in general:
+not only does legibility improve, but you now have an intermediary
+stage in the expression to single-step the debugger through:
+
+ $ofh = select(OUTF);
+ $~ = "My_Other_Format";
+ $^ = "My_Top_Format";
+ select($ofh);
+
+If you use the English module, you can even read the variable names:
+
+ use English;
+ $ofh = select(OUTF);
+ $FORMAT_NAME = "My_Other_Format";
+ $FORMAT_TOP_NAME = "My_Top_Format";
+ select($ofh);
+
+But you still have those funny select()s. So just use the FileHandle
+module. Now, you can access these special variables using lower-case
+method names instead:
+
+ use FileHandle;
+ format_name OUTF "My_Other_Format";
+ format_top_name OUTF "My_Top_Format";
+
+Much better!
+
+=head1 NOTES
+
+Since the values line may contain arbitrary expressions (for at fields,
+not caret fields), you can farm out any more sophisticated processing
+to other functions, like sprintf() or one of your own. For example:
+
+ format Ident =
+ @<<<<<<<<<<<<<<<
+ &commify($n)
+ .
+
+To get a real at or caret into the field, do this:
+
+ format Ident =
+ I have an @ here.
+ "@"
+ .
+
+To center a whole line of text, do something like this:
+
+ format Ident =
+ @|||||||||||||||||||||||||||||||||||||||||||||||
+ "Some text line"
+ .
+
+There is no builtin way to say "float this to the right hand side
+of the page, however wide it is." You have to specify where it goes.
+The truly desperate can generate their own format on the fly, based
+on the current number of columns, and then eval() it:
+
+ $format = "format STDOUT = \n"
+         . '^' . '<' x $cols . "\n"
+         . '$entry' . "\n"
+         . "\t^" . "<" x ($cols-8) . "~~\n"
+         . '$entry' . "\n"
+         . ".\n";
+ print $format if $Debugging;
+ eval $format;
+ die $@ if $@;
+
+Which would generate a format looking something like this:
+
+ format STDOUT =
+ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ $entry
+ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
+ $entry
+ .
+
+Here's a little program that's somewhat like fmt(1):
+
+ format =
+ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ~~
+ $_
+
+ .
+
+ $/ = '';
+ while (<>) {
+ s/\s*\n\s*/ /g;
+ write;
+ }
+
+=head2 Footers
+
+While $FORMAT_TOP_NAME contains the name of the current header format,
+there is no corresponding mechanism to automatically do the same thing
+for a footer. Not knowing how big a format is going to be until you
+evaluate it is one of the major problems. It's on the TODO list.
+
+Here's one strategy: If you have a fixed-size footer, you can get footers
+by checking $FORMAT_LINES_LEFT before each write() and printing the footer
+yourself if necessary.
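+
+A rough sketch of this first strategy follows. It assumes every record is
+exactly one line high and that the report goes to the currently selected
+handle; @rows, $row, $record_lines and $footer_lines are names invented
+here for illustration:
+
+    use English;
+    $record_lines = 1;      # assumed height of one record
+    $footer_lines = 2;      # height of the footer printed below
+
+    foreach $row (@rows) {
+        if ($FORMAT_PAGE_NUMBER > 0
+            && $FORMAT_LINES_LEFT < $record_lines + $footer_lines) {
+            # pad down to where the footer belongs, then print it
+            print "\n" x ($FORMAT_LINES_LEFT - $footer_lines);
+            print "\n-- page $FORMAT_PAGE_NUMBER --\n";
+            $FORMAT_LINES_LEFT = 0;     # next write() starts a new page
+        }
+        write;
+    }
+
+The last page gets no footer this way; print one after the loop if you
+need it.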
+
+Here's another strategy: open a pipe to yourself, using C<open(MESELF, "|-")>
+(see L<perlfunc/open()>) and always write() to MESELF instead of
+STDOUT. Have your child process postprocess its STDIN to rearrange
+headers and footers however you like. Not very convenient, but doable.
+
+=head2 Accessing Formatting Internals
+
+For low-level access to the formatting mechanism, you may use formline()
+and access C<$^A> (the $ACCUMULATOR variable) directly.
+
+For example:
+
+ $str = formline <<'END', 1,2,3;
+ @<<< @||| @>>>
+ END
+
+ print "Wow, I just stored `$^A' in the accumulator!\n";
+
+Or to make an swrite() subroutine which is to write() what sprintf()
+is to printf(), do this:
+
+ use English;
+ use Carp;
+ sub swrite {
+ croak "usage: swrite PICTURE ARGS" unless @ARG;
+ local($ACCUMULATOR);
+ formline(@ARG);
+ return $ACCUMULATOR;
+ }
+
+ $string = swrite(<<'END', 1, 2, 3);
+ Check me out
+ @<<< @||| @>>>
+ END
+ print $string;
+
+=head1 WARNING
+
+During the execution of a format, only global variables are visible,
+or dynamically-scoped ones declared with local(). Lexically scoped
+variables declared with my() are I<NOT> available, as they are not
+considered to reside in the same lexical scope as the format.
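+
+For example, assuming a format that refers to $name (a name chosen here
+only for illustration), a subroutine has to hand the value over in a
+global or a local() variable:
+
+    format STDOUT =
+    Hello, @<<<<<<<<<<
+    $name
+    .
+
+    sub greet {
+        local($name) = @_;      # local(), not my(), so the format sees it
+        write;
+    }
+
+    greet("world");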