author | Alain Magloire <alainm@rcsm.ee.mcgill.ca> | 1998-11-22 06:46:27 +0000 |
---|---|---|
committer | Alain Magloire <alainm@rcsm.ee.mcgill.ca> | 1998-11-22 06:46:27 +0000 |
commit | 695fbff68e97129065ec5060975cfd630c54248c (patch) | |
tree | 8a3b3b92026353d3e855adf5fd57906e85a7bd4f /doc | |
parent | a0216ebb9728c8de4def988fdb3b498db2cfe11a (diff) | |
download | grep-695fbff68e97129065ec5060975cfd630c54248c.tar.gz |
auto generated, no need for them in the repository.
new directory doc enter.
support for DOS.
move in doc.
updated.
* doc/: New directory, grep.1, {e,f}grep.man move here
with a draft of grep.texi(base of sed.texi).
* tests/{ere,bre}.*: New files. The spencer2 test is split
in two ere/bre.
* config.hin: New, config.h.in rename to config.hin for OS
with limited file system aka DOS.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/Makefile.am | 18 |
-rw-r--r-- | doc/egrep.man | 1 |
-rw-r--r-- | doc/fgrep.man | 1 |
-rw-r--r-- | doc/grep.1 | 477 |
-rw-r--r-- | doc/grep.texi | 765 |
5 files changed, 1262 insertions, 0 deletions
diff --git a/doc/Makefile.am b/doc/Makefile.am
new file mode 100644
index 00000000..b93d96c3
--- /dev/null
+++ b/doc/Makefile.am
@@ -0,0 +1,18 @@
+## Process this file with automake to create Makefile.in
+AUTOMAKE_OPTIONS=no-dependencies
+
+info_TEXINFOS = grep.texi
+
+man_MANS = grep.1 fgrep.1 egrep.1
+
+EXTRA_DIST = grep.1 egrep.man fgrep.man
+
+CLEANFILES = egrep.1 fgrep.1
+
+fgrep.1: fgrep.man
+	inst=`echo "grep" | sed '$(transform)'`.1; \
+sed -e "s%@grep@%$$inst%g" $(srcdir)/fgrep.man > $@
+
+egrep.1: egrep.man
+	inst=`echo "grep" | sed '$(transform)'`.1; \
+sed -e "s%@grep@%$$inst%g" $(srcdir)/egrep.man > $@
diff --git a/doc/egrep.man b/doc/egrep.man
new file mode 100644
index 00000000..877a8a89
--- /dev/null
+++ b/doc/egrep.man
@@ -0,0 +1 @@
+.so man1/@grep@
diff --git a/doc/fgrep.man b/doc/fgrep.man
new file mode 100644
index 00000000..877a8a89
--- /dev/null
+++ b/doc/fgrep.man
@@ -0,0 +1 @@
+.so man1/@grep@
diff --git a/doc/grep.1 b/doc/grep.1
new file mode 100644
index 00000000..3b957a0c
--- /dev/null
+++ b/doc/grep.1
@@ -0,0 +1,477 @@
+.\" grep man page
+.de Id
+.ds Dt \\$4
+..
+.Id $Id: grep.1,v 1.1 1998/11/22 06:45:20 alainm Exp $
+.TH GREP 1 \*(Dt "GNU Project"
+.SH NAME
+grep, egrep, fgrep \- print lines matching a pattern
+.SH SYNOPSIS
+.B grep
+[-[AB] NUM] [-CEFGVabchiLlnqrsvwxyUu] [-e PATTERN | -f FILE]
+[-d ACTION] [--directories=ACTION]
+[--extended-regexp] [--fixed-strings] [--basic-regexp]
+[--regexp=PATTERN] [--file=FILE] [--ignore-case] [--word-regexp]
+[--line-regexp] [--line-regexp] [--no-messages] [--revert-match]
+[--version] [--help] [--byte-offset] [--line-number]
+[--with-filename] [--no-filename] [--quiet] [--silent] [--text]
+[--files-without-match] [--files-with-matcces] [--count]
+[--before-context=NUM] [--after-context=NUM] [--context]
+[--binary] [--unix-byte-offsets] [--recursive]
+.I files...
+.SH DESCRIPTION +.PP +.B Grep +searches the named input +.I files +(or standard input if no files are named, or +the file name +.B \- +is given) +for lines containing a match to the given +.IR pattern . +By default, +.B grep +prints the matching lines. +.PP +There are three major variants of +.BR grep , +controlled by the following options. +.PD 0 +.TP +.B \-G, --basic-regexp +Interpret +.I pattern +as a basic regular expression (see below). This is the default. +.TP +.B \-E, --extended-regexp +Interpret +.I pattern +as an extended regular expression (see below). +.TP +.B \-F, --fixed-strings +Interpret +.I pattern +as a list of fixed strings, separated by newlines, +any of which is to be matched. +.LP +In addition, two variant programs +.B egrep +and +.B fgrep +are available. +.B Egrep +is similar (but not identical) to +.BR "grep\ \-E" , +and is compatible with the historical Unix +.BR egrep . +.B Fgrep +is the same as +.BR "grep\ \-F" . +.PD +.LP +All variants of +.B grep +understand the following options: +.PD 0 +.TP +.BI \-A " NUM" ", --after-context=" NUM +Print +.I NUM +lines of trailing context after matching lines. +.TP +.BI \-B " NUM" ", --before-context=" NUM +Print +.I NUM +lines of leading context before matching lines. +.TP +.BI \-C ,\ --context"[=NUM]" +Print +.I NUM +lines (default 2) of output context. +.TP +.BI \- NUM \ +Same as --context=NUM lines of leading and trailing context. However, +.B grep +will never print any given line more than once. +.TP +.B \-V, --version +Print the version number of +.B grep +to standard error. This version number should +be included in all bug reports (see below). +.TP +.B \-b, --byte-offset +Print the byte offset within the input file before +each line of output. +.TP +.B \-c, --count +Suppress normal output; instead print a count of +matching lines for each input file. +With the +.B \-v, --revert-match +option (see below), count non-matching lines. 
+.TP +.BI \-d " ACTION" ", --directories=" ACTION +If an input file is a directory, use +.I ACTION +to process it. By default, +.I ACTION +is +.BR read , +which means that directories are read just as if they were ordinary files. +If +.I ACTION +is +.BR skip , +directories are silently skipped. +If +.I ACTION +is +.BR recurse , +.B +grep reads all files under each directory, recursively; +this is equivalent to the +.B \-r +option. +.TP +.BI \-e " PATTERN" ", --regexp=" PATTERN +Use +.I PATTERN +as the pattern; useful to protect patterns beginning with +.BR \- . +.TP +.BI \-f " FILE" ", --file=" FILE +Obtain patterns from +.IR FILE , +one per line. +The empty file contains zero patterns, and therfore matches nothing. +.TP +.B \-h, --no-filename +Suppress the prefixing of filenames on output +when multiple files are searched. +.TP +.B \-i, --ignore-case +Ignore case distinctions in both the +.I pattern +and the input files. +.TP +.B \-L, --files-without-match +Suppress normal output; instead print the name +of each input file from which no output would +normally have been printed. The scanning will stop +on the first match. +.TP +.B \-l, --files-with-matches +Suppress normal output; instead print +the name of each input file from which output +would normally have been printed. The scanning will +stop on the first match. +.TP +.B \-n, --line-number +Prefix each line of output with the line number +within its input file. +.TP +.B \-q, --quiet, --silent +Quiet; suppress normal output. The scanning will stop +on the first match. +Also see the +.B \-s +or +.B --no-messages +option below. +.TP +.B \-r, --recursive +Read all files under each directory, recursively; +this is equivalent to the +.B "\-d recurse" +option. +.TP +.B \-s, --no-messages +Suppress error messages about nonexistent or unreadable files. 
+Portability note: unlike GNU +.BR grep , +BSD +.B grep +does not comply with POSIX.2, because BSD +.B grep +lacks a +.B \-q +option and its +.B \-s +option behaves like GNU +.BR grep 's +.B \-q +option. +Shell scripts intended to be portable to BSD +.B grep +should avoid both +.B \-q +and +.B \-s +and should redirect output to /dev/null instead. +.TP +.B \-a, --text +Do not suppress output lines that contain binary data. +Normally, if the first few bytes of a file indicate that +the file contains binary data, +.B grep +outputs only a message saying that the file matches the pattern. +This option causes +.B grep +to act as if the file is a text file, +even if it would otherwise be treated as binary. +.TP +.B \-v, --revert-match +Invert the sense of matching, to select non-matching lines. +.TP +.B \-w, --word-regexp +Select only those lines containing matches that form whole words. +The test is that the matching substring must either be at the +beginning of the line, or preceded by a non-word constituent +character. Similarly, it must be either at the end of the line +or followed by a non-word constituent character. Word-constituent +characters are letters, digits, and the underscore. +.TP +.B \-x, --line-regexp +Select only those matches that exactly match the whole line. +.TP +.B \-y +Obsolete synonym for +.BR \-i . +.TP +.B \-U, --binary +Treat the file(s) as binary. By default, under MS-DOS and MS-Windows, +.BR grep +guesses the file type by looking at the contents of the first 32KB +read from the file. If +.BR grep +decides the file is a text file, it strips the CR characters from the +original file contents (to make regular expressions with +.B ^ +and +.B $ +work correctly). Specifying +.B \-U +overrules this guesswork, causing all files to be read and passed to the +matching mechanism verbatim; if the file is a text file with CR/LF +pairs at the end of each line, this will cause some regular +expressions to fail. 
This option is only supported on MS-DOS and +MS-Windows. +.TP +.B \-u, --unix-byte-offsets +Report Unix-style byte offsets. This switch causes +.B grep +to report byte offsets as if the file were Unix-style text file, i.e. with +CR characters stripped off. This will produce results identical to running +.B grep +on a Unix machine. This option has no effect unless +.B \-b +option is also used; it is only supported on MS-DOS and MS-Windows. +.PD +.SH "REGULAR EXPRESSIONS" +.PP +A regular expression is a pattern that describes a set of strings. +Regular expressions are constructed analogously to arithmetic +expressions, by using various operators to combine smaller expressions. +.PP +.B Grep +understands two different versions of regular expression syntax: +``basic'' and ``extended.'' In +.RB "GNU\ " grep , +there is no difference in available functionality using either syntax. +In other implementations, basic regular expressions are less powerful. +The following description applies to extended regular expressions; +differences for basic regular expressions are summarized afterwards. +.PP +The fundamental building blocks are the regular expressions that match +a single character. Most characters, including all letters and digits, +are regular expressions that match themselves. Any metacharacter with +special meaning may be quoted by preceding it with a backslash. +.PP +A list of characters enclosed by +.B [ +and +.B ] +matches any single +character in that list; if the first character of the list +is the caret +.B ^ +then it matches any character +.I not +in the list. +For example, the regular expression +.B [0123456789] +matches any single digit. A range of ASCII characters +may be specified by giving the first and last characters, separated +by a hyphen. +Finally, certain named classes of characters are predefined. 
+Their names are self explanatory, and they are +.BR [:alnum:] , +.BR [:alpha:] , +.BR [:cntrl:] , +.BR [:digit:] , +.BR [:graph:] , +.BR [:lower:] , +.BR [:print:] , +.BR [:punct:] , +.BR [:space:] , +.BR [:upper:] , +and +.BR [:xdigit:]. +For example, +.B [[:alnum:]] +means +.BR [0-9A-Za-z] , +except the latter form is dependent upon the ASCII character encoding, +whereas the former is portable. +(Note that the brackets in these class names are part of the symbolic +names, and must be included in addition to the brackets delimiting +the bracket list.) Most metacharacters lose their special meaning +inside lists. To include a literal +.B ] +place it first in the list. Similarly, to include a literal +.B ^ +place it anywhere but first. Finally, to include a literal +.B \- +place it last. +.PP +The period +.B . +matches any single character. +The symbol +.B \ew +is a synonym for +.B [[:alnum:]] +and +.B \eW +is a synonym for +.BR [^[:alnum]] . +.PP +The caret +.B ^ +and the dollar sign +.B $ +are metacharacters that respectively match the empty string at the +beginning and end of a line. +The symbols +.B \e< +and +.B \e> +respectively match the empty string at the beginning and end of a word. +The symbol +.B \eb +matches the empty string at the edge of a word, +and +.B \eB +matches the empty string provided it's +.I not +at the edge of a word. +.PP +A regular expression may be followed by one of several repetition operators: +.PD 0 +.TP +.B ? +The preceding item is optional and matched at most once. +.TP +.B * +The preceding item will be matched zero or more times. +.TP +.B + +The preceding item will be matched one or more times. +.TP +.BI { n } +The preceding item is matched exactly +.I n +times. +.TP +.BI { n ,} +The preceding item is matched +.I n +or more times. +.TP +.BI {, m } +The preceding item is optional and is matched at most +.I m +times. +.TP +.BI { n , m } +The preceding item is matched at least +.I n +times, but not more than +.I m +times. 
+.PD +.PP +Two regular expressions may be concatenated; the resulting +regular expression matches any string formed by concatenating +two substrings that respectively match the concatenated +subexpressions. +.PP +Two regular expressions may be joined by the infix operator +.BR | ; +the resulting regular expression matches any string matching +either subexpression. +.PP +Repetition takes precedence over concatenation, which in turn +takes precedence over alternation. A whole subexpression may be +enclosed in parentheses to override these precedence rules. +.PP +The backreference +.BI \e n\c +\&, where +.I n +is a single digit, matches the substring +previously matched by the +.IR n th +parenthesized subexpression of the regular expression. +.PP +In basic regular expressions the metacharacters +.BR ? , +.BR + , +.BR { , +.BR | , +.BR ( , +and +.BR ) +lose their special meaning; instead use the backslashed +versions +.BR \e? , +.BR \e+ , +.BR \e{ , +.BR \e| , +.BR \e( , +and +.BR \e) . +.PP +In +.B egrep +the metacharacter +.B { +loses its special meaning; instead use +.BR \e{ . +.SH DIAGNOSTICS +.PP +Normally, exit status is 0 if matches were found, +and 1 if no matches were found. (The +.B \-v +option inverts the sense of the exit status.) +Exit status is 2 if there were syntax errors +in the pattern, inaccessible input files, or +other system errors. +.SH BUGS +.PP +Email bug reports to +.BR bug-gnu-utils@gnu.org . +Be sure to include the word ``grep'' somewhere in the ``Subject:'' field. +.PP +Large repetition counts in the +.BI { m , n } +construct may cause grep to use lots of memory. +In addition, +certain other obscure regular expressions require exponential time +and space, and may cause +.B grep +to run out of memory. +.PP +Backreferences are very slow, and may require exponential time. 
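Note on doc/grep.1, DIAGNOSTICS: the exit statuses described there can be checked with a short shell sketch. The helper name `grep_status` and the two-line sample input are assumptions made for illustration and are not part of this commit. The sketch also assumes a POSIX shell and a `grep` that accepts `-E`, `-e` and `-v`.

```shell
#!/bin/sh
# Hypothetical helper (not part of grep): run grep with the given
# arguments on a fixed two-line input, discard all of its output,
# and print only the exit status.
grep_status() {
    printf 'alpha\nbeta\n' | grep "$@" >/dev/null 2>&1
    echo $?
}

grep_status alpha                 # a line matches: prints 0
grep_status gamma                 # no line matches: prints 1
grep_status -v -e alpha -e beta   # -v with every line matching: prints 1
grep_status -E 'a('               # unbalanced parenthesis in the pattern: prints 2
```

The third call illustrates the remark about `-v`: every input line matches one of the patterns, so no non-matching line is selected and the status is 1 rather than 0.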
diff --git a/doc/grep.texi b/doc/grep.texi new file mode 100644 index 00000000..40af5c51 --- /dev/null +++ b/doc/grep.texi @@ -0,0 +1,765 @@ +\input texinfo @c -*-texinfo-*- +@c %**start of header +@setfilename sed.info +@settitle sed, a stream editor +@c %**end of header + +@c This file has the new style title page commands. +@c Run `makeinfo' rather than `texinfo-format-buffer'. + +@c smallbook + +@c tex +@c \overfullrule=0pt +@c end tex + +@include version.texi + +@c Combine indices. +@syncodeindex ky cp +@syncodeindex pg cp +@syncodeindex tp cp + +@defcodeindex op +@syncodeindex op fn + +@ifinfo +@direntry +* sed: (sed). Stream EDitor. +@end direntry +This file documents @sc{sed}, a stream editor. + + +Published by the Free Software Foundation, +59 Temple Place - Suite 330 +Boston, MA 02111-1307, USA + +Copyright (C) 1998 Free Software Foundation, Inc. + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +@ignore +Permission is granted to process this file through TeX and print the +results, provided the printed document carries copying permission +notice identical to this one except for the removal of this paragraph +(this paragraph not being relevant to the printed manual). + +@end ignore +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided that the entire +resulting derived work is distributed under the terms of a permission +notice identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that this permission notice may be stated in a translation approved +by the Foundation. 
+@end ifinfo + +@setchapternewpage off + +@titlepage +@title sed, a stream editor +@subtitle version @value{VERSION}, @value{UPDATED} +@author by Ken Pizzini + +@page +@vskip 0pt plus 1filll +Copyright @copyright{} 1998 Free Software Foundation, Inc. + +@sp 2 +Published by the Free Software Foundation, @* +59 Temple Place - Suite 330, @* +Boston, MA 02111-1307, USA + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided that the entire +resulting derived work is distributed under the terms of a permission +notice identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that this permission notice may be stated in a translation approved +by the Foundation. + +@end titlepage +@page + + +@node Top, Introduction, (dir), (dir) +@comment node-name, next, previous, up + +@ifinfo +This document was produced for version @value{VERSION} of @sc{GNU} @sc{sed}. +@end ifinfo + +@menu +* Introduction:: Introduction +* Invoking SED:: Invocation +* sed Programs:: @sc{sed} programs +* Examples:: Some sample scripts +* Limitations:: About the (non-)limitations on line length +* Other Resources:: Other resources for learning about @sc{sed} +* Reporting Bugs:: Reporting bugs +* Concept Index:: A menu with all the topics in this manual. +* Command and Option Index:: A menu with all @sc{sed} commands and + command-line options. +@end menu + + +@node Introduction, Invoking SED, Top, Top +@chapter Introduction + +@cindex Stream editor +@sc{sed} is a stream editor. +A stream editor is used to perform basic text +transformations on an input stream +(a file or input from a pipeline). 
+While in some ways similar to an editor which +permits scripted edits (such as @sc{ed}), +@sc{sed} works by making only one pass over the +input(s), and is consequently more efficient. +But it is @sc{sed}'s ability to filter text in a pipeline +which particularly distinguishes it from other types of +editors. + + +@node Invoking SED, sed Programs, Introduction, Top +@chapter Invocation + +@sc{sed} may be invoked with the following command-line options: + +@table @samp +@item -V +@itemx --version +@opindex -V +@opindex --version +@cindex Version, printing +Print out the version of @sc{sed} that is being run and a copyright notice, +then exit. + +@item -h +@itemx --help +@opindex -h +@opindex --help +@cindex Usage summary, printing +Print a usage message briefly summarizing these command-line options +and the bug-reporting address, +then exit. + +@item -n +@itemx --quiet +@itemx --silent +@opindex -n +@opindex --quiet +@opindex --silent +By default, @sc{sed} will print out the pattern space +at the end of each cycle through the script. +These options disable this automatic printing, +and @sc{sed} will only produce output when explicitly told to +via the @code{p} command. + +@item -e @var{script} +@itemx --expression=@var{script} +@opindex -e +@opindex --expression +@cindex Script, from command line +Add the commands in @var{script} to the set of commands to be +run while processing the input. + +@item -f @var{script-file} +@itemx --file=@var{script-file} +@opindex -f +@opindex --file +@cindex Script, from a file +Add the commands contained in the file @var{script-file} +to the set of commands to be run while processing the input. + +@end table + +If no @code{-e}, @code{-f}, @code{--expression}, or @code{--file} +options are given on the command-line, +then the first non-option argument on the command line is +taken to be the @var{script} to be executed. 
+ +@cindex Files to be processed as input +If any command-line parameters remain after processing the above, +these parameters are interpreted as the names of input files to +be processed. +@cindex Standard input, processing as input +A file name of @code{-} refers to the standard input stream. +The standard input will be processed if no file names are specified. + + +@node sed Programs, Examples, Invoking SED, Top +@chapter @sc{sed} Programs + +@cindex @sc{sed} program structure +@cindex Script structure +A @sc{sed} program consists of one or more @sc{sed} commands, +passed in by one or more of the +@code{-e}, @code{-f}, @code{--expression}, and @code{--file} +options, or the first non-option argument if zero of these +options are used. +This document will refer to ``the'' @sc{sed} script; +this will be understood to mean the in-order catenation +of all of the @var{script}s and @var{script-file}s passed in. + +Each @sc{sed} command consists of an optional address or +address range, followed by a one-character command name +and any additional command-specific code. + +@menu +* Addresses:: Selecting lines with @sc{sed} +* Regular Expressions:: Overview of regular expression syntax +* Data Spaces:: Where @sc{sed} buffers data +* Common Commands:: Often used commands +* Other Commands:: Less frequently used commands +* Programming Commands:: Commands for die-hard @sc{sed} programmers +@end menu + + +@node Addresses, Regular Expressions, sed Programs, sed Programs +@section Selecting lines with @sc{sed} +@cindex Addresses, in @sc{sed} scripts +@cindex Line selection +@cindex Selecting lines to process + +Addresses in a @sc{sed} script can be in any of the following forms: +@table @samp +@item @var{number} +@cindex Address, numeric +@cindex Line, selecting by number +Specifying a line number will match only that line in the input. +(Note that @sc{sed} counts lines continuously across all input files.) 
+ +@item @var{first}~@var{step} +@cindex @sc{GNU} extensions, @code{@var{n}~@var{m}} addresses +This @sc{GNU} extension matches every @var{step}th line +starting with line @var{first}. +In particular, lines will be selected when there exists +a non-negative @var{n} such that the current line-number equals +@var{first} + (@var{n} * @var{step}). +Thus, to select the odd-numbered lines, +one would use @code{1~2}; +to pick every third line starting with the second, @code{2~3} would be used; +to pick every fifth line starting with the tenth, use @code{10~5}; +and @code{50~0} is just an obscure way of saying @code{50}. + +@item $ +@cindex Address, last line +@cindex Last line, selecting +@cindex Line, selecting last +This address matches the last line of the last file of input. + +@item /@var{regexp}/ +@cindex Address, as a regular expression +@cindex Line, selecting by regular expression match +This will select any line which matches the regular expression @var{regexp}. +If @var{regexp} itself includes any @code{/} characters, +each must be escaped by a backslash (@code{\}). + +@item \%@var{regexp}% +(The @code{%} may be replaced by any other single character.) + +@cindex Slash character, in regular expressions +This also matches the regular expression @var{regexp}, +but allows one to use a different delimiter than @code{/}. +This is particularly useful if the @var{regexp} itself contains +a lot of @code{/}s, since it avoids the tedious escaping of every @code{/}. +If @var{regexp} itself includes any delimiter characters, +each must be escaped by a backslash (@code{\}). + +@item /@var{regexp}/I +@itemx \%@var{regexp}%I +@cindex @sc{GNU} extensions, @code{I} modifier +The @code{I} modifier to regular-expression matching is a @sc{GNU} +extension which causes the @var{regexp} to be matched in +a case-insensitive manner. + +@end table + +If no addresses are given, then all lines are matched; +if one address is given, then only lines matching that +address are matched. 
+ +@cindex Range of lines +@cindex Several lines, selecting +An address range can be specified by specifying two addresses +separated by a comma (@code{,}). +An address range matches lines starting from where the first +address matches, and continues until the second address matches +(inclusively). +If the second address is a @var{regexp}, then checking for the +ending match will start with the line @emph{following} the +line which matched the first address. +If the second address is a @var{number} less than (or equal to) +the line matching the first address, +then only the one line is matched. + +@cindex Excluding lines +@cindex Selecting non-matching lines +Appending the @code{!} character to the end of an address +specification will negate the sense of the match. +That is, if the @code{!} character follows an address range, +then only lines which do @emph{not} match the address range +will be selected. +This also works for singleton addresses, +and, perhaps perversely, for the null address. + + +@node Regular Expressions, Data Spaces, Addresses, sed Programs +@section Overview of regular expression syntax + +@c XXX FIXME +[[I may add a brief overview of regular expressions at a later date; +for now see any of the various other documentations for regular +expressions, such as the @sc{awk} info page.]] + + +@node Data Spaces, Common Commands, Regular Expressions, sed Programs +@section Where @sc{sed} buffers data + +@cindex Buffer spaces, pattern and hold +@cindex Spaces, pattern and hold +@cindex Pattern space, definition +@cindex Hold space, definition +@sc{sed} maintains two data buffers: the active @emph{pattern} space, +and the auxiliary @emph{hold} space. +In ``normal'' operation, @sc{sed} reads in one line from the +input stream and places it in the pattern space. +This pattern space is where text manipulations occur. +The hold space is initially empty, but there are commands +for moving data between the pattern and hold spaces. 
+@c XXX FIXME: explain why this is useful/interesting to know. + + +@node Common Commands, Other Commands, Data Spaces, sed Programs +@section Often used commands + +If you use @sc{sed} at all, you will quite likely want to know +these commands. + +@table @samp +@item # +[No addresses allowed.] + +@findex # (comment) command +@cindex Comments, in scripts +The @code{#} ``command'' begins a comment; +the comment continues until the next newline. + +@cindex Portability, comments +If you are concerned about portability, be aware that +some implementations of @sc{sed} (which are not POSIX.2 +conformant) may only support a single one-line comment, +and then only when the very first character of the script is a @code{#}. + +@findex -n, forcing from within a script +@cindex Caveat --- #n on first line +Warning: if the first two characters of the @sc{sed} script +are @code{#n}, then the @code{-n} (no-autoprint) option is forced. +If you want to put a comment in the first line of your script +and that comment begins with the letter `n' +and you do not want this behavior, +then be sure to either use a capital `N', +or place at least one space before the `n'. + +@item s/@var{regexp}/@var{replacement}/@var{flags} +(The @code{/} characters may be uniformly replaced by +any other single character within any given @code{s} command.) + +@findex s (substitute) command +@cindex Substitution of text +@cindex Replacing text matching regexp +The @code{/} character (or whatever other character is used in its stead) +can appear in the @var{regexp} or @var{replacement} +only if it is preceded by a @code{\} character. +Also newlines may appear in the @var{regexp} using the two +character sequence @code{\n}. + +The @code{s} command attempts to match the pattern +space against the supplied @var{regexp}. +If the match is successful, then that portion of the pattern +space which was matched is replaced with @var{replacement}. 
+ +@cindex Backreferences, in regular expressions +@cindex Parenthesized substrings +The @var{replacement} can contain @code{\@var{n}} (@var{n} being +a number from 1 to 9, inclusive) references, which refer to +the portion of the match which is contained between the @var{n}th +@code{\(} and its matching @code{\)}. +Also, the @var{replacement} can contain unescaped @code{&} +characters which will reference the whole matched portion +of the pattern space. +To include a literal @code{\}, @code{&}, or newline in the final +replacement, be sure to precede the desired @code{\}, @code{&}, +or newline in the @var{replacement} with a @code{\}. + +@findex s command, option flags +@cindex Substitution of text, options +@cindex Replacing text matching regexp, options +The @code{s} command can be followed with zero or more of the +following @var{flags}: + +@table @samp +@item g +@cindex Global substitution +@cindex Replacing all text matching regexp in a line +Apply the replacement to @emph{all} matches to the @var{regexp}, +not just the first. +@item p +@cindex Printing text after substitution +If the substitution was made, then print the new pattern space. +@item @var{number} +@cindex Replacing only @var{n}th match of regexp in a line +Only replace the @var{number}th match of the @var{regexp}. +@item w @var{file-name} +@cindex Write result of a substitution to file +If the substitution was made, then write out the result to the named file. +@item I +(This is a @sc{GNU} extension.) + +@cindex @sc{GNU} extensions, @code{I} modifier +@cindex Case-insensitive matching +Match @var{regexp} in a case-insensitive manner. +@end table + +@item q +[At most one address allowed.] + +@findex q (quit) command +@cindex Quitting +Exit @sc{sed} without processing any more commands or input. +Note that the current pattern space is printed +if auto-print is not disabled. + +@item d +@findex d (delete) command +@cindex Deleting lines +Delete the pattern space; +immediately start next cycle. 
+ +@item p +@findex p (print) command +@cindex Print selected lines +Print out the pattern space (to the standard output). +This command is usually only used in conjunction with the @code{-n} +command-line option. + +@cindex Caveat --- @code{p} command and -n flag +Note: some implementations of @sc{sed}, such as this one, will +double-print lines when auto-print is not disabled and the @code{p} +command is given. +Other implementations will only print the line once. +Both ways conform with the POSIX.2 standard, and so neither +way can be considered to be in error. +@cindex Portability, @code{p} command and -n flag +Portable @sc{sed} scripts should thus avoid relying on either behavior; +either use the @code{-n} option and explicitly print what you want, +or avoid use of the @code{p} command (and also the @code{p} flag to the +@code{s} command). + +@item n +@findex n (next-line) command +@cindex Next input line, replace pattern space with +@cindex Read next input line +If auto-print is not disabled, print the pattern space, +then, regardless, replace the pattern space with the next line of input. +If there is no more input then @sc{sed} exits without processing +any more commands. + +@item @{ @var{commands} @} +@findex @{@} command grouping +@cindex Grouping commands +@cindex Command groups +A group of commands may be enclosed between +@code{@{} and @code{@}} characters. +(The @code{@}} must appear in a zero-address command context.) +This is particularly useful when you want a group of commands +to be triggered by a single address (or address-range) match. + +@end table + + +@node Other Commands, Programming Commands, Common Commands, sed Programs +@section Less frequently used commands + +Though perhaps less frequently used than those in the previous +section, some very small yet useful @sc{sed} scripts can be built with +these commands. 
+ +@table @samp +@item y/@var{source-chars}/@var{dest-chars}/ +(The @code{/} characters may be uniformly replaced by +any other single character within any given @code{y} command.) + +@findex y (transliterate) command +@cindex Transliteration +Transliterate any characters in the pattern space which match +any of the @var{source-chars} with the corresponding character +in @var{dest-chars}. + +Instances of the @code{/} (or whatever other character is used in its stead), +@code{\}, or newlines can appear in the @var{source-chars} or @var{dest-chars} +lists, provide that each instance is escaped by a @code{\}. +The @var{source-chars} and @var{dest-chars} lists @emph{must} +contain the same number of characters (after de-escaping). + +@c XXX was getting a bad page break; remove this @need if formatting changes +@need 1000 +@item a\ +@itemx @var{text} +[At most one address allowed.] + +@findex a (append text lines) command +@cindex Adding a block of text after a line +@cindex Text, appending +Queue the lines of text which follow this command +(each but the last ending with a @code{\}, +which will be removed from the output) +to be output at the end of the current cycle, +or when the next input line is read. + +@item i\ +@itemx @var{text} +[At most one address allowed.] + +@findex i (insert text lines) command +@cindex Inserting a block of text before a line +@cindex Text, insertion +Immediately output the lines of text which follow this command +(each but the last ending with a @code{\}, +which will be removed from the output). + +@item c\ +@itemx @var{text} +@findex c (change to text lines) command +@cindex Replace specific input lines +@cindex Selected lines, replacing +Delete the lines matching the address or address-range, +and output the lines of text which follow this command +(each but the last ending with a @code{\}, +which will be removed from the output) +in place of the last line +(or in place of each line, if no addresses were specified). 
+A new cycle is started after this command is done, +since the pattern space will have been deleted. + +@item = +[At most one address allowed.] + +@findex = (print line number) command +@cindex Print line number +@cindex Line number, print +Print out the current input line number (with a trailing newline). + +@item l +@findex l (list unambiguously) command +@cindex List pattern space +@cindex Print unambiguous representation of pattern space +Print the pattern space in an unambiguous form: +non-printable characters (and the @code{\} character) +are printed in C-style escaped form; +long lines are split, with a trailing @code{\} character +to indicate the split; the end of each line is marked +with a @code{$}. + +@item r @var{filename} +[At most one address allowed.] + +@findex r (read file) command +@cindex Read text from a file +@cindex Insert text from a file +Queue the contents of @var{filename} to be read and +inserted into the output stream at the end of the current cycle, +or when the next input line is read. +Note that if @var{filename} cannot be read, it is treated as +if it were an empty file, without any error indication. + +@item w @var{filename} +@findex w (write file) command +@cindex Write to a file +Write the pattern space to @var{filename}. +The @var{filename} will be created (or truncated) before the +first input line is read; all @code{w} commands (including +instances of the @code{w} flag on successful @code{s} commands) +which refer to the same @var{filename} are output through +the same @sc{FILE} stream. + +@item D +@findex D (delete first line) command +@cindex Delete first line from pattern space +Delete text in the pattern space up to the first newline. +If any text is left, restart the cycle with the resultant +pattern space (without reading a new line of input), +otherwise start a normal new cycle.
+ +@item N +@findex N (append Next line) command +@cindex Next input line, append to pattern space +@cindex Append next input line to pattern space +Add a newline to the pattern space, +then append the next line of input to the pattern space. +If there is no more input then @sc{sed} exits without processing +any more commands. + +@item P +@findex P (print first line) command +@cindex Print first line from pattern space +Print out the portion of the pattern space up to the first newline. + +@item h +@findex h (hold) command +@cindex Copy pattern space into hold space +@cindex Replace hold space with copy of pattern space +@cindex Hold space, copying pattern space into +Replace the contents of the hold space with the contents of the pattern space. + +@item H +@findex H (append Hold) command +@cindex Append pattern space to hold space +@cindex Hold space, appending from pattern space +Append a newline to the contents of the hold space, +and then append the contents of the pattern space to that of the hold space. + +@item g +@findex g (get) command +@cindex Copy hold space into pattern space +@cindex Replace pattern space with copy of hold space +@cindex Hold space, copy into pattern space +Replace the contents of the pattern space with the contents of the hold space. + +@item G +@findex G (appending Get) command +@cindex Append hold space to pattern space +@cindex Hold space, appending to pattern space +Append a newline to the contents of the pattern space, +and then append the contents of the hold space to that of the pattern space. + +@item x +@findex x (eXchange) command +@cindex Exchange hold space with pattern space +@cindex Hold space, exchange with pattern space +Exchange the contents of the hold and pattern spaces. 
+ +@end table + + +@node Programming Commands, , Other Commands, sed Programs +@section Commands for die-hard @sc{sed} programmers + +In most cases, use of these commands indicates that you are +probably better off programming in something like @sc{perl}. +But occasionally one is committed to sticking with @sc{sed}, +and these commands can enable one to write quite convoluted +scripts. + +@cindex Flow of control in scripts +@table @samp +@item : @var{label} +[No addresses allowed.] + +@findex : (label) command +@cindex Labels, in scripts +Specify the location of @var{label} for the @code{b} and @code{t} commands. +In all other respects, a no-op. + +@item b @var{label} +@findex b (branch) command +@cindex Branch to a label, unconditionally +@cindex Goto, in scripts +Unconditionally branch to @var{label}. +The @var{label} may be omitted, in which case the next cycle is started. + +@item t @var{label} +@findex t (conditional branch) command +@cindex Branch to a label, if @code{s///} succeeded +@cindex Conditional branch +Branch to @var{label} only if there has been a successful @code{s}ubstitution +since the last input line was read or @code{t} branch was taken. +The @var{label} may be omitted, in which case the next cycle is started. + +@end table + + +@node Examples, Limitations, sed Programs, Top +@chapter Some sample scripts + +@c XXX FIXME +[[Not this release, sorry. +But check out the scripts in the testsuite directory, +and the amazing dc.sed script in the +top-level directory of this distribution.]] + + +@node Limitations, Other Resources, Examples, Top +@chapter About the (non-)limitations on line length + +@cindex @sc{GNU} extensions, unlimited line length +@cindex Portability, line length limitations +For those who want to write portable @sc{sed} scripts, +be aware that some implementations have been known to +limit line lengths (for the pattern and hold spaces) +to be no more than 4000 bytes. 
+The POSIX.2 standard specifies that conforming @sc{sed} +implementations shall support at least 8192 byte line lengths. +@sc{GNU} @sc{sed} has no built-in limit on line length; +as long as @sc{sed} can malloc() more (virtual) memory, +it will allow lines as long as you care to feed it +(or construct within it). + +@node Other Resources, Reporting Bugs, Limitations, Top +@chapter Other resources for learning about @sc{sed} + +@cindex Additional reading about @sc{sed} +In addition to several books that have been written about @sc{sed} +(either specifically or as chapters in books which discuss +shell programming), one can find out more about @sc{sed} +(including suggestions of a few books) from the FAQ +for the seders mailing list, available from any of: +@display + @uref{http://www.dbnet.ece.ntua.gr/~george/sed/sedfaq.html} + @uref{http://www.ptug.org/sed/sedfaq.htm} +@end display +Also of interest is @uref{http://seders.icheme.org/tutorials/}. + +There is an informal ``seders'' mailing list manually maintained +by Al Aab. To subscribe, send e-mail to @email{af137@@torfree.net} +with a brief description of your interest. + +@node Reporting Bugs, Concept Index, Other Resources, Top +@chapter Reporting bugs + +@cindex Bugs, reporting +Email bug reports to @email{bug-gnu-utils@@gnu.org}. +Be sure to include the word ``sed'' somewhere in the ``Subject:'' field. + +@c XXX FIXME: the term "cycle" is never defined... + +@page +@node Concept Index, Command and Option Index, Reporting Bugs, Top +@unnumbered Concept Index + +This is a general index of all issues discussed in this manual, with the +exception of the @sc{sed} commands and command-line options. + +@printindex cp + +@page +@node Command and Option Index, , Concept Index, Top +@unnumbered Command and Option Index + +This is an alphabetical list of all @sc{sed} commands and command-line +options. + +@printindex fn + +@contents +@bye |