author Bernhard Rosenkraenzer <bero@arklinux.org> 2006-08-18 21:26:54 +0000
committer Bernhard Rosenkraenzer <bero@arklinux.org> 2006-08-18 21:26:54 +0000
commit f149209e9ef3994cd55c27fd563262d78d18ac32 (patch)
tree da0a9ea31cc31d049f7d3f559218688d38f4581e /doc
parent d02c42bc884d961bccbc0aab721c2225ca8e4094 (diff)
download grep-f149209e9ef3994cd55c27fd563262d78d18ac32.tar.gz
Apply documentation patch (#4610)
Diffstat (limited to 'doc')
-rw-r--r-- doc/grep.1 1124
-rw-r--r-- doc/grep.texi 1441
2 files changed, 1656 insertions, 909 deletions
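
The patch below reorganizes the OPTIONS section by purpose and spells out several behaviors that were previously implicit: the byte offset printed by -b is 0-based and, with -o, refers to the matched part itself; -w requires whole-word matches; -c with -v counts non-matching lines. The following is a minimal sketch of those documented behaviors, not part of the commit. It assumes a GNU grep on PATH; the scratch directory, file names, and file contents are made up for illustration.

```shell
#!/bin/sh
# Illustrative only: sample files are hypothetical, GNU grep is assumed.
tmp=$(mktemp -d)
printf 'foo bar\nbaz foo\nfood\n' > "$tmp/a.txt"
printf 'nothing here\n' > "$tmp/b.txt"

# -o prints each matched part on its own output line; combined with -b,
# the 0-based byte offset is that of the matched part, not of the line.
offsets=$(grep -o -b 'foo' "$tmp/a.txt")

# -w selects only lines where the match forms a whole word ("food" is skipped);
# -c prints a count of selected lines instead of the lines themselves.
whole_words=$(grep -c -w 'foo' "$tmp/a.txt")

# With -v, -c counts the non-matching lines.
without_bar=$(grep -c -v 'bar' "$tmp/a.txt")

# -l lists files that would produce output, -L those that would not.
with_match=$(cd "$tmp" && grep -l 'foo' a.txt b.txt)
without_match=$(cd "$tmp" && grep -L 'foo' a.txt b.txt)

printf 'offsets:\n%s\n' "$offsets"
printf 'whole-word lines: %s\n' "$whole_words"
printf 'lines without bar: %s\n' "$without_bar"
printf 'files with match: %s\n' "$with_match"
printf 'files without match: %s\n' "$without_match"

rm -rf "$tmp"
```

Other grep implementations may not support every option used here (-o, -b, and -L in particular are not among those the patch marks as specified by POSIX), so the sketch should be read as a check against GNU grep only.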
diff --git a/doc/grep.1 b/doc/grep.1
index 2f680bcd..1f0e7214 100644
--- a/doc/grep.1
+++ b/doc/grep.1
@@ -1,4 +1,4 @@
-.\" grep man page
+.\" GNU grep man page
.if !\n(.g \{\
. if !\w|\*(lq| \{\
. ds lq ``
@@ -9,37 +9,48 @@
. if \w'\(rq' .ds rq "\(rq
. \}
.\}
+.ie t .ds Tx \s-1T\v'.4n'\h'-.1667'E\v'-.4n'\h'-.125'X\s0
+. el .ds Tx TeX
.de Id
-.ds Dt \\$4
+. ds Yr \\$4
+. substring Yr 0 3
+. ds Mn \\$4
+. substring Mn 5 6
+. ds Dy \\$4
+. substring Dy 8 9
+. \" ISO 8601 date, complete format, extended representation
+. ds Dt \\*(Yr-\\*(Mn-\\*(Dy
..
-.Id $Id: grep.1,v 1.36 2005/11/09 20:04:41 charles_levert Exp $
-.TH GREP 1 \*(Dt "GNU Project"
+.Id $Id: grep.1,v 1.37 2006/08/18 21:26:54 bero Exp $
+.TH GREP 1 \*(Dt "GNU grep 2.5.1-cvs" "User Commands"
+.hy 0
+.
.SH NAME
grep, egrep, fgrep \- print lines matching a pattern
+.
.SH SYNOPSIS
.B grep
-.RI [ options ]
+.RI [ OPTIONS ]
.I PATTERN
.RI [ FILE .\|.\|.]
.br
.B grep
-.RI [ options ]
+.RI [ OPTIONS ]
.RB [ \-e
.I PATTERN
|
.B \-f
.IR FILE ]
.RI [ FILE .\|.\|.]
+.
.SH DESCRIPTION
-.PP
-The
.B grep
-command searches the named input
+searches the named input
.IR FILE s
-(or standard input if no files are named, or
-the file name
-.B \-
-is given)
+(or standard input if no files are named,
+or if a single hyphen-minus
+.RB ( \- )
+is given as file name)
for lines containing a match to the given
.IR PATTERN .
By default,
@@ -64,229 +75,156 @@ or
is deprecated,
but is provided to allow historical applications
that rely on them to run unmodified.
+.
.SH OPTIONS
+.SS "Generic Program Information"
.TP
-.BI \-A " NUM" "\fR,\fP \-\^\-after-context=" NUM
-Print
-.I NUM
-lines of trailing context after matching lines.
-Places a line containing
-.B \-\^\-
-between contiguous groups of matches.
-.TP
-.BR \-a ", " \-\^\-text
-Process a binary file as if it were text; this is equivalent to the
-.B \-\^\-binary-files=text
-option.
-.TP
-.BI \-B " NUM" "\fR,\fP \-\^\-before-context=" NUM
-Print
-.I NUM
-lines of leading context before matching lines.
-Places a line containing
-.B \-\^\-
-between contiguous groups of matches.
-.TP
-.BI \-C " NUM" "\fR,\fP \-\^\-context=" NUM
-Print
-.I NUM
-lines of output context.
-Places a line containing
-.B \-\^\-
-between contiguous groups of matches.
-.TP
-.BR \-b ", " \-\^\-byte-offset
-Print the byte offset within the input file before
-each line of output.
+.B \-\^\-help
+Print a usage message briefly summarizing these command-line options
+and the bug-reporting address, then exit.
.TP
-.BI \-\^\-binary-files= TYPE
-If the first few bytes of a file indicate that the file contains binary
-data, assume that the file is of type
-.IR TYPE .
-By default,
-.I TYPE
-is
-.BR binary ,
-and
-.B grep
-normally outputs either
-a one-line message saying that a binary file matches, or no message if
-there is no match.
-If
-.I TYPE
-is
-.BR without-match ,
-.B grep
-assumes that a binary file does not match; this is equivalent to the
-.B \-I
-option.
-If
-.I TYPE
-is
-.BR text ,
+.BR \-V ", " \-\^\-version
+Print the version number of
.B grep
-processes a binary file as if it were text; this is equivalent to the
-.B \-a
-option.
-.I Warning:
-.B "grep \-\^\-binary-files=text"
-might output binary garbage,
-which can have nasty side effects if the output is a terminal and if the
-terminal driver interprets some of it as commands.
-.TP
-.BI \-\^\-color[=\fIWHEN\fR] ", " \-\^\-colour[=\fIWHEN\fR]
-Surround the matching non-empty strings, matching lines, context lines,
-file names, line numbers, octet offsets, and separators (for fields and
-groups of context lines) with escape sequences to display them in color
-on the terminal.
-The colors are defined by the environment variable
-.BR GREP_COLORS .
-The deprecated environment variable
-.B GREP_COLOR
-is still supported, but its setting does not have priority.
-.I WHEN
-is `never', `always', or `auto'.
+to the standard output stream.
+This version number should
+be included in all bug reports (see below).
+.SS "Matcher Selection"
.TP
-.BR \-c ", " \-\^\-count
-Suppress normal output; instead print a count of
-matching lines for each input file.
-With the
-.BR \-v ", " \-\^\-invert-match
-option (see below), count non-matching lines.
+.BR \-E ", " \-\^\-extended\-regexp
+Interpret
+.I PATTERN
+as an extended regular expression (ERE, see below).
+.RB ( \-E
+is specified by \s-1POSIX\s0.)
.TP
-.BI \-D " ACTION" "\fR,\fP \-\^\-devices=" ACTION
-If an input file is a device, FIFO or socket, use
-.I ACTION
-to process it. By default,
-.I ACTION
-is
-.BR read ,
-which means that devices are read just as if they were ordinary files.
-If
-.I ACTION
-is
-.BR skip ,
-devices are silently skipped.
+.BR \-F ", " \-\^\-fixed\-strings
+Interpret
+.I PATTERN
+as a list of fixed strings, separated by newlines,
+any of which is to be matched.
+.RB ( \-F
+is specified by \s-1POSIX\s0.)
.TP
-.BI \-d " ACTION" "\fR,\fP \-\^\-directories=" ACTION
-If an input file is a directory, use
-.I ACTION
-to process it. By default,
-.I ACTION
-is
-.BR read ,
-which means that directories are read just as if they were ordinary files.
-If
-.I ACTION
-is
-.BR skip ,
-directories are silently skipped.
-If
-.I ACTION
-is
-.BR recurse ,
-.B grep
-reads all files under each directory, recursively;
-this is equivalent to the
-.B \-r
-option.
+.BR \-G ", " \-\^\-basic\-regexp
+Interpret
+.I PATTERN
+as a basic regular expression (BRE, see below).
+This is the default.
.TP
-.BR \-E ", " \-\^\-extended-regexp
+.BR \-P ", " \-\^\-perl\-regexp
Interpret
.I PATTERN
-as an extended regular expression (see below).
+as a Perl regular expression.
+This is highly experimental and
+.B "grep \-P"
+may warn of unimplemented features.
+.SS "Matching Control"
.TP
.BI \-e " PATTERN" "\fR,\fP \-\^\-regexp=" PATTERN
Use
.I PATTERN
-as the pattern; useful to protect patterns beginning with
-.BR \- .
-.TP
-.BI \-\^\-exclude= FILE_PATTERN
-.RI "Skip files " "and directories" " that match " FILE_PATTERN.
-.TP
-.BR \-F ", " \-\^\-fixed-strings
-Interpret
-.I PATTERN
-as a list of fixed strings, separated by newlines,
-any of which is to be matched.
+as the pattern.
+This is useful to protect patterns beginning with hyphen-minus
+.RB ( \- ).
+.RB ( \-e
+is specified by \s-1POSIX\s0.)
.TP
.BI \-f " FILE" "\fR,\fP \-\^\-file=" FILE
Obtain patterns from
.IR FILE ,
one per line.
The empty file contains zero patterns, and therefore matches nothing.
+.RB ( \-f
+is specified by \s-1POSIX\s0.)
.TP
-.BR \-G ", " \-\^\-basic-regexp
-Interpret
+.BR \-i ", " \-\^\-ignore\-case
+Ignore case distinctions in both the
.I PATTERN
-as a basic regular expression (see below). This is the default.
+and the input files.
+.RB ( \-i
+is specified by \s-1POSIX\s0.)
.TP
-.BR \-H ", " \-\^\-with-filename
-Print the filename for each match.
+.BR \-v ", " \-\^\-invert\-match
+Invert the sense of matching, to select non-matching lines.
+.RB ( \-v
+is specified by \s-1POSIX\s0.)
.TP
-.BR \-h ", " \-\^\-no-filename
-Suppress the prefixing of filenames on output
-when multiple files are searched.
+.BR \-w ", " \-\^\-word\-regexp
+Select only those lines containing matches that form whole words.
+The test is that the matching substring must either be at the
+beginning of the line, or preceded by a non-word constituent
+character.
+Similarly, it must be either at the end of the line
+or followed by a non-word constituent character.
+Word-constituent characters are letters, digits, and the underscore.
.TP
-.B \-\^\-help
-Output a brief help message.
+.BR \-x ", " \-\^\-line\-regexp
+Select only those matches that exactly match the whole line.
+.RB ( \-x
+is specified by \s-1POSIX\s0.)
.TP
-.BR \-I
-Process a binary file as if it did not contain matching data; this is
-equivalent to the
-.B \-\^\-binary-files=without-match
-option.
+.B \-y
+Obsolete synonym for
+.BR \-i .
+.SS "General Output Control"
.TP
-.BR \-i ", " \-\^\-ignore-case
-Ignore case distinctions in both the
-.I PATTERN
-and the input files.
+.BR \-c ", " \-\^\-count
+Suppress normal output; instead print a count of
+matching lines for each input file.
+With the
+.BR \-v ", " \-\^\-invert\-match
+option (see below), count non-matching lines.
+.RB ( \-c
+is specified by \s-1POSIX\s0.)
.TP
-.BI \-\^\-include= FILE_PATTERN
-Search only files that match
-.I FILE_PATTERN
-(using wildcard matching).
+.BR \-\^\-color [ =\fIWHEN\fP "], " \-\^\-colour [ =\fIWHEN\fP ]
+Surround the matched (non-empty) strings, matching lines, context lines,
+file names, line numbers, byte offsets, and separators (for fields and
+groups of context lines) with escape sequences to display them in color
+on the terminal.
+The colors are defined by the environment variable
+.BR GREP_COLORS .
+The deprecated environment variable
+.B GREP_COLOR
+is still supported, but its setting does not have priority.
+.I WHEN
+is
+.BR never ", " always ", or " auto .
.TP
-.BR \-L ", " \-\^\-files-without-match
+.BR \-L ", " \-\^\-files\-without\-match
Suppress normal output; instead print the name
of each input file from which no output would
-normally have been printed. The scanning will stop
-on the first match.
+normally have been printed.
+The scanning will stop on the first match.
.TP
-.BR \-l ", " \-\^\-files-with-matches
+.BR \-l ", " \-\^\-files\-with\-matches
Suppress normal output; instead print
the name of each input file from which output
-would normally have been printed. The scanning will
-stop on the first match.
-.TP
-.BI \-\^\-label= LABEL
-Display input actually coming from standard input as input coming from file
-.I LABEL.
-This is especially useful for tools like
-.BR zgrep ,
-e.g.,
-.B "gzip -cd foo.gz | grep --label=foo something"
+would normally have been printed.
+The scanning will stop on the first match.
+.RB ( \-l
+is specified by \s-1POSIX\s0.)
.TP
-.BR \-\^\-line-buffered
-Use line buffering, it can be a performance penalty.
-.TP
-.BI \-m " NUM" "\fR,\fP \-\^\-max-count=" NUM
+.BI \-m " NUM" "\fR,\fP \-\^\-max\-count=" NUM
Stop reading a file after
.I NUM
-matching lines. If the input is standard input from a regular file,
+matching lines.
+If the input is standard input from a regular file,
and
.I NUM
matching lines are output,
.B grep
ensures that the standard input is positioned to just after the last
matching line before exiting, regardless of the presence of trailing
-context lines. This enables a calling process to resume a search.
+context lines.
+This enables a calling process to resume a search.
When
.B grep
stops after
.I NUM
-matching lines, it outputs any trailing context lines. When the
+matching lines, it outputs any trailing context lines.
+When the
.B \-c
or
.B \-\^\-count
@@ -297,38 +235,16 @@ does not output a count greater than
When the
.B \-v
or
-.B \-\^\-invert-match
+.B \-\^\-invert\-match
option is also used,
.B grep
stops after outputting
.I NUM
non-matching lines.
.TP
-.B \-\^\-mmap
-If possible, use the
-.BR mmap (2)
-system call to read input, instead of
-the default
-.BR read (2)
-system call. In some situations,
-.B \-\^\-mmap
-yields better performance. However,
-.B \-\^\-mmap
-can cause undefined behavior (including core dumps)
-if an input file shrinks while
-.B grep
-is operating, or if an I/O error occurs.
-.TP
-.BR \-n ", " \-\^\-line-number
-Prefix each line of output with the line number
-within its input file.
-.TP
-.BR \-o ", " \-\^\-only-matching
-Show only the non-empty parts of a matching line that match
-.IR PATTERN .
-.TP
-.BR \-P ", " \-\^\-perl-regexp
-.RI "Interpret " PATTERN " as a Perl regular expression."
+.BR \-o ", " \-\^\-only\-matching
+Print only the matched (non-empty) parts of a matching line,
+with each such part on a separate output line.
.TP
.BR \-q ", " \-\^\-quiet ", " \-\^\-silent
Quiet; do not write anything to standard output.
@@ -337,16 +253,12 @@ even if an error was detected.
Also see the
.B \-s
or
-.B \-\^\-no-messages
-option.
-.TP
-.BR \-R ", " \-r ", " \-\^\-recursive
-Read all files under each directory, recursively;
-this is equivalent to the
-.B "\-d recurse"
+.B \-\^\-no\-messages
option.
+.RB ( \-q
+is specified by \s-1POSIX\s0.)
.TP
-.BR \-s ", " \-\^\-no-messages
+.BR \-s ", " \-\^\-no\-messages
Suppress error messages about nonexistent or unreadable files.
Portability note: unlike \s-1GNU\s0
.BR grep ,
@@ -376,77 +288,67 @@ and
and should redirect standard and error output to
.B /dev/null
instead.
+.RB ( \-s
+is specified by \s-1POSIX\s0.)
+.SS "Output Line Prefix Control"
+.TP
+.BR \-b ", " \-\^\-byte\-offset
+Print the 0-based byte offset within the input file
+before each line of output.
+If
+.B \-o
+.RB ( \-\^\-only\-matching )
+is specified,
+print the offset of the matching part itself.
+.TP
+.BR \-H ", " \-\^\-with\-filename
+Print the file name for each match.
+This is the default when there is more than one file to search.
+.TP
+.BR \-h ", " \-\^\-no\-filename
+Suppress the prefixing of file names on output.
+This is the default when there is only one file
+(or only standard input) to search.
+.TP
+.BI \-\^\-label= LABEL
+Display input actually coming from standard input as input coming from file
+.I LABEL.
+This is especially useful for tools like
+.BR zgrep ,
+e.g.,
+.B "gzip -cd foo.gz | grep --label=foo something"
+.TP
+.BR \-n ", " \-\^\-line\-number
+Prefix each line of output with the 1-based line number
+within its input file.
+.RB ( \-n
+is specified by \s-1POSIX\s0.)
.TP
.BR \-T ", " \-\^\-initial\-tab
-Makes sure that the first character of actual line content lies on a
+Make sure that the first character of actual line content lies on a
tab stop, so that the alignment of tabs looks normal.
-This is useful when combined with
-.B \-H
-(which is implicit when there is more than one file to search),
-.BR \-n ,
-and
-.BR \-b ;
-these options prepend their output at the beginning of the displayed
-line, before the actual content.
-In order to improve the probability that all matched or context lines
-from a single file will all start at the same column, this also causes
-the line number and octet offset (if present) to be printed in a minimum
-size field width.
-.TP
-.BR \-U ", " \-\^\-binary
-Treat the file(s) as binary. By default, under MS-DOS and MS-Windows,
-.BR grep
-guesses the file type by looking at the contents of the first 32KB
-read from the file. If
-.BR grep
-decides the file is a text file, it strips the CR characters from the
-original file contents (to make regular expressions with
-.B ^
+This is useful with options that prefix their output to the actual content:
+.BR \-H , \-n ,
and
-.B $
-work correctly). Specifying
-.B \-U
-overrules this guesswork, causing all files to be read and passed to the
-matching mechanism verbatim; if the file is a text file with CR/LF
-pairs at the end of each line, this will cause some regular
-expressions to fail.
-This option has no effect on platforms other than MS-DOS and
-MS-Windows.
+.BR \-b .
+In order to improve the probability that lines
+from a single file will all start at the same column,
+this also causes the line number and byte offset (if present)
+to be printed in a minimum size field width.
.TP
-.BR \-u ", " \-\^\-unix-byte-offsets
-Report Unix-style byte offsets. This switch causes
+.BR \-u ", " \-\^\-unix\-byte\-offsets
+Report Unix-style byte offsets.
+This switch causes
.B grep
-to report byte offsets as if the file were Unix-style text file, i.e., with
-CR characters stripped off. This will produce results identical to running
+to report byte offsets as if the file were a Unix-style text file,
+i.e., with CR characters stripped off.
+This will produce results identical to running
.B grep
-on a Unix machine. This option has no effect unless
+on a Unix machine.
+This option has no effect unless
.B \-b
option is also used;
-it has no effect on platforms other than MS-DOS and MS-Windows.
-.TP
-.BR \-V ", " \-\^\-version
-Print the version number of
-.B grep
-to standard error. This version number should
-be included in all bug reports (see below).
-.TP
-.BR \-v ", " \-\^\-invert-match
-Invert the sense of matching, to select non-matching lines.
-.TP
-.BR \-w ", " \-\^\-word-regexp
-Select only those lines containing matches that form whole words.
-The test is that the matching substring must either be at the
-beginning of the line, or preceded by a non-word constituent
-character. Similarly, it must be either at the end of the line
-or followed by a non-word constituent character. Word-constituent
-characters are letters, digits, and the underscore.
-.TP
-.BR \-x ", " \-\^\-line-regexp
-Select only those matches that exactly match the whole line.
-.TP
-.B \-y
-Obsolete synonym for
-.BR \-i .
+it has no effect on platforms other than \s-1MS-DOS\s0 and \s-1MS\s0-Windows.
.TP
.BR \-Z ", " \-\^\-null
Output a zero byte (the \s-1ASCII\s0
@@ -456,8 +358,8 @@ For example,
.B "grep \-lZ"
outputs a zero byte after each file name instead of the usual newline.
This option makes the output unambiguous, even in the presence of file
-names containing unusual characters like newlines. This option can be
-used with commands like
+names containing unusual characters like newlines.
+This option can be used with commands like
.BR "find \-print0" ,
.BR "perl \-0" ,
.BR "sort \-z" ,
@@ -465,15 +367,231 @@ and
.B "xargs \-0"
to process arbitrary file names,
even those that contain newline characters.
+.SS "Context Line Control"
+.TP
+.BI \-A " NUM" "\fR,\fP \-\^\-after\-context=" NUM
+Print
+.I NUM
+lines of trailing context after matching lines.
+Places a line containing a group separator
+.RB ( \-\^\- )
+between contiguous groups of matches.
+With the
+.B \-o
+or
+.B \-\^\-only\-matching
+option, this has no effect and a warning is given.
+.TP
+.BI \-B " NUM" "\fR,\fP \-\^\-before\-context=" NUM
+Print
+.I NUM
+lines of leading context before matching lines.
+Places a line containing a group separator
+.RB ( \-\^\- )
+between contiguous groups of matches.
+With the
+.B \-o
+or
+.B \-\^\-only\-matching
+option, this has no effect and a warning is given.
+.TP
+.BI \-C " NUM" "\fR,\fP \-" NUM "\fR,\fP \-\^\-context=" NUM
+Print
+.I NUM
+lines of output context.
+Places a line containing a group separator
+.RB ( \-\^\- )
+between contiguous groups of matches.
+With the
+.B \-o
+or
+.B \-\^\-only\-matching
+option, this has no effect and a warning is given.
+.SS "File and Directory Selection"
+.TP
+.BR \-a ", " \-\^\-text
+Process a binary file as if it were text; this is equivalent to the
+.B \-\^\-binary\-files=text
+option.
+.TP
+.BI \-\^\-binary\-files= TYPE
+If the first few bytes of a file indicate that the file contains binary
+data, assume that the file is of type
+.IR TYPE .
+By default,
+.I TYPE
+is
+.BR binary ,
+and
+.B grep
+normally outputs either
+a one-line message saying that a binary file matches, or no message if
+there is no match.
+If
+.I TYPE
+is
+.BR without-match ,
+.B grep
+assumes that a binary file does not match; this is equivalent to the
+.B \-I
+option.
+If
+.I TYPE
+is
+.BR text ,
+.B grep
+processes a binary file as if it were text; this is equivalent to the
+.B \-a
+option.
+.I Warning:
+.B "grep \-\^\-binary\-files=text"
+might output binary garbage,
+which can have nasty side effects if the output is a terminal and if the
+terminal driver interprets some of it as commands.
+.TP
+.BI \-D " ACTION" "\fR,\fP \-\^\-devices=" ACTION
+If an input file is a device, FIFO or socket, use
+.I ACTION
+to process it.
+By default,
+.I ACTION
+is
+.BR read ,
+which means that devices are read just as if they were ordinary files.
+If
+.I ACTION
+is
+.BR skip ,
+devices are silently skipped.
+.TP
+.BI \-d " ACTION" "\fR,\fP \-\^\-directories=" ACTION
+If an input file is a directory, use
+.I ACTION
+to process it.
+By default,
+.I ACTION
+is
+.BR read ,
+which means that directories are read just as if they were ordinary files.
+If
+.I ACTION
+is
+.BR skip ,
+directories are silently skipped.
+If
+.I ACTION
+is
+.BR recurse ,
+.B grep
+reads all files under each directory, recursively;
+this is equivalent to the
+.B \-r
+option.
+.TP
+.BI \-\^\-exclude= GLOB
+Skip files whose base name matches
+.I GLOB
+(using wildcard matching).
+A file-name glob can use
+.BR * ,
+.BR ? ,
+and
+.BR [ ... ]
+as wildcards, and
+.B \e
+to quote a wildcard or backslash character literally.
+.TP
+.BI \-\^\-exclude-from= FILE
+Skip files
+.I and directories
+whose base name matches any of the file-name globs read from
+.I FILE
+(using wildcard matching as described under
+.BR \-\^\-exclude ).
+.TP
+.BR \-I
+Process a binary file as if it did not contain matching data; this is
+equivalent to the
+.B \-\^\-binary\-files=without-match
+option.
+.TP
+.BI \-\^\-include= GLOB
+Search only files whose base name matches
+.I GLOB
+(using wildcard matching as described under
+.BR \-\^\-exclude ).
+.TP
+.BR \-R ", " \-r ", " \-\^\-recursive
+Read all files under each directory, recursively;
+this is equivalent to the
+.B "\-d recurse"
+option.
+.SS "Other Options"
+.TP
+.BR \-\^\-line\-buffered
+Use line buffering on output.
+This can cause a performance penalty.
+.TP
+.B \-\^\-mmap
+If possible, use the
+.BR mmap (2)
+system call to read input, instead of
+the default
+.BR read (2)
+system call.
+In some situations,
+.B \-\^\-mmap
+yields better performance.
+However,
+.B \-\^\-mmap
+can cause undefined behavior (including core dumps)
+if an input file shrinks while
+.B grep
+is operating, or if an I/O error occurs.
+.TP
+.BR \-U ", " \-\^\-binary
+Treat the file(s) as binary.
+By default, under \s-1MS-DOS\s0 and \s-1MS\s0-Windows,
+.BR grep
+guesses the file type by looking at the contents of the first 32KB
+read from the file.
+If
+.BR grep
+decides the file is a text file, it strips the CR characters from the
+original file contents (to make regular expressions with
+.B ^
+and
+.B $
+work correctly).
+Specifying
+.B \-U
+overrules this guesswork, causing all files to be read and passed to the
+matching mechanism verbatim; if the file is a text file with CR/LF
+pairs at the end of each line, this will cause some regular
+expressions to fail.
+This option has no effect on platforms
+other than \s-1MS-DOS\s0 and \s-1MS\s0-Windows.
+.TP
+.BR \-z ", " \-\^\-null\-data
+Treat the input as a set of lines,
+each terminated by a zero byte (the \s-1ASCII\s0
+.B NUL
+character) instead of a newline.
+Like the
+.B -Z
+or
+.B \-\^\-null
+option, this option can be used with commands like
+.B sort -z
+to process arbitrary file names.
+.
.SH "REGULAR EXPRESSIONS"
-.PP
A regular expression is a pattern that describes a set of strings.
Regular expressions are constructed analogously to arithmetic
expressions, by using various operators to combine smaller expressions.
.PP
-The
.B grep
-command understands two different versions of regular expression syntax:
+understands two different versions of regular expression syntax:
\*(lqbasic\*(rq and \*(lqextended.\*(rq In
.RB "\s-1GNU\s0\ " grep ,
there is no difference in available functionality using either syntax.
@@ -481,11 +599,17 @@ In other implementations, basic regular expressions are less powerful.
The following description applies to extended regular expressions;
differences for basic regular expressions are summarized afterwards.
.PP
-The fundamental building blocks are the regular expressions that match
-a single character. Most characters, including all letters and digits,
-are regular expressions that match themselves. Any metacharacter with
-special meaning may be quoted by preceding it with a backslash.
+The fundamental building blocks are the regular expressions
+that match a single character.
+Most characters, including all letters and digits,
+are regular expressions that match themselves.
+Any meta-character with special meaning
+may be quoted by preceding it with a backslash.
.PP
+The period
+.B .\&
+matches any single character.
+.SS "Character Classes and Bracket Expressions"
A
.I "bracket expression"
is a list of characters enclosed by
@@ -549,33 +673,25 @@ except the latter form depends upon the C locale and the
of locale and character set.
(Note that the brackets in these class names are part of the symbolic
names, and must be included in addition to the brackets delimiting
-the bracket list.) Most metacharacters lose their special meaning
-inside lists. To include a literal
+the bracket expression.)
+Most meta-characters lose their special meaning inside bracket expressions.
+To include a literal
.B ]
-place it first in the list. Similarly, to include a literal
+place it first in the list.
+Similarly, to include a literal
.B ^
-place it anywhere but first. Finally, to include a literal
+place it anywhere but first.
+Finally, to include a literal
.B \-
place it last.
-.PP
-The period
-.B .
-matches any single character.
-The symbol
-.B \ew
-is a synonym for
-.B [[:alnum:]]
-and
-.B \eW
-is a synonym for
-.BR [^[:alnum:]] .
-.PP
+.SS Anchoring
The caret
.B ^
and the dollar sign
.B $
-are metacharacters that respectively match the empty string at the
+are meta-characters that respectively match the empty string at the
beginning and end of a line.
+.SS "The Backslash Character and Special Expressions"
The symbols
.B \e<
and
@@ -589,7 +705,15 @@ and
matches the empty string provided it's
.I not
at the edge of a word.
-.PP
+The symbol
+.B \ew
+is a synonym for
+.B [[:alnum:]]
+and
+.B \eW
+is a synonym for
+.BR [^[:alnum:]] .
+.SS Repetition
A regular expression may be followed by one of several repetition operators:
.PD 0
.TP
@@ -612,6 +736,11 @@ The preceding item is matched
.I n
or more times.
.TP
+.BI {, m }
+The preceding item is matched at most
+.I m
+times.
+.TP
.BI { n , m }
The preceding item is matched at least
.I n
@@ -619,22 +748,23 @@ times, but not more than
.I m
times.
.PD
-.PP
+.SS Concatenation
Two regular expressions may be concatenated; the resulting
regular expression matches any string formed by concatenating
two substrings that respectively match the concatenated
-subexpressions.
-.PP
+expressions.
+.SS Alternation
Two regular expressions may be joined by the infix operator
.BR | ;
the resulting regular expression matches any string matching
-either subexpression.
-.PP
+either alternate expression.
+.SS Precedence
Repetition takes precedence over concatenation, which in turn
-takes precedence over alternation. A whole subexpression may be
-enclosed in parentheses to override these precedence rules.
-.PP
-The backreference
+takes precedence over alternation.
+A whole expression may be enclosed in parentheses
+to override these precedence rules and form a subexpression.
+.SS "Back References and Subexpressions"
+The back-reference
.BI \e n\c
\&, where
.I n
@@ -642,8 +772,8 @@ is a single digit, matches the substring
previously matched by the
.IR n th
parenthesized subexpression of the regular expression.
-.PP
-In basic regular expressions the metacharacters
+.SS "Basic vs Extended Regular Expressions"
+In basic regular expressions the meta-characters
.BR ? ,
.BR + ,
.BR { ,
@@ -665,7 +795,7 @@ Traditional
.B egrep
did not support the
.B {
-metacharacter, and some
+meta-character, and some
.B egrep
implementations support
.B \e{
@@ -683,19 +813,21 @@ to match a literal
attempts to support traditional usage by assuming that
.B {
is not special if it would be the start of an invalid interval
-specification. For example, the shell command
+specification.
+For example, the command
.B "grep\ \-E\ '{1'"
searches for the two-character string
.B {1
instead of reporting a syntax error in the regular expression.
\s-1POSIX.2\s0 allows this behavior as an extension, but portable scripts
should avoid it.
+.
.SH "ENVIRONMENT VARIABLES"
The behavior of
.B grep
is affected by the following environment variables.
.PP
-A locale
+The locale for category
.BI LC_ foo
is specified by examining the three environment variables
.BR LC_ALL ,
@@ -709,23 +841,24 @@ is not set, but
.B LC_MESSAGES
is set to
.BR pt_BR ,
-then Brazilian Portuguese is used for the
+then the Brazilian Portuguese locale is used for the
.B LC_MESSAGES
-locale.
+category.
The C locale is used if none of these environment variables are set,
-or if the locale catalog is not installed, or if
+if the locale catalog is not installed, or if
.B grep
was not compiled with national language support (\s-1NLS\s0).
.TP
.B GREP_OPTIONS
-This variable specifies default options to be placed in front of any
-explicit options. For example, if
+This variable specifies default options
+to be placed in front of any explicit options.
+For example, if
.B GREP_OPTIONS
is
.BR "'\-\^\-binary-files=without-match \-\^\-directories=skip'" ,
.B grep
behaves as if the two options
-.B \-\^\-binary-files=without-match
+.B \-\^\-binary\-files=without-match
and
.B \-\^\-directories=skip
had been specified before any explicit options.
@@ -734,53 +867,246 @@ A backslash escapes the next character,
so it can be used to specify an option containing whitespace or a backslash.
.TP
.B GREP_COLOR
-Deprecated in favor of
+This variable specifies the color used to highlight matched (non-empty) text.
+It is deprecated in favor of
.BR GREP_COLORS ,
-which has priority.
-It can only specify the marker for highlighting matched non-empty text and
-defaults to `01;31' (bold red).
-.TP
+but still supported.
+The
+.BR mt ,
+.BR ms ,
+and
+.B mc
+capabilities of
.B GREP_COLORS
-Specifies the markers for highlighting matched non-empty text (mt),
-matching lines (ml), context lines (cx), file names (fn), line numbers
-(ln), octet offsets (bn), and separators (se, for fields and groups of
-context lines).
-It is a colon-separated list of color specification assignments.
-The default is `mt=01;31:ml=:cx=:fn=35:ln=32:bn=32:se=36' which means
-bold red, default, default, magenta, green, green, and cyan, all text
-foregrounds on the default background.
-Note that the `ml' setting, if any, remains in effect just before the
-`mt' setting kicks in.
-See the Select Graphic Rendition (SGR, for character attributes)
-section in the documentation of the text terminal that is used
-for permissible values (semicolon-separated lists of integers)
-and their meaning.
+have priority over it.
+It can only specify the color used to highlight
+the matching non-empty text in any matching line
+(a selected line when the
+.B -v
+command-line option is omitted,
+or a context line when
+.B -v
+is specified).
+The default is
+.BR 01;31 ,
+which means a bold red foreground text on the terminal's default background.
+.TP
.B GREP_COLORS
-also supports a boolean `ne' capability (with no `=...' part) to not
-clear to the end of line using Erase in Line (EL) to Right (`\\33[K')
-each time a colorized item ends (needed on terminals on which EL is not
-supported; otherwise useful on terminals where the `back_color_erase'
-(`bce') boolean terminfo capability is not specified, when the chosen
-highlight colors do not affect the background, or when EL is too slow
-to bother doing or causes too much flicker).
+Specifies the colors and other attributes
+used to highlight various parts of the output.
+Its value is a colon-separated list of capabilities
+that defaults to
+.B ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36
+with the
+.B rv
+and
+.B ne
+boolean capabilities omitted (i.e., false).
+Supported capabilities are as follows.
+.RS
+.TP
+.B sl=
+SGR substring for whole selected lines
+(i.e.,
+matching lines when the
+.B \-v
+command-line option is omitted,
+or non-matching lines when
+.B \-v
+is specified).
+If however the boolean
+.B rv
+capability
+and the
+.B \-v
+command-line option are both specified,
+it applies to context matching lines instead.
+The default is empty (i.e., the terminal's default color pair).
+.TP
+.B cx=
+SGR substring for whole context lines
+(i.e.,
+non-matching lines when the
+.B \-v
+command-line option is omitted,
+or matching lines when
+.B \-v
+is specified).
+If however the boolean
+.B rv
+capability
+and the
+.B \-v
+command-line option are both specified,
+it applies to selected non-matching lines instead.
+The default is empty (i.e., the terminal's default color pair).
+.TP
+.B rv
+Boolean value that reverses (swaps) the meanings of
+the
+.B sl=
+and
+.B cx=
+capabilities
+when the
+.B \-v
+command-line option is specified.
+The default is false (i.e., the capability is omitted).
+.TP
+.B mt=01;31
+SGR substring for matching non-empty text in any matching line
+(i.e.,
+a selected line when the
+.B \-v
+command-line option is omitted,
+or a context line when
+.B \-v
+is specified).
+Setting this is equivalent to setting both
+.B ms=
+and
+.B mc=
+at once to the same value.
+The default is a bold red text foreground over the current line background.
+.TP
+.B ms=01;31
+SGR substring for matching non-empty text in a selected line.
+(This is only used when the
+.B \-v
+command-line option is omitted.)
+The effect of the
+.B sl=
+(or
+.B cx=
+if
+.BR rv )
+capability remains active when this kicks in.
+The default is a bold red text foreground over the current line background.
+.TP
+.B mc=01;31
+SGR substring for matching non-empty text in a context line.
+(This is only used when the
+.B \-v
+command-line option is specified.)
+The effect of the
+.B cx=
+(or
+.B sl=
+if
+.BR rv )
+capability remains active when this kicks in.
+The default is a bold red text foreground over the current line background.
+.TP
+.B fn=35
+SGR substring for file names prefixing any content line.
+The default is a magenta text foreground over the terminal's default background.
+.TP
+.B ln=32
+SGR substring for line numbers prefixing any content line.
+The default is a green text foreground over the terminal's default background.
+.TP
+.B bn=32
+SGR substring for byte offsets prefixing any content line.
+The default is a green text foreground over the terminal's default background.
+.TP
+.B se=36
+SGR substring for separators that are inserted
+between selected line fields
+.RB ( : ),
+between context line fields
+.RB ( \- ),
+and between groups of adjacent lines when nonzero context is specified
+.RB ( \-\^\- ).
+The default is a cyan text foreground over the terminal's default background.
+.TP
+.B ne
+Boolean value that prevents clearing to the end of line
+using Erase in Line (EL) to Right
+.RB ( \\\\\\33[K )
+each time a colorized item ends.
+This is needed on terminals on which EL is not supported.
+It is otherwise useful on terminals
+for which the
+.B back_color_erase
+.RB ( bce )
+boolean terminfo capability does not apply,
+when the chosen highlight colors do not affect the background,
+or when EL is too slow or causes too much flicker.
+The default is false (i.e., the capability is omitted).
+.PP
+Note that boolean capabilities have no
+.BR = ...
+part.
+They are omitted (i.e., false) by default and become true when specified.
+.PP
+See the Select Graphic Rendition (SGR) section
+in the documentation of the text terminal that is used
+for permitted values and their meaning as character attributes.
+These substring values are integers in decimal representation
+and can be concatenated with semicolons.
+.B grep
+takes care of assembling the result
+into a complete SGR sequence
+.RB ( \\\\\\33[ ... m ).
+Common values to concatenate include
+.B 1
+for bold,
+.B 4
+for underline,
+.B 5
+for blink,
+.B 7
+for inverse,
+.B 39
+for default foreground color,
+.B 30
+to
+.B 37
+for foreground colors,
+.B 90
+to
+.B 97
+for 16-color mode foreground colors,
+.B 38;5;0
+to
+.B 38;5;255
+for 88-color and 256-color modes foreground colors,
+.B 49
+for default background color,
+.B 40
+to
+.B 47
+for background colors,
+.B 100
+to
+.B 107
+for 16-color mode background colors, and
+.B 48;5;0
+to
+.B 48;5;255
+for 88-color and 256-color modes background colors.
+.RE
.TP
\fBLC_ALL\fP, \fBLC_COLLATE\fP, \fBLANG\fP
-These variables specify the
+These variables specify the locale for the
.B LC_COLLATE
-locale, which determines the collating sequence used to interpret
-range expressions like
+category,
+which determines the collating sequence
+used to interpret range expressions like
.BR [a\-z] .
.TP
\fBLC_ALL\fP, \fBLC_CTYPE\fP, \fBLANG\fP
-These variables specify the
+These variables specify the locale for the
.B LC_CTYPE
-locale, which determines the type of characters, e.g., which
-characters are whitespace.
+category,
+which determines the type of characters,
+e.g., which characters are whitespace.
.TP
\fBLC_ALL\fP, \fBLC_MESSAGES\fP, \fBLANG\fP
-These variables specify the
+These variables specify the locale for the
.B LC_MESSAGES
-locale, which determines the language that
+category,
+which determines the language that
.B grep
uses for messages.
The default C locale uses American English messages.
@@ -822,9 +1148,9 @@ This behavior is available only with the \s-1GNU\s0 C library, and only
when
.B POSIXLY_CORRECT
is not set.
-.SH DIAGNOSTICS
-.PP
-Normally, exit status is 0 if selected lines are found and 1 otherwise.
+.
+.SH "EXIT STATUS"
+Normally, the exit status is 0 if selected lines are found and 1 otherwise.
But the exit status is 2 if an error occurred, unless the
.B \-q
or
@@ -840,21 +1166,67 @@ and
that the exit status in case of error be greater than 1;
it is therefore advisable, for the sake of portability,
to use logic that tests for this general condition
-instead of strict equality with 2.
-.SH BUGS
+instead of strict equality with\ 2.
+.
+.SH COPYRIGHT
+Copyright \(co
+1998, 1999, 2000, 2002, 2005
+Free Software Foundation, Inc.
.PP
+This is free software;
+see the source for copying conditions.
+There is NO warranty;
+not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+.
+.SH BUGS
+.SS "Reporting Bugs"
Email bug reports to
-.BR bug-grep@gnu.org .
-.PP
+.RB < bug\-grep@gnu.org >,
+a mailing list whose web page is
+.RB < http://lists.gnu.org/mailman/listinfo/bug\-grep >.
+.BR grep 's
+Savannah bug tracker is located at
+.RB < http://savannah.gnu.org/bugs/?group=grep >.
+.SS "Known Bugs"
Large repetition counts in the
.BI { n , m }
-construct may cause grep to use lots of memory.
+construct may cause
+.B grep
+to use lots of memory.
In addition,
certain other obscure regular expressions require exponential time
and space, and may cause
.B grep
to run out of memory.
.PP
-Backreferences are very slow, and may require exponential time.
+Back-references are very slow, and may require exponential time.
+.
+.SH "SEE ALSO"
+.SS "Regular Manual Pages"
+awk(1), cmp(1), diff(1), find(1), gzip(1),
+perl(1), sed(1), sort(1), xargs(1), zgrep(1),
+mmap(2), read(2),
+pcre(3), pcrepattern(3),
+terminfo(5),
+glob(7), regex(7).
+.SS "\s-1POSIX\s0 Programmer's Manual Page"
+grep(1p).
+.SS "\*(Txinfo Documentation"
+The full documentation for
+.B grep
+is maintained as a \*(Txinfo manual.
+If the
+.B info
+and
+.B grep
+programs are properly installed at your site, the command
+.IP
+.B info grep
+.PP
+should give you access to the complete manual.
+.
+.SH NOTES
+\s-1GNU\s0's not Unix, but Unix is a beast;
+its plural form is Unixen.
.\" Work around problems with some troff -man implementations.
.br
diff --git a/doc/grep.texi b/doc/grep.texi
index dfdbb02a..d1c329fe 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -26,17 +26,17 @@
@ifinfo
@direntry
-* grep: (grep). print lines matching a pattern.
+* grep: (grep). print lines matching a pattern.
@end direntry
This file documents @command{grep}, a pattern matching engine.
Published by the Free Software Foundation,
-51 Franklin Street - Fifth Floor
-Boston, MA 02110-1301, USA
+51 Franklin Street -- Fifth Floor
+Boston, MA 02110--1301, USA
@c man begin COPYRIGHT
-Copyright @copyright{} 2000, 2001 Free Software Foundation, Inc.
+Copyright @copyright{} 1999, 2000, 2001, 2002, 2005 Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
@@ -62,18 +62,18 @@ entitled ``GNU Free Documentation License'' (@pxref{Copying}).
@setchapternewpage off
@titlepage
-@title grep, searching for a pattern
+@title @command{grep}, print lines matching a pattern
@subtitle version @value{VERSION}, @value{UPDATED}
@author Alain Magloire et al.
@page
@vskip 0pt plus 1filll
-Copyright @copyright{} 2000, 2001 Free Software Foundation, Inc.
+Copyright @copyright{} 1999, 2000, 2001, 2002, 2005 Free Software Foundation, Inc.
@sp 2
Published by the Free Software Foundation, @*
-51 Franklin Street - Fifth Floor, @*
-Boston, MA 02110-1301, USA
+51 Franklin Street -- Fifth Floor, @*
+Boston, MA 02110--1301, USA
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
@@ -89,24 +89,24 @@ entitled ``GNU Free Documentation License''.
@node Top
@top grep
-The @command{grep} command searches for lines matching a pattern.
+@command{grep} searches for lines matching a pattern.
-This document was produced for version @value{VERSION} of @sc{gnu}
-@command{grep}.
+This document was produced
+for version @value{VERSION}
+of @sc{gnu} @command{grep}.
@end ifnottex
@menu
* Introduction:: Introduction.
* Invoking:: Invoking @command{grep}; description of options.
-* Diagnostics:: Exit status returned by @command{grep}.
+* Exit Status:: Exit status returned by @command{grep}.
* grep Programs:: @command{grep} programs.
* Regular Expressions:: Regular Expressions.
* Usage:: Examples.
* Reporting Bugs:: Reporting Bugs.
* Copying:: License terms.
-* Concept Index:: A menu with all the topics in this manual.
-* Index:: A menu with all @command{grep} commands
- and command-line options.
+* Concept Index:: Topics covered in this manual.
+* Index:: Options, environment variables, and constructs.
@end menu
@@ -115,83 +115,264 @@ This document was produced for version @value{VERSION} of @sc{gnu}
@cindex Searching for a pattern.
-The @command{grep} command searches the input files
-for lines containing a match to a given
-pattern list. When it finds a match in a line, it copies the line to standard
-output (by default), or does whatever other sort of output you have requested
-with options.
+@command{grep} searches the input files
+for lines containing a match to a given pattern list.
+When it finds a match in a line,
+it copies the line to standard output (by default),
+or does whatever other sort of output you have requested with options.
Though @command{grep} expects to do the matching on text,
it has no limits on input line length other than available memory,
and it can match arbitrary characters within a line.
If the final byte of an input file is not a newline,
@command{grep} silently supplies one.
-Since newline is also a separator for the list of patterns, there
-is no way to match newline characters in a text.
+Since newline is also a separator for the list of patterns,
+there is no way to match newline characters in a text.
@node Invoking
@chapter Invoking @command{grep}
-@command{grep} comes with a rich set of options from @sc{posix.2} and @sc{gnu}
-extensions.
+The general synopsis of a @command{grep} command line is
+
+@example
+grep @var{options} @var{pattern} @var{input_file_names}
+@end example
+
+@noindent
+There can be zero or more @var{options}.
+@var{pattern} will only be seen as such
+(and not as an @var{input_file_name})
+if it wasn't already specified within @var{options}
+(by using the @samp{-e@ @var{pattern}}
+or @samp{-f@ @var{file}} options).
+There can be zero or more @var{input_file_names}.
+
+@menu
+* Command-line Options:: Short and long names, grouped by category.
+* Environment Variables:: POSIX, GNU generic, and GNU grep specific.
+@end menu
+
+@node Command-line Options
+@section Command-line Options
+
+@command{grep} comes with a rich set of options:
+some from @sc{posix.2} and some being @sc{gnu} extensions.
+Long option names are always a @sc{gnu} extension,
+even for options that are from @sc{posix} specifications.
+Options that are specified by @sc{posix},
+under their short names,
+are explicitly marked as such
+to facilitate @sc{posix}-portable programming.
+A few option names are provided
+for compatibility with older or more exotic implementations.
+
+@menu
+* Generic Program Information::
+* Matching Control::
+* General Output Control::
+* Output Line Prefix Control::
+* Context Line Control::
+* File and Directory Selection::
+* Other Options::
+@end menu
+
+Several additional options control
+which variant of the @command{grep} matching engine is used.
+@xref{grep Programs}.
+
+@node Generic Program Information
+@subsection Generic Program Information
@table @samp
-@item -c
-@itemx --count
-@opindex -c
-@opindex --count
-@cindex counting lines
-Suppress normal output; instead print a count of matching
-lines for each input file. With the @samp{-v}, @samp{--invert-match} option,
-count non-matching lines.
+@item --help
+@opindex --help
+@cindex Usage summary, printing
+Print a usage message briefly summarizing these command-line options
+and the bug-reporting address, then exit.
+
+@item -V
+@itemx --version
+@opindex -V
+@opindex --version
+@cindex Version, printing
+Print the version number of @command{grep} to the standard output stream.
+This version number should be included in all bug reports.
+
+@end table
+
+@node Matching Control
+@subsection Matching Control
+
+@table @samp
@item -e @var{pattern}
@itemx --regexp=@var{pattern}
@opindex -e
@opindex --regexp=@var{pattern}
@cindex pattern list
-Use @var{pattern} as the pattern; useful to protect patterns
-beginning with a @samp{-}.
+Use @var{pattern} as the pattern;
+useful to protect patterns beginning with a @samp{-}.
+(@samp{-e} is specified by @sc{posix}.)
@item -f @var{file}
@itemx --file=@var{file}
@opindex -f
@opindex --file
@cindex pattern from file
-Obtain patterns from @var{file}, one per line. The empty
-file contains zero patterns, and therefore matches nothing.
+Obtain patterns from @var{file}, one per line.
+The empty file contains zero patterns, and therefore matches nothing.
+(@samp{-f} is specified by @sc{posix}.)
@item -i
+@itemx -y
@itemx --ignore-case
@opindex -i
+@opindex -y
@opindex --ignore-case
@cindex case insensitive search
Ignore case distinctions in both the pattern and the input files.
+@samp{-y} is an obsolete synonym that is provided for compatibility.
+(@samp{-i} is specified by @sc{posix}.)
+
+@item -v
+@itemx --invert-match
+@opindex -v
+@opindex --invert-match
+@cindex invert matching
+@cindex print non-matching lines
+Invert the sense of matching, to select non-matching lines.
+(@samp{-v} is specified by @sc{posix}.)
+
+@item -w
+@itemx --word-regexp
+@opindex -w
+@opindex --word-regexp
+@cindex matching whole words
+Select only those lines containing matches that form whole words.
+The test is that the matching substring must either
+be at the beginning of the line,
+or preceded by a non-word-constituent character.
+Similarly,
+it must be either at the end of the line
+or followed by a non-word-constituent character.
+Word-constituent characters are letters, digits, and the underscore.
+
+@item -x
+@itemx --line-regexp
+@opindex -x
+@opindex --line-regexp
+@cindex match the whole line
+Select only those matches that exactly match the whole line.
+(@samp{-x} is specified by @sc{posix}.)
+
+@end table
+
+@node General Output Control
+@subsection General Output Control
+
+@table @samp
+
+@item -c
+@itemx --count
+@opindex -c
+@opindex --count
+@cindex counting lines
+Suppress normal output;
+instead print a count of matching lines for each input file.
+With the @samp{-v} or @samp{--invert-match} option,
+count non-matching lines.
+(@samp{-c} is specified by @sc{posix}.)
+
+@item --color[=@var{WHEN}]
+@itemx --colour[=@var{WHEN}]
+@opindex --color
+@opindex --colour
+@cindex highlight, color, colour
+Surround the matched (non-empty) strings, matching lines, context lines,
+file names, line numbers, byte offsets, and separators (for fields and
+groups of context lines) with escape sequences to display them in color
+on the terminal.
+The colors are defined by the environment variable @var{GREP_COLORS}
+and default to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
+for bold red matched text, magenta file names, green line numbers,
+green byte offsets, cyan separators, and default terminal colors otherwise.
+The deprecated environment variable @var{GREP_COLOR} is still supported,
+but its setting does not have priority;
+it defaults to @samp{01;31} (bold red)
+which only covers the color for matched text.
+@var{WHEN} is @samp{never}, @samp{always}, or @samp{auto}.
+
+@item -L
+@itemx --files-without-match
+@opindex -L
+@opindex --files-without-match
+@cindex files which don't match
+Suppress normal output;
+instead print the name of each input file from which
+no output would normally have been printed.
+The scanning of every file will stop on the first match.
@item -l
@itemx --files-with-matches
@opindex -l
@opindex --files-with-matches
@cindex names of matching files
-Suppress normal output; instead print the name of each input
-file from which output would normally have been printed.
+Suppress normal output;
+instead print the name of each input file from which
+output would normally have been printed.
The scanning of every file will stop on the first match.
+(@samp{-l} is specified by @sc{posix}.)
-@item -n
-@itemx --line-number
-@opindex -n
-@opindex --line-number
-@cindex line numbering
-Prefix each line of output with the line number within its input file.
+@item -m @var{num}
+@itemx --max-count=@var{num}
+@opindex -m
+@opindex --max-count
+@cindex max-count
+Stop reading a file after @var{num} matching lines.
+If the input is standard input from a regular file,
+and @var{num} matching lines are output,
+@command{grep} ensures that the standard input is positioned
+just after the last matching line before exiting,
+regardless of the presence of trailing context lines.
+This enables a calling process to resume a search.
+For example, the following shell script makes use of it:
+
+@example
+while grep -m 1 PATTERN
+do
+ echo xxxx
+done < FILE
+@end example
+
+But the following probably will not work because a pipe is not a regular
+file:
+
+@example
+# This probably will not work.
+cat FILE |
+while grep -m 1 PATTERN
+do
+ echo xxxx
+done
+@end example
+
+When @command{grep} stops after @var{num} matching lines,
+it outputs any trailing context lines.
+Since context does not include matching lines,
+@command{grep} will stop when it encounters another matching line.
+When the @samp{-c} or @samp{--count} option is also used,
+@command{grep} does not output a count greater than @var{num}.
+When the @samp{-v} or @samp{--invert-match} option is also used,
+@command{grep} stops after outputting @var{num} non-matching lines.
@item -o
@itemx --only-matching
@opindex -o
@opindex --only-matching
@cindex only matching
-Print only the non-empty parts of matching lines
-that actually match @var{pattern}.
+Print only the matched (non-empty) parts of matching lines,
+with each such part on a separate output line.
@item -q
@itemx --quiet
@@ -200,9 +381,11 @@ that actually match @var{pattern}.
@opindex --quiet
@opindex --silent
@cindex quiet, silent
-Quiet; do not write anything to standard output. Exit immediately with
-zero status if any match is found, even if an error was detected. Also
-see the @samp{-s} or @samp{--no-messages} option.
+Quiet; do not write anything to standard output.
+Exit immediately with zero status if any match is found,
+even if an error was detected.
+Also see the @samp{-s} or @samp{--no-messages} option.
+(@samp{-q} is specified by @sc{posix}.)
@item -s
@itemx --no-messages
@@ -210,33 +393,136 @@ see the @samp{-s} or @samp{--no-messages} option.
@opindex --no-messages
@cindex suppress error messages
Suppress error messages about nonexistent or unreadable files.
-Portability note: unlike @sc{gnu} @command{grep}, 7th Edition Unix
-@command{grep} did not conform to @sc{posix}, because it
-lacked @samp{-q} and its @samp{-s} option behaved like @sc{gnu}
-@command{grep}'s @samp{-q} option. @sc{usg}-style @command{grep}
-also lacked @samp{-q} but its @samp{-s} option behaved like
-@sc{gnu} @command{grep}. Portable shell scripts should avoid both
+Portability note:
+unlike @sc{gnu} @command{grep},
+7th Edition Unix @command{grep} did not conform to @sc{posix},
+because it lacked @samp{-q}
+and its @samp{-s} option behaved like
+@sc{gnu} @command{grep}'s @samp{-q} option.
+@sc{usg}-style @command{grep} also lacked @samp{-q}
+but its @samp{-s} option behaved like @sc{gnu} @command{grep}.
+Portable shell scripts should avoid both
@samp{-q} and @samp{-s} and should redirect
standard and error output to @file{/dev/null} instead.
+(@samp{-s} is specified by @sc{posix}.)
-@item -v
-@itemx --invert-match
-@opindex -v
-@opindex --invert-match
-@cindex invert matching
-@cindex print non-matching lines
-Invert the sense of matching, to select non-matching lines.
+@end table
-@item -x
-@itemx --line-regexp
-@opindex -x
-@opindex --line-regexp
-@cindex match the whole line
-Select only those matches that exactly match the whole line.
+@node Output Line Prefix Control
+@subsection Output Line Prefix Control
+
+When several prefix fields are to be output,
+the order is always file name, line number, and byte offset,
+regardless of the order in which these options were specified.
+
+@table @samp
+
+@item -b
+@itemx --byte-offset
+@opindex -b
+@opindex --byte-offset
+@cindex byte offset
+Print the 0-based byte offset within the input file
+before each line of output.
+If @samp{-o} (@samp{--only-matching}) is specified,
+print the offset of the matching part itself.
+When @command{grep} runs on @sc{ms-dos} or @sc{ms}-Windows,
+the printed byte offsets depend on whether
+the @samp{-u} (@samp{--unix-byte-offsets}) option is used;
+see below.
+
+@item -H
+@itemx --with-filename
+@opindex -H
+@opindex --with-filename
+@cindex with filename prefix
+Print the file name for each match.
+This is the default when there is more than one file to search.
+
+@item -h
+@itemx --no-filename
+@opindex -h
+@opindex --no-filename
+@cindex no filename prefix
+Suppress the prefixing of file names on output.
+This is the default when there is only one file
+(or only standard input) to search.
+
+@item --label=@var{LABEL}
+@opindex --label
+@cindex changing name of standard input
+Display input actually coming from standard input
+as input coming from file @var{LABEL}.
+This is especially useful for tools like @command{zgrep};
+e.g.:
+
+@example
+gzip -cd foo.gz | grep --label=foo something
+@end example
+
+@item -n
+@itemx --line-number
+@opindex -n
+@opindex --line-number
+@cindex line numbering
+Prefix each line of output with the 1-based line number within its input file.
+(@samp{-n} is specified by @sc{posix}.)
+
+@item -T
+@itemx --initial-tab
+@opindex -T
+@opindex --initial-tab
+@cindex tab-aligned content lines
+Make sure that the first character of actual line content lies on a tab stop,
+so that the alignment of tabs looks normal.
+This is useful with options that prefix their output to the actual content:
+@samp{-H}, @samp{-n}, and @samp{-b}.
+In order to improve the probability that lines
+from a single file will all start at the same column,
+this also causes the line number and byte offset (if present)
+to be printed in a minimum size field width.
+
+@item -u
+@itemx --unix-byte-offsets
+@opindex -u
+@opindex --unix-byte-offsets
+@cindex @sc{ms-dos}/@sc{ms}-Windows byte offsets
+@cindex byte offsets, on @sc{ms-dos}/@sc{ms}-Windows
+Report Unix-style byte offsets.
+This option causes @command{grep} to report byte offsets
+as if the file were a Unix-style text file,
+i.e., the byte offsets ignore the @code{CR} characters that were stripped.
+This will produce results identical
+to running @command{grep} on a Unix machine.
+This option has no effect unless the @samp{-b} option is also used;
+it has no effect on platforms other than @sc{ms-dos} and @sc{ms}-Windows.
+
+@item -Z
+@itemx --null
+@opindex -Z
+@opindex --null
+@cindex zero-terminated file names
+Output a zero byte (the @sc{ascii} @code{NUL} character)
+instead of the character that normally follows a file name.
+For example,
+@samp{grep -lZ} outputs a zero byte after each file name
+instead of the usual newline.
+This option makes the output unambiguous,
+even in the presence of file names containing unusual characters like newlines.
+This option can be used with commands like
+@samp{find -print0}, @samp{perl -0}, @samp{sort -z}, and @samp{xargs -0}
+to process arbitrary file names,
+even those that contain newline characters.
@end table
-@section @sc{gnu} Extensions
+@node Context Line Control
+@subsection Context Line Control
+
+Regardless of how these options are set,
+@command{grep} will never print any given line more than once.
+If the @samp{-o} or @samp{--only-matching} option is specified,
+these options have no effect and a warning is given upon their use.
@table @samp
@@ -257,86 +543,75 @@ Print @var{num} lines of trailing context after matching lines.
Print @var{num} lines of leading context before matching lines.
@item -C @var{num}
+@itemx -@var{num}
@itemx --context=@var{num}
@opindex -C
@opindex --context
+@opindex -@var{num}
@cindex context
-Print @var{num} lines of output context.
+Print @var{num} lines of leading and trailing output context.
-@item --color[=@var{WHEN}]
-@itemx --colour[=@var{WHEN}]
-@opindex --color
-@opindex --colour
-@cindex highlight, color, colour
-Surround the matching non-empty strings, matching lines, context lines,
-file names, line numbers, octet offsets, and separators (for fields and
-groups of context lines) with escape sequences to display them in color
-on the terminal.
-The colors are defined by the environment variable @var{GREP_COLORS}
-and default to `mt=01;31:ml=:cx=:fn=35:ln=32:bn=32:se=36' for bold red
-matched text, magenta file names, green line numbers, green octet offsets,
-cyan separators, and default terminal colors otherwise.
-The deprecated environment variable @var{GREP_COLOR} is still supported,
-but its setting does not have priority; it defaults to `01;31' (bold red)
-which only covers the color for matched text.
-@var{WHEN} is `never', `always', or `auto'.
+@end table
-@item -@var{num}
-@opindex -NUM
-Same as @samp{--context=@var{num}} lines of leading and trailing
-context. However, grep will never print any given line more than once.
+Matching lines normally use @samp{:} as a separator
+between prefix fields and actual line content.
+Context (i.e., non-matching) lines use @samp{-} instead.
+When no context is specified,
+matching lines are simply output one right after another.
+When nonzero context is specified,
+lines that are adjacent in the input form a group
+and are output one right after another,
+but disjoint groups of lines are separated by a @samp{--}
+without any prefix and on a line of its own.
+Each group may contain several matching lines
+when those lines are close enough to each other
+that their context regions touch or overlap,
+in which case the groups merge into a single contiguous one.
+
+@node File and Directory Selection
+@subsection File and Directory Selection
-@item -V
-@itemx --version
-@opindex -V
-@opindex --version
-@cindex Version, printing
-Print the version number of @command{grep} to the standard output stream.
-This version number should be included in all bug reports.
+@table @samp
-@item --help
-@opindex --help
-@cindex Usage summary, printing
-Print a usage message briefly summarizing these command-line options
-and the bug-reporting address, then exit.
+@item -a
+@itemx --text
+@opindex -a
+@opindex --text
+@cindex suppress binary data
+@cindex binary files
+Process a binary file as if it were text;
+this is equivalent to the @samp{--binary-files=text} option.
@itemx --binary-files=@var{type}
@opindex --binary-files
@cindex binary files
-If the first few bytes of a file indicate that the file contains binary
-data, assume that the file is of type @var{type}. By default,
-@var{type} is @samp{binary}, and @command{grep} normally outputs either
-a one-line message saying that a binary file matches, or no message if
-there is no match. If @var{type} is @samp{without-match},
+If the first few bytes of a file indicate that the file contains binary data,
+assume that the file is of type @var{type}.
+By default, @var{type} is @samp{binary},
+and @command{grep} normally outputs either
+a one-line message saying that a binary file matches,
+or no message if there is no match.
+If @var{type} is @samp{without-match},
@command{grep} assumes that a binary file does not match;
-this is equivalent to the @samp{-I} option. If @var{type}
-is @samp{text}, @command{grep} processes a binary file as if it were
-text; this is equivalent to the @samp{-a} option.
+this is equivalent to the @samp{-I} option.
+If @var{type} is @samp{text},
+@command{grep} processes a binary file as if it were text;
+this is equivalent to the @samp{-a} option.
@emph{Warning:} @samp{--binary-files=text} might output binary garbage,
-which can have nasty side effects if the output is a terminal and if the
-terminal driver interprets some of it as commands.
-
-@item -b
-@itemx --byte-offset
-@opindex -b
-@opindex --byte-offset
-@cindex byte offset
-Print the byte offset within the input file before each line of output.
-When @command{grep} runs on @sc{ms-dos} or MS-Windows, the printed
-byte offsets
-depend on whether the @samp{-u} (@samp{--unix-byte-offsets}) option is
-used; see below.
+which can have nasty side effects
+if the output is a terminal and
+if the terminal driver interprets some of it as commands.
@item -D @var{action}
@itemx --devices=@var{action}
@opindex -D
@opindex --devices
@cindex device search
-If an input file is a device, FIFO or socket, use @var{action} to process it.
-By default, @var{action} is @samp{read}, which means that devices are
-read just as if they were ordinary files.
-If @var{action} is @samp{skip}, devices, FIFOs and sockets are silently
-skipped.
+If an input file is a device, FIFO, or socket, use @var{action} to process it.
+By default, @var{action} is @samp{read},
+which means that devices are read just as if they were ordinary files.
+If @var{action} is @samp{skip},
+devices, FIFOs, and sockets are silently skipped.
@item -d @var{action}
@itemx --directories=@var{action}
@@ -344,74 +619,44 @@ skipped.
@opindex --directories
@cindex directory search
If an input file is a directory, use @var{action} to process it.
-By default, @var{action} is @samp{read}, which means that directories are
-read just as if they were ordinary files (some operating systems
-and filesystems disallow this, and will cause @command{grep} to print error
-messages for every directory or silently skip them). If @var{action} is
-@samp{skip}, directories are silently skipped. If @var{action} is
-@samp{recurse}, @command{grep} reads all files under each directory,
-recursively; this is equivalent to the @samp{-r} option.
-
-@item -H
-@itemx --with-filename
-@opindex -H
-@opindex --With-filename
-@cindex with filename prefix
-Print the filename for each match.
-
-@item -h
-@itemx --no-filename
-@opindex -h
-@opindex --no-filename
-@cindex no filename prefix
-Suppress the prefixing of filenames on output when multiple files are searched.
-
-@item --line-buffered
-@opindex --line-buffered
-@cindex line buffering
-Set the line buffering policy, this can be a performance penalty.
-
-@item --label=@var{LABEL}
-@opindex --label
-@cindex changing name of standard input
-Display input actually coming from standard input as input coming from file
-@var{LABEL}. This is especially useful for tools like @command{zgrep}, e.g.,
-@command{gzip -cd foo.gz | grep --label=foo something}
-
-@item -L
-@itemx --files-without-match
-@opindex -L
-@opindex --files-without-match
-@cindex files which don't match
-Suppress normal output; instead print the name of each input
-file from which no output would normally have been printed.
-The scanning of every file will stop on the first match.
-
-@item -a
-@itemx --text
-@opindex -a
-@opindex --text
-@cindex suppress binary data
-@cindex binary files
-Process a binary file as if it were text; this is equivalent to the
-@samp{--binary-files=text} option.
+By default, @var{action} is @samp{read},
+which means that directories are read just as if they were ordinary files
+(some operating systems and file systems disallow this,
+and will cause @command{grep}
+to print error messages for every directory or silently skip them).
+If @var{action} is @samp{skip}, directories are silently skipped.
+If @var{action} is @samp{recurse},
+@command{grep} reads all files under each directory, recursively;
+this is equivalent to the @samp{-r} option.
+
+@item --exclude=@var{glob}
+@opindex --exclude
+@cindex exclude files
+@cindex searching directory trees
+Skip files whose base name matches @var{glob}
+(using wildcard matching).
+A file-name glob can use
+@samp{*}, @samp{?}, and @samp{[}...@samp{]} as wildcards,
+and @code{\} to quote a wildcard or backslash character literally.
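As a rough illustration of base-name matching, the sketch below uses Python's standard @code{fnmatch} module as a stand-in for @command{grep}'s own glob matcher. The helper name @code{is_excluded} is hypothetical. Note one known difference: Python's @code{fnmatch} does not treat @code{\} as a quoting character, so this only approximates the behavior described above.

```python
import fnmatch
import os.path


def is_excluded(path, glob):
    """Return True if the base name of `path` matches `glob`.

    Only the final path component is tested, mirroring the
    "base name" wording of the option description.
    """
    return fnmatch.fnmatchcase(os.path.basename(path), glob)
```

With this sketch, a glob such as @samp{*.o} is tested against @file{main.o} rather than against @file{src/main.o}, so directory components never influence the result.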
+
+@item --exclude-from=@var{file}
+@opindex --exclude-from
+@cindex exclude files
+@cindex searching directory trees
+Skip files @emph{and directories} whose base name matches
+any of the file-name globs read from @var{file}
+(using wildcard matching as described under @samp{--exclude}).
@item -I
-Process a binary file as if it did not contain matching data; this is
-equivalent to the @samp{--binary-files=without-match} option.
+Process a binary file as if it did not contain matching data;
+this is equivalent to the @samp{--binary-files=without-match} option.
-@item -w
-@itemx --word-regexp
-@opindex -w
-@opindex --word-regexp
-@cindex matching whole words
-Select only those lines containing matches that form
-whole words. The test is that the matching substring
-must either be at the beginning of the line, or preceded
-by a non-word constituent character. Similarly,
-it must be either at the end of the line or followed by
-a non-word constituent character. Word-constituent
-characters are letters, digits, and the underscore.
+@item --include=@var{glob}
+@opindex --include
+@cindex include files
+@cindex searching directory trees
+Search only files whose base name matches @var{glob}
+(using wildcard matching as described under @samp{--exclude}).
@item -r
@itemx -R
@@ -420,139 +665,52 @@ characters are letters, digits, and the underscore.
@opindex --recursive
@cindex recursive search
@cindex searching directory trees
-For each directory mentioned in the command line, read and process all
-files in that directory, recursively. This is the same as the
-@samp{--directories=recurse} option.
+For each directory mentioned in the command line,
+read and process all files in that directory, recursively.
+This is the same as the @samp{--directories=recurse} option.
-@item --include=@var{file_pattern}
-@opindex --include
-@cindex include files
-@cindex searching directory trees
-Search only files matching @var{file_pattern}.
-
-@item --exclude=@var{file_pattern}
-@opindex --exclude
-@cindex exclude files
-@cindex searching directory trees
-Skip files @emph{and directories} matching @var{file_pattern}.
-
-@item -m @var{num}
-@itemx --max-count=@var{num}
-@opindex -m
-@opindex --max-count
-@cindex max-count
-Stop reading a file after @var{num} matching lines. If the input is
-standard input from a regular file, and @var{num} matching lines are
-output, @command{grep} ensures that the standard input is positioned to
-just after the last matching line before exiting, regardless of the
-presence of trailing context lines. This enables a calling process
-to resume a search. For example, the following shell script makes use
-of it:
-
-@example
-while grep -m 1 PATTERN
-do
- echo xxxx
-done < FILE
-@end example
+@end table
-But the following probably will not work because a pipe is not a regular
-file:
+@node Other Options
+@subsection Other Options
-@example
-# This probably will not work.
-cat FILE |
-while grep -m 1 PATTERN
-do
- echo xxxx
-done
-@end example
+@table @samp
-When @command{grep} stops after NUM matching lines, it outputs
-any trailing context lines. Since context does not include matching
-lines, @command{grep} will stop when it encounters another matching line.
-When the @samp{-c} or @samp{--count} option is also used,
-@command{grep} does not output a count greater than @var{num}.
-When the @samp{-v} or @samp{--invert-match} option is
-also used, @command{grep} stops after outputting @var{num}
-non-matching lines.
+@item --line-buffered
+@opindex --line-buffered
+@cindex line buffering
+Use line buffering on output.
+This can cause a performance penalty.
-@item -y
-@opindex -y
-@cindex case insensitive search, obsolete option
-Obsolete synonym for @samp{-i}.
+@item --mmap
+@opindex --mmap
+@cindex memory mapped input
+If possible, use the @code{mmap} system call to read input,
+instead of the default @code{read} system call.
+In some situations, @samp{--mmap} yields better performance.
+However, @samp{--mmap} can cause undefined behavior (including core dumps)
+if an input file shrinks while @command{grep} is operating,
+or if an I/O error occurs.
@item -U
@itemx --binary
@opindex -U
@opindex --binary
-@cindex DOS/Windows binary files
-@cindex binary files, DOS/Windows
-Treat the file(s) as binary. By default, under @sc{ms-dos}
-and MS-Windows, @command{grep} guesses the file type by looking
-at the contents of the first 32kB read from the file.
-If @command{grep} decides the file is a text file, it strips the
-@code{CR} characters from the original file contents (to make
-regular expressions with @code{^} and @code{$} work correctly).
-Specifying @samp{-U} overrules this guesswork, causing all
-files to be read and passed to the matching mechanism
-verbatim; if the file is a text file with @code{CR/LF} pairs
-at the end of each line, this will cause some regular
-expressions to fail. This option has no effect on platforms other than
-@sc{ms-dos} and MS-Windows.
-
-@item -u
-@itemx --unix-byte-offsets
-@opindex -u
-@opindex --unix-byte-offsets
-@cindex DOS byte offsets
-@cindex byte offsets, on DOS/Windows
-Report Unix-style byte offsets. This switch causes
-@command{grep} to report byte offsets as if the file were Unix style
-text file, i.e., the byte offsets ignore the @code{CR} characters which were
-stripped. This will produce results identical to running @command{grep} on
-a Unix machine. This option has no effect unless @samp{-b}
-option is also used; it has no effect on platforms other than @sc{ms-dos} and
-MS-Windows.
-
-@item --mmap
-@opindex --mmap
-@cindex memory mapped input
-If possible, use the @code{mmap} system call to read input, instead of
-the default @code{read} system call. In some situations, @samp{--mmap}
-yields better performance. However, @samp{--mmap} can cause undefined
-behavior (including core dumps) if an input file shrinks while
-@command{grep} is operating, or if an I/O error occurs.
-
-@item -T
-@itemx --initial-tab
-@opindex -T
-@opindex --initial-tab
-@cindex tab-aligned content lines
-Makes sure that the first character of actual line content lies on a
-tab stop, so that the alignment of tabs looks normal.
-This is useful when combined with @samp{-H} (which is implicit when
-there is more than one file to search), @samp{-n}, and @samp{-b};
-these options prepend their output at the beginning of the displayed
-line, before the actual content.
-In order to improve the probability that all matched or context lines
-from a single file will all start at the same column, this also causes
-the line number and octet offset (if present) to be printed in a minimum
-size field width.
-
-@item -Z
-@itemx --null
-@opindex -Z
-@opindex --null
-@cindex zero-terminated file names
-Output a zero byte (the @sc{ascii} @code{NUL} character) instead of the
-character that normally follows a file name. For example, @samp{grep
--lZ} outputs a zero byte after each file name instead of the usual
-newline. This option makes the output unambiguous, even in the presence
-of file names containing unusual characters like newlines. This option
-can be used with commands like @samp{find -print0}, @samp{perl -0},
-@samp{sort -z}, and @samp{xargs -0} to process arbitrary file names,
-even those that contain newline characters.
+@cindex @sc{ms-dos}/@sc{ms}-Windows binary files
+@cindex binary files, @sc{ms-dos}/@sc{ms}-Windows
+Treat the file(s) as binary.
+By default, under @sc{ms-dos} and @sc{ms}-Windows,
+@command{grep} guesses the file type
+by looking at the contents of the first 32kB read from the file.
+If @command{grep} decides the file is a text file,
+it strips the @code{CR} characters from the original file contents
+(to make regular expressions with @code{^} and @code{$} work correctly).
+Specifying @samp{-U} overrules this guesswork,
+causing all files to be read and passed to the matching mechanism verbatim;
+if the file is a text file with @code{CR/LF} pairs at the end of each line,
+this will cause some regular expressions to fail.
+This option has no effect
+on platforms other than @sc{ms-dos} and @sc{ms}-Windows.
@item -z
@itemx --null-data
@@ -560,150 +718,272 @@ even those that contain newline characters.
@opindex --null-data
@cindex zero-terminated lines
Treat the input as a set of lines, each terminated by a zero byte (the
-@sc{ascii} @code{NUL} character) instead of a newline. Like the @samp{-Z}
-or @samp{--null} option, this option can be used with commands like
+@sc{ascii} @code{NUL} character) instead of a newline.
+Like the @samp{-Z} or @samp{--null} option,
+this option can be used with commands like
@samp{sort -z} to process arbitrary file names.
@end table
-Several additional options control which variant of the @command{grep}
-matching engine is used. @xref{grep Programs}.
-
+@node Environment Variables
@section Environment Variables
The behavior of @command{grep} is affected
by the following environment variables.
-A locale @code{LC_@var{foo}} is specified by examining the three
-environment variables @env{LC_ALL}, @env{LC_@var{foo}}, and @env{LANG},
-in that order. The first of these variables that is set specifies the
-locale. For example, if @env{LC_ALL} is not set, but @env{LC_MESSAGES}
-is set to @samp{pt_BR}, then Brazilian Portuguese is used for the
-@code{LC_MESSAGES} locale. The C locale is used if none of these
-environment variables are set, or if the locale catalog is not
-installed, or if @command{grep} was not compiled with national language
-support (@sc{nls}).
+The locale for category @w{@code{LC_@var{foo}}}
+is specified by examining the three environment variables
+@env{LC_ALL}, @w{@env{LC_@var{foo}}}, and @env{LANG},
+in that order.
+The first of these variables that is set specifies the locale.
+For example, if @env{LC_ALL} is not set,
+but @env{LC_MESSAGES} is set to @samp{pt_BR},
+then the Brazilian Portuguese locale is used
+for the @code{LC_MESSAGES} category.
+The @samp{C} locale is used if none of these environment variables are set,
+if the locale catalog is not installed,
+or if @command{grep} was not compiled
+with national language support (@sc{nls}).
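The lookup order described above can be sketched as a small function. This is a minimal illustration, not @command{grep}'s actual code: the name @code{resolve_locale} is hypothetical, an empty value is assumed to count as unset, and the fallbacks for a missing locale catalog or a build without @sc{nls} are not modeled.

```python
def resolve_locale(category, env):
    """Pick the locale for `category` (e.g. "LC_MESSAGES") from `env`.

    Checks LC_ALL, then the category variable, then LANG, and
    falls back to the "C" locale when none of them is set.
    Assumption: an empty string is treated the same as unset.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"
```

For instance, passing @code{@{"LC_MESSAGES": "pt_BR"@}} for the @code{LC_MESSAGES} category yields @samp{pt_BR}, matching the Brazilian Portuguese example in the text.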
@cindex environment variables
@table @env
@item GREP_OPTIONS
-@vindex GREP_OPTIONS
+@vindex GREP_OPTIONS @r{environment variable}
@cindex default options environment variable
This variable specifies default options to be placed in front of any
-explicit options. For example, if @code{GREP_OPTIONS} is
+explicit options.
+For example, if @code{GREP_OPTIONS} is
@samp{--binary-files=without-match --directories=skip}, @command{grep}
behaves as if the two options @samp{--binary-files=without-match} and
@samp{--directories=skip} had been specified before
-any explicit options. Option specifications are separated by
-whitespace. A backslash escapes the next character, so it can be used to
+any explicit options.
+Option specifications are separated by
+whitespace.
+A backslash escapes the next character, so it can be used to
specify an option containing whitespace or a backslash.
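The splitting rule (whitespace separates options, a backslash escapes the next character) might be sketched as follows. The function name @code{split_grep_options} is hypothetical, and the handling of a trailing lone backslash (silently dropped here) is an assumption rather than documented behavior.

```python
def split_grep_options(value):
    """Split a GREP_OPTIONS-style string into a list of options.

    Unescaped whitespace separates options; a backslash makes the
    next character literal, so it can protect whitespace or
    another backslash.
    """
    options = []
    current = []
    in_token = False
    escaped = False
    for ch in value:
        if escaped:
            # Character after a backslash is always taken literally.
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
            in_token = True
        elif ch.isspace():
            if in_token:
                options.append("".join(current))
                current = []
                in_token = False
        else:
            current.append(ch)
            in_token = True
    if in_token:
        options.append("".join(current))
    return options
```

Applied to the value from the example above, this yields the two options @samp{--binary-files=without-match} and @samp{--directories=skip}, in that order.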
@item GREP_COLOR
-@vindex GREP_COLOR
+@vindex GREP_COLOR @r{environment variable}
@cindex highlight markers
-This variable is deprecated but still supported; its setting does not
-have priority over that of @code{GREP_COLORS}.
-It specifies the color used to highlight the matching non-empty text.
-The default is `01;31' which means bold red text on the default
-background.
+This variable specifies the color used to highlight matched (non-empty) text.
+It is deprecated in favor of @code{GREP_COLORS}, but still supported.
+The @samp{mt}, @samp{ms}, and @samp{mc} capabilities of @code{GREP_COLORS}
+have priority over it.
+It can only specify the color used to highlight
+the matching non-empty text in any matching line
+(a selected line when the @samp{-v} command-line option is omitted,
+or a context line when @samp{-v} is specified).
+The default is @samp{01;31},
+which means a bold red foreground text on the terminal's default background.
@item GREP_COLORS
-@vindex GREP_COLORS
+@vindex GREP_COLORS @r{environment variable}
@cindex highlight markers
-This variable specifies the colors used to highlight
-the matching non-empty text (mt), matching lines (ml), context lines (cx),
-file names (fn), line numbers (ln), octet offsets (bn),
-and separators (se, for fields and groups of context lines).
-It is a colon-separated list of color specification assignments.
-The default is `mt=01;31:ml=:cx=:fn=35:ln=32:bn=32:se=36' which means
-bold red, default, default, magenta, green, green, and cyan, all text
-foregrounds on the default background.
-Note that the `ml' setting, if any, remains in effect just before the
-`mt' setting kicks in.
-See the Select Graphic Rendition (SGR, for character attributes)
-section in the documentation of the text terminal that is used
-for permissible values (semicolon-separated lists of integers)
-and their meaning.
-@code{GREP_COLORS} also supports a boolean `ne' capability (with no
-`=...' part) to not clear to the end of line using Erase in Line (EL)
-to Right (`\33[K') each time a colorized item ends (needed on terminals
-on which EL is not supported; otherwise useful on terminals where the
-@code{back_color_erase} (@code{bce}) boolean terminfo capability is not
-specified, when the chosen highlight colors do not affect the background,
-or when EL is too slow to bother doing or causes too much flicker).
+This variable specifies the colors and other attributes
+used to highlight various parts of the output.
+Its value is a colon-separated list of capabilities
+that defaults to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
+with the @samp{rv} and @samp{ne} boolean capabilities omitted (i.e., false).
+Supported capabilities are as follows.
+
+@table @code
+@item sl=
+@vindex sl GREP_COLORS @r{capability}
+SGR substring for whole selected lines
+(i.e.,
+matching lines when the @samp{-v} command-line option is omitted,
+or non-matching lines when @samp{-v} is specified).
+If however the boolean @samp{rv} capability
+and the @samp{-v} command-line option are both specified,
+it applies to context matching lines instead.
+The default is empty (i.e., the terminal's default color pair).
+
+@item cx=
+@vindex cx GREP_COLORS @r{capability}
+SGR substring for whole context lines
+(i.e.,
+non-matching lines when the @samp{-v} command-line option is omitted,
+or matching lines when @samp{-v} is specified).
+If however the boolean @samp{rv} capability
+and the @samp{-v} command-line option are both specified,
+it applies to selected non-matching lines instead.
+The default is empty (i.e., the terminal's default color pair).
+
+@item rv
+@vindex rv GREP_COLORS @r{capability}
+Boolean value that reverses (swaps) the meanings of
+the @samp{sl=} and @samp{cx=} capabilities
+when the @samp{-v} command-line option is specified.
+The default is false (i.e., the capability is omitted).
+
+@item mt=01;31
+@vindex mt GREP_COLORS @r{capability}
+SGR substring for matching non-empty text in any matching line
+(i.e.,
+a selected line when the @samp{-v} command-line option is omitted,
+or a context line when @samp{-v} is specified).
+Setting this is equivalent to setting both @samp{ms=} and @samp{mc=}
+at once to the same value.
+The default is a bold red text foreground over the current line background.
+
+@item ms=01;31
+@vindex ms GREP_COLORS @r{capability}
+SGR substring for matching non-empty text in a selected line.
+(This is only used when the @samp{-v} command-line option is omitted.)
+The effect of the @samp{sl=} (or @samp{cx=} if @samp{rv}) capability
+remains active when this kicks in.
+The default is a bold red text foreground over the current line background.
+
+@item mc=01;31
+@vindex mc GREP_COLORS @r{capability}
+SGR substring for matching non-empty text in a context line.
+(This is only used when the @samp{-v} command-line option is specified.)
+The effect of the @samp{cx=} (or @samp{sl=} if @samp{rv}) capability
+remains active when this kicks in.
+The default is a bold red text foreground over the current line background.
+
+@item fn=35
+@vindex fn GREP_COLORS @r{capability}
+SGR substring for file names prefixing any content line.
+The default is a magenta text foreground over the terminal's default background.
+
+@item ln=32
+@vindex ln GREP_COLORS @r{capability}
+SGR substring for line numbers prefixing any content line.
+The default is a green text foreground over the terminal's default background.
+
+@item bn=32
+@vindex bn GREP_COLORS @r{capability}
+SGR substring for byte offsets prefixing any content line.
+The default is a green text foreground over the terminal's default background.
+
+@item se=36
+@vindex se GREP_COLORS @r{capability}
+SGR substring for separators that are inserted
+between selected line fields (@samp{:}),
+between context line fields (@samp{-}),
+and between groups of adjacent lines
+when nonzero context is specified (@samp{--}).
+The default is a cyan text foreground over the terminal's default background.
+
+@item ne
+@vindex ne GREP_COLORS @r{capability}
+Boolean value that prevents clearing to the end of line
+using Erase in Line (EL) to Right (@samp{\33[K})
+each time a colorized item ends.
+This is needed on terminals on which EL is not supported.
+It is otherwise useful on terminals
+for which the @code{back_color_erase}
+(@code{bce}) boolean terminfo capability does not apply,
+when the chosen highlight colors do not affect the background,
+or when EL is too slow or causes too much flicker.
+The default is false (i.e., the capability is omitted).
+@end table
+
+Note that boolean capabilities have no @samp{=}... part.
+They are omitted (i.e., false) by default and become true when specified.
+
+See the Select Graphic Rendition (SGR) section
+in the documentation of the text terminal that is used
+for permitted values and their meaning as character attributes.
+These substring values are integers in decimal representation
+and can be concatenated with semicolons.
+@command{grep} takes care of assembling the result
+into a complete SGR sequence (@samp{\33[}...@samp{m}).
+Common values to concatenate include
+@samp{1} for bold,
+@samp{4} for underline,
+@samp{5} for blink,
+@samp{7} for inverse,
+@samp{39} for default foreground color,
+@samp{30} to @samp{37} for foreground colors,
+@samp{90} to @samp{97} for 16-color mode foreground colors,
+@samp{38;5;0} to @samp{38;5;255}
+for 88-color and 256-color modes foreground colors,
+@samp{49} for default background color,
+@samp{40} to @samp{47} for background colors,
+@samp{100} to @samp{107} for 16-color mode background colors,
+and @samp{48;5;0} to @samp{48;5;255}
+for 88-color and 256-color modes background colors.
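To make the format concrete, the sketch below parses a @code{GREP_COLORS}-style value and wraps an SGR substring into a complete sequence. It is an illustration under stated assumptions: the names @code{parse_grep_colors} and @code{sgr_sequence} are hypothetical, unknown capability names are accepted without checking, and the rule that @samp{mt=} sets both @samp{ms=} and @samp{mc=} is not applied.

```python
# Default value as quoted in the text above.
DEFAULT_GREP_COLORS = "ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36"


def parse_grep_colors(value=DEFAULT_GREP_COLORS):
    """Parse a colon-separated capability list into a dict.

    Capabilities written as name=substring map to the substring
    (possibly empty); boolean capabilities such as "rv" and "ne"
    have no "=" part and map to True when present.
    """
    caps = {}
    for item in value.split(":"):
        if not item:
            continue
        name, sep, substring = item.partition("=")
        caps[name] = substring if sep else True
    return caps


def sgr_sequence(substring):
    """Wrap an SGR substring such as "01;31" as ESC [ ... m."""
    return "\033[" + substring + "m"
```

A boolean capability that is absent from the parsed result is simply false, which matches the defaults stated above for @samp{rv} and @samp{ne}.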
@item LC_ALL
@itemx LC_COLLATE
@itemx LANG
-@vindex LC_ALL
-@vindex LC_COLLATE
-@vindex LANG
+@vindex LC_ALL @r{environment variable}
+@vindex LC_COLLATE @r{environment variable}
+@vindex LANG @r{environment variable}
@cindex character type
@cindex national language support
@cindex NLS
-These variables specify the @code{LC_COLLATE} locale, which determines
-the collating sequence used to interpret range expressions like
-@samp{[a-z]}.
+These variables specify the locale for the @code{LC_COLLATE} category,
+which determines the collating sequence
+used to interpret range expressions like @samp{[a-z]}.
@item LC_ALL
@itemx LC_CTYPE
@itemx LANG
-@vindex LC_ALL
-@vindex LC_CTYPE
-@vindex LANG
+@vindex LC_ALL @r{environment variable}
+@vindex LC_CTYPE @r{environment variable}
+@vindex LANG @r{environment variable}
@cindex character type
@cindex national language support
@cindex NLS
-These variables specify the @code{LC_CTYPE} locale, which determines the
-type of characters, e.g., which characters are whitespace.
+These variables specify the locale for the @code{LC_CTYPE} category,
+which determines the type of characters,
+e.g., which characters are whitespace.
@item LC_ALL
@itemx LC_MESSAGES
@itemx LANG
-@vindex LC_ALL
-@vindex LC_MESSAGES
-@vindex LANG
+@vindex LC_ALL @r{environment variable}
+@vindex LC_MESSAGES @r{environment variable}
+@vindex LANG @r{environment variable}
@cindex language of messages
@cindex message language
@cindex national language support
@cindex NLS
@cindex translation of message language
-These variables specify the @code{LC_MESSAGES} locale, which determines
-the language that @command{grep} uses for messages. The default C
-locale uses American English messages.
+These variables specify the locale for the @code{LC_MESSAGES} category,
+which determines the language that @command{grep} uses for messages.
+The default @samp{C} locale uses American English messages.
@item POSIXLY_CORRECT
-@vindex POSIXLY_CORRECT
+@vindex POSIXLY_CORRECT @r{environment variable}
If set, @command{grep} behaves as @sc{posix.2} requires; otherwise,
-@command{grep} behaves more like other @sc{gnu} programs. @sc{posix.2}
+@command{grep} behaves more like other @sc{gnu} programs.
+@sc{posix.2}
requires that options that
-follow file names must be treated as file names; by default, such
-options are permuted to the front of the operand list and are treated as
-options. Also, @sc{posix.2} requires that unrecognized options be
-diagnosed as
-``illegal'', but since they are not really against the law the default
-is to diagnose them as ``invalid''. @code{POSIXLY_CORRECT} also
-disables @code{_@var{N}_GNU_nonoption_argv_flags_}, described below.
+follow file names must be treated as file names;
+by default,
+such options are permuted to the front of the operand list
+and are treated as options.
+Also,
+@sc{posix.2} requires that unrecognized options be diagnosed as ``illegal'',
+but since they are not really against the law the default
+is to diagnose them as ``invalid''.
+@code{POSIXLY_CORRECT} also disables @code{_@var{N}_GNU_nonoption_argv_flags_},
+described below.
@item _@var{N}_GNU_nonoption_argv_flags_
-@vindex _@var{N}_GNU_nonoption_argv_flags_
-(Here @code{@var{N}} is @command{grep}'s numeric process ID.) If the
-@var{i}th character of this environment variable's value is @samp{1}, do
-not consider the @var{i}th operand of @command{grep} to be an option, even if
-it appears to be one. A shell can put this variable in the environment
-for each command it runs, specifying which operands are the results of
-file name wildcard expansion and therefore should not be treated as
-options. This behavior is available only with the @sc{gnu} C library, and
-only when @code{POSIXLY_CORRECT} is not set.
+@vindex _@var{N}_GNU_nonoption_argv_flags_ @r{environment variable}
+(Here @code{@var{N}} is @command{grep}'s numeric process ID.)
+If the @var{i}th character of this environment variable's value is @samp{1},
+do not consider the @var{i}th operand of @command{grep} to be an option,
+even if it appears to be one.
+A shell can put this variable in the environment for each command it runs,
+specifying which operands are the results of file name wildcard expansion
+and therefore should not be treated as options.
+This behavior is available only with the @sc{gnu} C library,
+and only when @code{POSIXLY_CORRECT} is not set.
@end table
-@node Diagnostics
-@chapter Diagnostics
+@node Exit Status
+@chapter Exit Status
-Normally, exit status is 0 if selected lines are found and 1 otherwise.
+Normally, the exit status is 0 if selected lines are found and 1 otherwise.
But the exit status is 2 if an error occurred, unless the @option{-q} or
@option{--quiet} or @option{--silent} option is used and a selected line
is found.
@@ -712,12 +992,12 @@ for programs such as @command{grep}, @command{cmp}, and @command{diff},
that the exit status in case of error be greater than 1;
it is therefore advisable, for the sake of portability,
to use logic that tests for this general condition
-instead of strict equality with 2.
+instead of strict equality with@ 2.
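The portability advice above amounts to testing for ``greater than 1'' rather than ``equal to 2''. A minimal sketch, with a hypothetical helper name and return labels chosen only for illustration:

```python
def classify_exit_status(status):
    """Interpret the exit status of grep (or cmp, diff) portably.

    0 means selected lines were found and 1 means none were found.
    Anything else is treated as an error, so a status of 2 is not
    singled out.
    """
    if status == 0:
        return "selected"
    if status == 1:
        return "none"
    return "error"
```

Code written this way keeps working with an implementation that reports errors with, say, status 3.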
@node grep Programs
@chapter @command{grep} programs
-The @command{grep} command searches the named input files
+@command{grep} searches the named input files
(or standard input if no files are named,
or the file name @file{-} is given)
for lines containing a match to the given pattern.
@@ -732,14 +1012,16 @@ controlled by the following options.
@opindex -G
@opindex --basic-regexp
@cindex matching basic regular expressions
-Interpret the pattern as a basic regular expression. This is the default.
+Interpret the pattern as a basic regular expression (BRE).
+This is the default.
@item -E
@itemx --extended-regexp
@opindex -E
@opindex --extended-regexp
@cindex matching extended regular expressions
-Interpret the pattern as an extended regular expression.
+Interpret the pattern as an extended regular expression (ERE).
+(@samp{-E} is specified by @sc{posix}.)
@item -F
@itemx --fixed-strings
@@ -748,6 +1030,7 @@ Interpret the pattern as an extended regular expression.
@cindex matching fixed strings
Interpret the pattern as a list of fixed strings, separated
by newlines, any of which is to be matched.
+(@samp{-F} is specified by @sc{posix}.)
@item -P
@itemx --perl-regexp
@@ -755,6 +1038,8 @@ by newlines, any of which is to be matched.
@opindex --perl-regexp
@cindex matching Perl regular expressions
Interpret the pattern as a Perl regular expression.
+This is highly experimental and
+@samp{grep@ -P} may warn of unimplemented features.
@end table
@@ -774,16 +1059,32 @@ that rely on them to run unmodified.
A @dfn{regular expression} is a pattern that describes a set of strings.
Regular expressions are constructed analogously to arithmetic expressions,
by using various operators to combine smaller expressions.
-@command{grep} understands two different versions of regular expression
-syntax: ``basic''(BRE) and ``extended''(ERE). In @sc{gnu} @command{grep},
+@command{grep} understands
+two different versions of regular expression syntax:
+``basic'' (BRE) and ``extended'' (ERE).
+In @sc{gnu} @command{grep},
there is no difference in available functionality using either syntax.
In other implementations, basic regular expressions are less powerful.
The following description applies to extended regular expressions;
differences for basic regular expressions are summarized afterwards.
+@menu
+* Fundamental Structure::
+* Character Classes and Bracket Expressions::
+* The Backslash Character and Special Expressions::
+* Anchoring::
+* Back-references and Subexpressions::
+* Basic vs Extended::
+@end menu
+
+@node Fundamental Structure
+@section Fundamental Structure
+
The fundamental building blocks are the regular expressions that match
-a single character. Most characters, including all letters and digits,
-are regular expressions that match themselves. Any metacharacter
+a single character.
+Most characters, including all letters and digits,
+are regular expressions that match themselves.
+Any meta-character
with special meaning may be quoted by preceding it with a backslash.
A regular expression may be followed by one of several
@@ -800,153 +1101,173 @@ The period @samp{.} matches any single character.
@item ?
@opindex ?
@cindex question mark
-@cindex match sub-expression at most once
+@cindex match expression at most once
The preceding item is optional and will be matched at most once.
@item *
@opindex *
@cindex asterisk
-@cindex match sub-expression zero or more times
+@cindex match expression zero or more times
The preceding item will be matched zero or more times.
@item +
@opindex +
@cindex plus sign
+@cindex match expression one or more times
The preceding item will be matched one or more times.
@item @{@var{n}@}
-@opindex @{n@}
+@opindex @{@var{n}@}
@cindex braces, one argument
-@cindex match sub-expression n times
+@cindex match expression @var{n} times
The preceding item is matched exactly @var{n} times.
@item @{@var{n},@}
-@opindex @{n,@}
+@opindex @{@var{n},@}
@cindex braces, second argument omitted
-@cindex match sub-expression n or more times
-The preceding item is matched n or more times.
+@cindex match expression @var{n} or more times
+The preceding item is matched @var{n} or more times.
+
+@item @{,@var{m}@}
+@opindex @{,@var{m}@}
+@cindex braces, first argument omitted
+@cindex match expression at most @var{m} times
+The preceding item is matched at most @var{m} times.
@item @{@var{n},@var{m}@}
-@opindex @{n,m@}
+@opindex @{@var{n},@var{m}@}
@cindex braces, two arguments
+@cindex match expression from @var{n} to @var{m} times
The preceding item is matched at least @var{n} times, but not more than
@var{m} times.
@end table
-Two regular expressions may be concatenated; the resulting regular
-expression matches any string formed by concatenating two substrings
-that respectively match the concatenated subexpressions.
+Two regular expressions may be concatenated;
+the resulting regular expression
+matches any string formed by concatenating two substrings
+that respectively match the concatenated expressions.
-Two regular expressions may be joined by the infix operator @samp{|}; the
-resulting regular expression matches any string matching either subexpression.
+Two regular expressions may be joined by the infix operator @samp{|};
+the resulting regular expression
+matches any string matching either alternate expression.
-Repetition takes precedence over concatenation, which in turn
-takes precedence over alternation. A whole subexpression may be
-enclosed in parentheses to override these precedence rules.
+Repetition takes precedence over concatenation,
+which in turn takes precedence over alternation.
+A whole expression may be enclosed in parentheses
+to override these precedence rules and form a subexpression.
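The precedence rules can be checked with a few whole-string matches. The sketch uses Python's @code{re} module as a stand-in; it is not a @sc{posix} ERE engine, but for these simple patterns repetition, concatenation, alternation, and parentheses are assumed to behave the same way. The helper name @code{matches_whole} is hypothetical.

```python
import re


def matches_whole(pattern, text):
    """Return True if `pattern` matches all of `text`."""
    return re.fullmatch(pattern, text) is not None


# Repetition binds tighter than concatenation:
# "ab*" means "a" followed by any number of "b".
repeat_binds_to_b = matches_whole("ab*", "abbb") and not matches_whole("ab*", "abab")

# Parentheses form a subexpression, so the repetition covers "ab".
group_repeats = matches_whole("(ab)*", "abab")

# Concatenation binds tighter than alternation:
# "a|bc" means "a" or "bc", not "(a|b)c".
alternation_is_loosest = matches_whole("a|bc", "bc") and not matches_whole("a|bc", "ac")
```

Each of the three names evaluates to true under these rules; writing @samp{(a|b)c} instead of @samp{a|bc} is what makes @samp{ac} match.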
-@section Character Class
+@node Character Classes and Bracket Expressions
+@section Character Classes and Bracket Expressions
@cindex bracket expression
@cindex character class
A @dfn{bracket expression} is a list of characters enclosed by @samp{[} and
-@samp{]}. It matches any single character in that list; if the first
-character of the list is the caret @samp{^}, then it matches any character
-@strong{not} in the list. For example, the regular expression
+@samp{]}.
+It matches any single character in that list;
+if the first character of the list is the caret @samp{^},
+then it matches any character @strong{not} in the list.
+For example, the regular expression
@samp{[0123456789]} matches any single digit.
@cindex range expression
Within a bracket expression, a @dfn{range expression} consists of two
-characters separated by a hyphen. It matches any single character that
+characters separated by a hyphen.
+It matches any single character that
sorts between the two characters, inclusive, using the locale's
-collating sequence and character set. For example, in the default C
-locale, @samp{[a-d]} is equivalent to @samp{[abcd]}. Many locales sort
+collating sequence and character set.
+For example, in the default C
+locale, @samp{[a-d]} is equivalent to @samp{[abcd]}.
+Many locales sort
characters in dictionary order, and in these locales @samp{[a-d]} is
-typically not equivalent to @samp{[abcd]}; it might be equivalent to
-@samp{[aBbCcDd]}, for example. To obtain the traditional interpretation
-of bracket expressions, you can use the C locale by setting the
+typically not equivalent to @samp{[abcd]};
+it might be equivalent to @samp{[aBbCcDd]}, for example.
+To obtain the traditional interpretation
+of bracket expressions, you can use the @samp{C} locale by setting the
@env{LC_ALL} environment variable to the value @samp{C}.
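As a minimal sketch of the range behavior described above, the following commands force the @samp{C} locale so that @samp{[a-d]} covers only the four lower-case letters. The sample input is made up for illustration.

```shell
# Sample input: one candidate character per line (illustrative data).
input=$(printf '%s\n' a B d e)

# With LC_ALL=C, [a-d] is equivalent to [abcd], so only "a" and "d" match.
c_locale_matches=$(printf '%s\n' "$input" | LC_ALL=C grep '[a-d]')

echo "$c_locale_matches"
```

In a locale that sorts in dictionary order the same pattern might also select @samp{B}, which is why scripts that depend on the traditional ranges often set @env{LC_ALL} explicitly.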
Finally, certain named classes of characters are predefined within
bracket expressions, as follows.
-Their interpretation depends on the @code{LC_CTYPE} locale; the
-interpretation below is that of the C locale, which is the default
-if no @code{LC_CTYPE} locale is specified.
+Their interpretation depends on the @code{LC_CTYPE} locale;
+the interpretation below is that of the @samp{C} locale,
+which is the default if no @code{LC_CTYPE} locale is specified.
@cindex classes of characters
@cindex character classes
@table @samp
@item [:alnum:]
-@opindex alnum
+@opindex alnum @r{character class}
@cindex alphanumeric characters
Alphanumeric characters:
@samp{[:alpha:]} and @samp{[:digit:]}.
@item [:alpha:]
-@opindex alpha
+@opindex alpha @r{character class}
@cindex alphabetic characters
Alphabetic characters:
@samp{[:lower:]} and @samp{[:upper:]}.
@item [:blank:]
-@opindex blank
+@opindex blank @r{character class}
@cindex blank characters
Blank characters:
space and tab.
@item [:cntrl:]
-@opindex cntrl
+@opindex cntrl @r{character class}
@cindex control characters
-Control characters. In @sc{ascii}, these characters have octal codes 000
-through 037, and 177 (@code{DEL}). In other character sets, these are
+Control characters.
+In @sc{ascii}, these characters have octal codes 000
+through 037, and 177 (@code{DEL}).
+In other character sets, these are
the equivalent characters, if any.
@item [:digit:]
-@opindex digit
+@opindex digit @r{character class}
@cindex digit characters
@cindex numeric characters
Digits: @code{0 1 2 3 4 5 6 7 8 9}.
@item [:graph:]
-@opindex graph
+@opindex graph @r{character class}
@cindex graphic characters
Graphical characters:
@samp{[:alnum:]} and @samp{[:punct:]}.
@item [:lower:]
-@opindex lower
+@opindex lower @r{character class}
@cindex lower-case letters
Lower-case letters:
@code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
@item [:print:]
-@opindex print
+@opindex print @r{character class}
@cindex printable characters
Printable characters:
@samp{[:alnum:]}, @samp{[:punct:]}, and space.
@item [:punct:]
-@opindex punct
+@opindex punct @r{character class}
@cindex punctuation characters
Punctuation characters:
@code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
@item [:space:]
-@opindex space
+@opindex space @r{character class}
@cindex space characters
@cindex whitespace characters
Space characters:
tab, newline, vertical tab, form feed, carriage return, and space.
@item [:upper:]
-@opindex upper
+@opindex upper @r{character class}
@cindex upper-case letters
Upper-case letters:
@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
@item [:xdigit:]
-@opindex xdigit
+@opindex xdigit @r{character class}
@cindex xdigit class
@cindex hexadecimal digits
Hexadecimal digits:
@@ -954,18 +1275,19 @@ Hexadecimal digits:
@end table
For example, @samp{[[:alnum:]]} means @samp{[0-9A-Za-z]}, except the latter
-depends upon the C locale and the @sc{ascii} character
+depends upon the @samp{C} locale and the @sc{ascii} character
encoding, whereas the former is independent of locale and character set.
(Note that the brackets in these class names are
part of the symbolic names, and must be included in addition to
-the brackets delimiting the bracket list.)
+the brackets delimiting the bracket expression.)
-Most metacharacters lose their special meaning inside lists.
+Most meta-characters lose their special meaning inside bracket expressions.
@table @samp
@item ]
-ends the list if it's not the first list item. So, if you want to make
-the @samp{]} character a list item, you must put it first.
+ends the bracket expression if it's not the first list item.
+So, if you want to make the @samp{]} character a list item,
+you must put it first.
@item [.
represents the open collating symbol.
@@ -990,16 +1312,19 @@ represents the range if it's not first or last in a list or the ending point
of a range.
@item ^
-represents the characters not in the list. If you want to make the @samp{^}
+represents the characters not in the list.
+If you want to make the @samp{^}
character a list item, place it anywhere but first.
@end table
-@section Backslash Character
+@node The Backslash Character and Special Expressions
+@section The Backslash Character and Special Expressions
@cindex backslash
-The @samp{\} when followed by certain ordinary characters take a special
-meaning :
+The @samp{\} character,
+when followed by certain ordinary characters,
+takes a special meaning:
@table @samp
@@ -1019,49 +1344,56 @@ Match the empty string at the end of word.
Match word constituent, it is a synonym for @samp{[[:alnum:]]}.
@item @samp{\W}
-Match non word constituent, it is a synonym for @samp{[^[:alnum:]]}.
+Match non-word constituent, it is a synonym for @samp{[^[:alnum:]]}.
@end table
-For example , @samp{\brat\b} matches the separate word @samp{rat},
-@samp{c\Brat\Be} matches @samp{crate}, but @samp{dirty \Brat} doesn't
-match @samp{dirty rat}.
+For example, @samp{\brat\b} matches the separate word @samp{rat},
+and @samp{\Brat\B} matches @samp{crate} but not @samp{furry rat}.
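The word-boundary operators can be checked from the shell. This is a small sketch that assumes @sc{gnu} @command{grep}; the input lines are illustrative.

```shell
# \brat\b: "rat" must stand alone as a word.
whole_word=$(printf '%s\n' 'a rat here' 'crate' | grep '\brat\b')

# \Brat\B: "rat" must have word constituents on both sides.
inside_word=$(printf '%s\n' 'furry rat' 'crate' | grep '\Brat\B')

echo "$whole_word"
echo "$inside_word"
```

The first command selects only the line containing the separate word, and the second selects only @samp{crate}.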
+@node Anchoring
@section Anchoring
@cindex anchoring
-The caret @samp{^} and the dollar sign @samp{$} are metacharacters that
+The caret @samp{^} and the dollar sign @samp{$} are meta-characters that
respectively match the empty string at the beginning and end of a line.
-@section Back-reference
+@node Back-references and Subexpressions
+@section Back-references and Subexpressions
+@cindex subexpression
@cindex back-reference
The back-reference @samp{\@var{n}}, where @var{n} is a single digit, matches
the substring previously matched by the @var{n}th parenthesized subexpression
-of the regular expression. For example, @samp{(a)\1} matches @samp{aa}.
-When use with alternation if the group does not participate in the match, then
-the back-reference makes the whole match fail. For example, @samp{a(.)|b\1}
-will not match @samp{ba}. When multiple regular expressions are given with
-@samp{-e} or from a file @samp{-f file}, the back-referecences are local to
-each expression.
-
-@section Basic vs Extended
+of the regular expression.
+For example, @samp{(a)\1} matches @samp{aa}.
+When used with alternation, if the group does not participate in the match, then
+the back-reference makes the whole match fail.
+For example, @samp{a(.)|b\1}
+will not match @samp{ba}.
+When multiple regular expressions are given with
+@samp{-e} or from a file (@samp{-f file}),
+back-references are local to each expression.
+
+@node Basic vs Extended
+@section Basic vs Extended Regular Expressions
@cindex basic regular expressions
-In basic regular expressions the metacharacters @samp{?}, @samp{+},
+In basic regular expressions the meta-characters @samp{?}, @samp{+},
@samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
@samp{\|}, @samp{\(}, and @samp{\)}.
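The difference can be seen with the @samp{+} operator. The sketch below assumes @sc{gnu} @command{grep}, since the backslashed form @samp{\+} in a basic regular expression is an extension; the two input lines are illustrative.

```shell
# In a BRE, a bare + is an ordinary character: this matches the literal text "a+b".
bre_literal=$(printf '%s\n' 'a+b' 'aab' | grep 'a+b')

# The backslashed form \+ acts as the repetition operator in a GNU BRE.
bre_repeat=$(printf '%s\n' 'a+b' 'aab' | grep 'a\+b')

# With -E (ERE), the bare + is the repetition operator.
ere_repeat=$(printf '%s\n' 'a+b' 'aab' | grep -E 'a+b')

echo "$bre_literal"
echo "$bre_repeat"
echo "$ere_repeat"
```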
@cindex interval specifications
-Traditional @command{egrep} did not support the @samp{@{} metacharacter,
+Traditional @command{egrep} did not support the @samp{@{} meta-character,
and some @command{egrep} implementations support @samp{\@{} instead, so
portable scripts should avoid @samp{@{} in @samp{grep@ -E} patterns and
should use @samp{[@{]} to match a literal @samp{@{}.
@sc{gnu} @command{grep@ -E} attempts to support traditional usage by
assuming that @samp{@{} is not special if it would be the start of an
-invalid interval specification. For example, the shell command
+invalid interval specification.
+For example, the command
@samp{grep@ -E@ '@{1'} searches for the two-character string @samp{@{1}
instead of reporting a syntax error in the regular expression.
@sc{posix.2} allows this behavior as an extension, but portable scripts
@@ -1071,7 +1403,7 @@ should avoid it.
@chapter Usage
@cindex Usage, examples
-Here is an example shell command that invokes @sc{gnu} @command{grep}:
+Here is an example command that invokes @sc{gnu} @command{grep}:
@example
grep -i 'hello.*world' menu.h main.c
@@ -1081,9 +1413,11 @@ grep -i 'hello.*world' menu.h main.c
This lists all lines in the files @file{menu.h} and @file{main.c} that
contain the string @samp{hello} followed by the string @samp{world};
this is because @samp{.*} matches zero or more characters within a line.
-@xref{Regular Expressions}. The @samp{-i} option causes @command{grep}
+@xref{Regular Expressions}.
+The @samp{-i} option causes @command{grep}
to ignore case, causing it to match the line @samp{Hello, world!}, which
-it would not otherwise match. @xref{Invoking}, for more details about
+it would not otherwise match.
+@xref{Invoking}, for more details about
how to invoke @command{grep}.
@cindex Using @command{grep}, Q&A
@@ -1111,25 +1445,32 @@ grep -r 'hello' /home/gigi
@end example
@noindent
-searches for @samp{hello} in all files under the directory
-@file{/home/gigi}. For more control of which files are searched, use
-@command{find}, @command{grep} and @command{xargs}. For example,
-the following command searches only C files:
+searches for @samp{hello} in all files
+under the @file{/home/gigi} directory.
+For more control over which files are searched,
+use @command{find}, @command{grep}, and @command{xargs}.
+For example, the following command searches only C files:
-@smallexample
-find /home/gigi -name '*.c' -print | xargs grep 'hello' /dev/null
-@end smallexample
+@example
+find /home/gigi -name '*.c' -print0 | xargs -0r grep -H 'hello'
+@end example
This differs from the command:
@example
-grep -r 'hello' *.c
+grep -rH 'hello' *.c
@end example
which merely looks for @samp{hello} in all files in the current
-directory whose names end in @samp{.c}. Here the @option{-r} is
+directory whose names end in @samp{.c}.
+Here the @option{-r} is
probably unnecessary, as recursion occurs only in the unlikely event
that one of @samp{.c} files is a directory.
+The @samp{find ...} command line above is more similar to the command:
+
+@example
+grep -rH --include='*.c' 'hello' /home/gigi
+@end example
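To see that the two approaches select the same files, here is a sketch that builds a throwaway directory tree and compares their output. It assumes @sc{gnu} @command{grep} and @sc{gnu} findutils; the file names are hypothetical.

```shell
# Scratch tree with one C file in a subdirectory and one non-C file.
tmp=$(mktemp -d)
mkdir "$tmp/src"
printf 'say hello\n' > "$tmp/src/main.c"
printf 'hello there\n' > "$tmp/notes.txt"

# Both commands should report only the match in main.c.
via_find=$(find "$tmp" -name '*.c' -print0 | xargs -0r grep -H 'hello' | sort)
via_include=$(grep -rH --include='*.c' 'hello' "$tmp" | sort)

echo "$via_find"
echo "$via_include"
rm -rf "$tmp"
```

The @command{sort} step is only there to make the comparison independent of directory traversal order.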
@item
What if a pattern has a leading @samp{-}?
@@ -1139,7 +1480,8 @@ grep -e '--cut here--' *
@end example
@noindent
-searches for all lines matching @samp{--cut here--}. Without @samp{-e},
+searches for all lines matching @samp{--cut here--}.
+Without @samp{-e},
@command{grep} would attempt to parse @samp{--cut here--} as a list of
options.
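A short sketch of this, using a made-up file in a scratch directory. The second form relies on the conventional @samp{--} argument that ends option parsing, which is not mentioned above and is shown here only as an alternative.

```shell
# Scratch file containing a line that starts with dashes (hypothetical name).
tmp=$(mktemp -d)
printf '%s\n' 'before' '--cut here--' 'after' > "$tmp/note.txt"

# -e marks the next argument as the pattern, even though it starts with '-'.
with_e=$(grep -e '--cut here--' "$tmp/note.txt")

# '--' ends option parsing, so the following argument is taken as the pattern.
with_dashes=$(grep -- '--cut here--' "$tmp/note.txt")

echo "$with_e"
echo "$with_dashes"
rm -rf "$tmp"
```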
@@ -1151,9 +1493,11 @@ grep -w 'hello' *
@end example
@noindent
-searches only for instances of @samp{hello} that are entire words; it
-does not match @samp{Othello}. For more control, use @samp{\<} and
-@samp{\>} to match the start and end of words. For example:
+searches only for instances of @samp{hello} that are entire words;
+it does not match @samp{Othello}.
+For more control, use @samp{\<} and
+@samp{\>} to match the start and end of words.
+For example:
@example
grep 'hello\>' *
@@ -1174,7 +1518,7 @@ grep -C 2 'hello' *
prints two lines of context around each matching line.
@item
-How do I force grep to print the name of the file?
+How do I force @command{grep} to print the name of the file?
Append @file{/dev/null}:
@@ -1184,9 +1528,15 @@ grep 'eli' /etc/passwd /dev/null
gets you:
-@smallexample
-/etc/passwd:eli:DNGUTF58.IMe.:98:11:Eli Smith:/home/do/eli:/bin/bash
-@end smallexample
+@example
+/etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
+@end example
+
+Alternatively, use @samp{-H}, which is a @sc{gnu} extension:
+
+@example
+grep -H 'eli' /etc/passwd
+@end example
@item
Why do people use strange regular expressions on @command{ps} output?
@@ -1198,8 +1548,9 @@ ps -ef | grep '[c]ron'
If the pattern had been written without the square brackets, it would
have matched not only the @command{ps} output line for @command{cron},
but also the @command{ps} output line for @command{grep}.
-Note that some platforms @command{ps} limit the ouput to the width
-of the screen, grep does not have any limit on the length of a line
+Note that on some platforms,
+@command{ps} limits the output to the width of the screen;
+@command{grep} does not have any limit on the length of a line
except the available memory.
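The effect can be reproduced without a running @command{cron} by feeding @command{grep} two simulated @command{ps} lines. The process listing below is invented for illustration and only approximates real @command{ps} output.

```shell
# Plain pattern: the line for the grep process itself also contains "cron".
plain=$(printf '%s\n' \
  'root   812    1  0 09:00 ?      00:00:00 /usr/sbin/cron' \
  'eli   4321 4300  0 09:05 pts/0  00:00:00 grep cron' \
  | grep -c 'cron')

# Bracketed pattern: the grep process now shows "[c]ron" on its command line,
# which does not contain the four consecutive characters "cron".
bracketed=$(printf '%s\n' \
  'root   812    1  0 09:00 ?      00:00:00 /usr/sbin/cron' \
  'eli   4321 4300  0 09:05 pts/0  00:00:00 grep [c]ron' \
  | grep -c '[c]ron')

echo "plain=$plain bracketed=$bracketed"
```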
@item
@@ -1207,18 +1558,22 @@ Why does @command{grep} report ``Binary file matches''?
If @command{grep} listed all matching ``lines'' from a binary file, it
would probably generate output that is not useful, and it might even
-muck up your display. So @sc{gnu} @command{grep} suppresses output from
-files that appear to be binary files. To force @sc{gnu} @command{grep}
+muck up your display.
+So @sc{gnu} @command{grep} suppresses output from
+files that appear to be binary files.
+To force @sc{gnu} @command{grep}
to output lines even from files that appear to be binary, use the
-@samp{-a} or @samp{--binary-files=text} option. To eliminate the
+@samp{-a} or @samp{--binary-files=text} option.
+To eliminate the
``Binary file matches'' messages, use the @samp{-I} or
@samp{--binary-files=without-match} option.
@item
-Why doesn't @samp{grep -lv} print nonmatching file names?
+Why doesn't @samp{grep -lv} print non-matching file names?
@samp{grep -lv} lists the names of all files containing one or more
-lines that do not match. To list the names of all files that contain no
+lines that do not match.
+To list the names of all files that contain no
matching lines, use the @samp{-L} or @samp{--files-without-match}
option.
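The distinction is easiest to see with three small files; the sketch below uses hypothetical file names in a scratch directory.

```shell
tmp=$(mktemp -d)
printf 'hello\nhello again\n' > "$tmp/all.txt"    # every line matches
printf 'hello\ngoodbye\n'     > "$tmp/mixed.txt"  # some lines match
printf 'goodbye\n'            > "$tmp/none.txt"   # no line matches

# -lv: files that contain at least one non-matching line.
lv=$(cd "$tmp" && grep -lv 'hello' all.txt mixed.txt none.txt)

# -L: files that contain no matching line at all.
big_l=$(cd "$tmp" && grep -L 'hello' all.txt mixed.txt none.txt)

echo "$lv"
echo "$big_l"
rm -rf "$tmp"
```

Here @samp{-lv} lists both @file{mixed.txt} and @file{none.txt}, while @samp{-L} lists only @file{none.txt}.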
@@ -1245,8 +1600,9 @@ cat /etc/passwd | grep 'alain' - /etc/motd
@cindex palindromes
How to express palindromes in a regular expression?
-It can be done by using the back referecences, for example a palindrome
-of 4 chararcters can be written in BRE.
+It can be done by using back-references;
+for example,
+a palindrome of 5 characters can be written with a BRE:
@example
grep -w -e '\(.\)\(.\).\2\1' file
@@ -1254,15 +1610,16 @@ grep -w -e '\(.\)\(.\).\2\1' file
It matches the word "radar" or "civic".
-Guglielmo Bondioni proposed a single RE that finds all the palindromes up to 19
-characters long.
+Guglielmo Bondioni proposed a single RE
+that finds all palindromes up to 19 characters long
+using @w{9 subexpressions} and @w{9 back-references}:
-@example
+@smallexample
grep -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file
-@end example
+@end smallexample
-Note this is done by using GNU ERE extensions, it might not be portable on
-other greps.
+Note this is done by using @sc{gnu} ERE extensions;
+it might not be portable to other implementations of @command{grep}.
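As a quick check of the shorter BRE shown earlier in this answer, the sketch below runs it over a few sample words. The pattern has two groups, one middle character, and two back-references, so it selects five-letter palindromes; the word list is illustrative.

```shell
# One word per line; only the five-letter palindromes should be selected.
words=$(printf '%s\n' radar civic level hello)
palindromes=$(printf '%s\n' "$words" | grep -w -e '\(.\)\(.\).\2\1')

echo "$palindromes"
```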
@item
Why is this back-reference failing?
@@ -1273,16 +1630,17 @@ echo 'ba' | grep -E '(a)\1|b\1'
This gives no output, because the first alternate @samp{(a)\1} does not match,
as there is no @samp{aa} in the input, so the @samp{\1} in the second alternate
-has nothing to refer back to, meaning it will never match anything. (The
-second alternate in this example can only match if the first alternate has
-matched -- making the second one superfluous.)
+has nothing to refer back to, meaning it will never match anything.
+(The second alternate in this example can only match
+if the first alternate has matched -- making the second one superfluous.)
@item
What do @command{grep}, @command{fgrep}, and @command{egrep} stand for?
-The name @command{grep} comes from the way line editing was done on Unix. For
-example, @command{ed} uses the following syntax to print a list of matching
-lines on the screen:
+The name @command{grep} comes from the way line editing was done on Unix.
+For example,
+@command{ed} uses the following syntax
+to print a list of matching lines on the screen:
@example
global/regular expression/print
@@ -1298,12 +1656,21 @@ g/re/p
@chapter Reporting bugs
@cindex Bugs, reporting
-Email bug reports to @email{bug-grep@@gnu.org}.
+Email bug reports to @email{bug-grep@@gnu.org},
+a mailing list whose web page is
+@url{http://lists.gnu.org/mailman/listinfo/bug-grep}.
+The Savannah bug tracker for @command{grep} is located at
+@url{http://savannah.gnu.org/bugs/?group=grep}.
+
+@section Known Bugs
+@cindex Bugs, known
Large repetition counts in the @samp{@{n,m@}} construct may cause
-@command{grep} to use lots of memory. In addition, certain other
+@command{grep} to use lots of memory.
+In addition, certain other
obscure regular expressions require exponential time and
-space, and may cause grep to run out of memory.
+space, and may cause @command{grep} to run out of memory.
+
Back-references are very slow, and may require exponential time.
@node Copying, GNU General Public License, Reporting Bugs, Top
@@ -1313,7 +1680,7 @@ GNU grep is licensed under the GNU GPL, which makes it @dfn{free
software}.
Please note that ``free'' in ``free software'' refers to liberty, not
-price. As some GNU project advocates like to point out, think of ``free
+price. As some GNU project advocates like to point out, think of ``free
speech'' rather than ``free beer''. The exact and legally binding
distribution terms are spelled out below; in short, you have the right
(freedom) to run and change grep and distribute it to other people, and
@@ -1368,8 +1735,8 @@ The full texts of the GNU General Public License and of the GNU Free
Documentation License are available below.
@menu
-* GNU General Public License:: GNU GPL
-* GNU Free Documentation License:: GNU FDL
+* GNU General Public License:: GNU GPL
+* GNU Free Documentation License:: GNU FDL
@end menu
@node GNU General Public License, GNU Free Documentation License, Copying, Copying
@@ -1385,7 +1752,7 @@ Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
@end display
-@unnumberedsec Preamble
+@unnumberedsubsec Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
@@ -1436,7 +1803,7 @@ patent must be licensed for everyone's free use or not licensed at all.
modification follow.
@iftex
-@unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+@unnumberedsubsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
@end iftex
@ifinfo
@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
@@ -1699,7 +2066,7 @@ POSSIBILITY OF SUCH DAMAGES.
@end ifinfo
@page
-@unnumberedsec How to Apply These Terms to Your New Programs
+@unnumberedsubsec How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
@@ -2109,7 +2476,7 @@ as a draft) by the Free Software Foundation.
@end enumerate
-@unnumberedsec ADDENDUM: How to use this License for your documents
+@unnumberedsubsec ADDENDUM: How to use this License for your documents
To use this License in a document you have written, include a copy of
the License in the document and put the following copyright and
@@ -2142,8 +2509,13 @@ to permit their use in free software.
@node Concept Index, Index, GNU Free Documentation License, Top
@unnumbered Concept Index
-This is a general index of all issues discussed in this manual, with the
-exception of the @command{grep} commands and command-line options.
+This is a general index of all issues discussed in this manual,
+with the exception of all specific @command{grep}
+command-line options,
+environment variables,
+color capabilities,
+and regular expression constructs,
+which are covered in their own index.
@printindex cp
@@ -2151,8 +2523,11 @@ exception of the @command{grep} commands and command-line options.
@node Index,, Concept Index, Top
@unnumbered Index
-This is an alphabetical list of all @command{grep} commands, command-line
-options, and environment variables.
+This is a lexicographical list of all @command{grep}
+command-line options,
+environment variables,
+color capabilities,
+and regular expression constructs.
@printindex fn