author Paolo Bonzini <bonzini@gnu.org> 2011-06-29 08:21:24 +0200
committer Paolo Bonzini <bonzini@gnu.org> 2011-06-29 08:26:58 +0200
commit 406ab56cdd977fbd05216392ac655dd7204a307a
tree 02449d7d3b98924d24bdf8e5450011a674587181 /doc
parent 3c3bdace487c2c961ab3126d9a573af29c449c8b
doc: improve documentation of character classes
* doc/grep.texi (Character classes): Mention explicitly when examples refer to the C locale, explain better the general meaning of character classes.
Diffstat (limited to 'doc')
-rw-r--r-- doc/grep.texi 26
1 files changed, 13 insertions, 13 deletions
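The equivalence this patch documents for the C locale can be checked with a short shell sketch. It is only an illustration, not part of the commit: the sample input is made up, and it assumes a POSIX shell with a grep that supports bracket-expression character classes. Forcing LC_ALL=C pins the locale so that @samp{[[:alnum:]]} and @samp{[0-9A-Za-z]} should select the same lines; in other locales the two may differ.

```shell
# Hypothetical sample input: three alphanumeric lines, one line of
# punctuation, one line of spaces, and one mixed line.
sample=$(printf '%s\n' 'abc' 'XYZ' '123' '---' '   ' 'a-1')

# Select lines containing at least one alphanumeric character,
# once with the named class and once with the explicit ranges.
# LC_ALL=C forces the C locale for both grep invocations.
by_class=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:alnum:]]')
by_range=$(printf '%s\n' "$sample" | LC_ALL=C grep '[0-9A-Za-z]')

# Compare the two selections.
if [ "$by_class" = "$by_range" ]; then
  echo "same"
else
  echo "different"
fi
```

Note the doubled brackets in the first pattern: the inner pair belongs to the class name, and the outer pair delimits the bracket expression.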
diff --git a/doc/grep.texi b/doc/grep.texi
index 7c80f8c2..b1b879a1 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -1171,8 +1171,8 @@ of bracket expressions, you can use the @samp{C} locale by setting the
Finally, certain named classes of characters are predefined within
bracket expressions, as follows.
Their interpretation depends on the @code{LC_CTYPE} locale;
-the interpretation below is that of the @samp{C} locale,
-which is the default if no @code{LC_CTYPE} locale is specified.
+for example, [[:alnum:]] means the character class of numbers and letters
+in the current locale.
@cindex classes of characters
@cindex character classes
@@ -1182,13 +1182,13 @@ which is the default if no @code{LC_CTYPE} locale is specified.
@opindex alnum @r{character class}
@cindex alphanumeric characters
Alphanumeric characters:
-@samp{[:alpha:]} and @samp{[:digit:]}.
+@samp{[:alpha:]} and @samp{[:digit:]}; in the @samp{C} locale and @sc{ascii} character encoding, this is the same as @samp{[0-9A-Za-z]}.
@item [:alpha:]
@opindex alpha @r{character class}
@cindex alphabetic characters
Alphabetic characters:
-@samp{[:lower:]} and @samp{[:upper:]}.
+@samp{[:lower:]} and @samp{[:upper:]}; in the @samp{C} locale and @sc{ascii} character encoding, this is the same as @samp{[A-Za-z]}.
@item [:blank:]
@opindex blank @r{character class}
@@ -1220,7 +1220,8 @@ Graphical characters:
@item [:lower:]
@opindex lower @r{character class}
@cindex lower-case letters
-Lower-case letters:
+Lower-case letters; in the @samp{C} locale and @sc{ascii} character
+encoding, this is
@code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
@item [:print:]
@@ -1232,21 +1233,23 @@ Printable characters:
@item [:punct:]
@opindex punct @r{character class}
@cindex punctuation characters
-Punctuation characters:
+Punctuation characters; in the @samp{C} locale and @sc{ascii} character
+encoding, this is
@code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
@item [:space:]
@opindex space @r{character class}
@cindex space characters
@cindex whitespace characters
-Space characters:
+Space characters: in the @samp{C} locale, this is
tab, newline, vertical tab, form feed, carriage return, and space.
@xref{Usage}, for more discussion of matching newlines.
@item [:upper:]
@opindex upper @r{character class}
@cindex upper-case letters
-Upper-case letters:
+Upper-case letters: in the @samp{C} locale and @sc{ascii} character
+encoding, this is
@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
@item [:xdigit:]
@@ -1257,12 +1260,9 @@ Hexadecimal digits:
@code{0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f}.
@end table
-For example, @samp{[[:alnum:]]} means @samp{[0-9A-Za-z]}, except the latter
-depends upon the @samp{C} locale and the @sc{ascii} character
-encoding, whereas the former is independent of locale and character set.
-(Note that the brackets in these class names are
+Note that the brackets in these class names are
part of the symbolic names, and must be included in addition to
-the brackets delimiting the bracket expression.)
+the brackets delimiting the bracket expression.
@anchor{invalid-bracket-expr}
If you mistakenly omit the outer brackets, and search for say, @samp{[:upper:]},