Document the use of NULs.

author: Bruno Haible <bruno@clisp.org> 2000-08-20 17:20:23 +0000
committer: Bruno Haible <bruno@clisp.org> 2000-08-20 17:20:23 +0000
commit: 1ad4108b348049bf3a4da86178fa66f5d36c6b3a (patch)
tree: df43c5baf63c8607273a5a718fcb21ac2fde4bdf /doc/gperf.texi
parent: c0eb5203949ad864470076ec23b6f348639fad2d (diff)
download: gperf-1ad4108b348049bf3a4da86178fa66f5d36c6b3a.tar.gz
1 files changed, 38 insertions, 10 deletions
diff --git a/doc/gperf.texi b/doc/gperf.texi
index 2b4caf6..93f1f23 100644
--- a/doc/gperf.texi
+++ b/doc/gperf.texi
@@ -115,6 +115,7 @@ High-Level Description of GNU @code{gperf}
 
 * Input Format::                Input Format to @code{gperf}
 * Output Format::               Output Format for Generated C Code with @code{gperf}
+* Binary Strings::              Use of NUL characters
 
 Input Format to @code{gperf}
 
@@ -259,6 +260,7 @@ efficiently identify their respective reserved keywords.
 @menu
 * Input Format::                Input Format to @code{gperf}
 * Output Format::               Output Format for Generated C Code with @code{gperf}
+* Binary Strings::              Use of NUL characters
 @end menu
 
 The perfect hash function generator @code{gperf} reads a set of
@@ -327,9 +329,9 @@ arbitrary C declarations and definitions, as well as provisions for
 providing a user-supplied @code{struct}.  If the @samp{-t} option
 @emph{is} enabled, you @emph{must} provide a C @code{struct} as the last
 component in the declaration section from the keyfile file.  The first
-field in this struct must be a @code{char *} identifier called @samp{name},
-although it is possible to modify this field's name with the @samp{-K}
-option described below.
+field in this struct must be a @code{char *} or @code{const char *}
+identifier called @samp{name}, although it is possible to modify this
+field's name with the @samp{-K} option described below.
 
 Here is a simple example, using months of the year and their attributes as
 input:
@@ -406,15 +408,18 @@ in the first column is considered a comment.  Everything following the
 @samp{#} is ignored, up to and including the following newline.
 
 The first field of each non-comment line is always the key itself.  It
-should be given as a simple name, i.e., without surrounding
-string quotation marks, and be left-justified flush against the first
-column.  In this context, a ``field'' is considered to extend up to, but
+can be given in two ways: as a simple name, i.e., without surrounding
+string quotation marks, or as a string enclosed in double-quotes, in
+C syntax, possibly with backslash escapes like @code{\"} or @code{\234}
+or @code{\xa8}. In either case, it must start right at the beginning
+of the line, without leading whitespace.
+In this context, a ``field'' is considered to extend up to, but
 not include, the first blank, comma, or newline.  Here is a simple
 example taken from a partial list of C reserved words:
 
 @example
 @group
-# These are a few C reserved words, see the c.@code{gperf} file 
+# These are a few C reserved words, see the c.gperf file 
 # for a complete list of ANSI C reserved words.
 unsigned
 sizeof
@@ -449,7 +454,7 @@ file, is included verbatim into the generated output file.  Naturally,
 it is your responsibility to ensure that the code contained in this
 section is valid C.
 
-@node Output Format,  , Input Format, Description
+@node Output Format, Binary Strings, Input Format, Description
 @section Output Format for Generated C Code with @code{gperf}
 @cindex hash table
 
@@ -509,6 +514,28 @@ with the various input and output options, and timing the resulting C
 code, you can determine the best option choices for different keyword
 set characteristics.
 
+@node Binary Strings,  , Output Format, Description
+@section Use of NUL characters
+@cindex NUL
+
+By default, the code generated by @code{gperf} operates on zero
+terminated strings, the usual representation of strings in C. This means
+that the keywords in the input file must not contain NUL characters,
+and the @var{str} argument passed to @code{hash} or @code{in_word_set}
+must be NUL terminated and have exactly length @var{len}.
+
+If option @samp{-c} is used, then the @var{str} argument does not need
+to be NUL terminated. The code generated by @code{gperf} will only
+access the first @var{len}, not @var{len+1}, bytes starting at @var{str}.
+However, the keywords in the input file still must not contain NUL
+characters.
+
+If option @samp{-l} is used, then the hash table performs binary
+comparison. The keywords in the input file may contain NUL characters,
+written in string syntax as @code{\000} or @code{\x00}, and the code
+generated by @code{gperf} will treat NUL like any other character.
+Also, in this case the @samp{-c} option is ignored.
+
 @node Options, Bugs, Description, Top
 @chapter Invoking @code{gperf}
 
@@ -636,8 +663,8 @@ solely consist of 7-bit ASCII characters (characters in the range 0..127).
 (Note that the ANSI C functions @code{isalnum} and @code{isgraph} do
 @emph{not} guarantee that a character is in this range. Only an explicit
 test like @samp{c >= 'A' && c <= 'Z'} guarantees this.) This was the
-default in earlier versions of @code{gperf}; now the default is to assume
-8-bit characters.
+default in versions of @code{gperf} earlier than 2.7; now the default is
+to assume 8-bit characters.
 
 @item -c
 @itemx --compare-strncmp
@@ -731,6 +758,7 @@ However, using @samp{-l} might greatly increase the size of the
 generated C code if the lookup table range is large (which implies that
 the switch option @samp{-S} is not enabled), since the length table
 contains as many elements as there are entries in the lookup table.
+This option is mandatory for binary comparisons (@pxref{Binary Strings}).
 
 @item -D
 @itemx --duplicates
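
The three cases described in the new "Use of NUL characters" section can be illustrated with a small C sketch. This is not code emitted by gperf. The helper names (match_default, match_strncmp, match_binary, demo_nul_handling) and the explicit keyword-length parameter are assumptions made for illustration; they only mimic the comparison each mode implies: NUL-terminated comparison by default, a comparison bounded by len with -c, and a length check plus bytewise comparison with -l.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; gperf's real output differs. */

/* Default mode: str and the keyword are both NUL terminated. */
static int match_default(const char *str, size_t len, const char *kw)
{
    (void)len;                      /* len is implied by the terminator */
    return strcmp(str, kw) == 0;
}

/* -c mode: only the first len bytes of str are read, so str need not be
   NUL terminated. The keyword itself is still a NUL-terminated string. */
static int match_strncmp(const char *str, size_t len, const char *kw)
{
    return strlen(kw) == len && strncmp(str, kw, len) == 0;
}

/* -l mode: lengths are compared first, then the bytes are compared
   without giving NUL any special meaning. */
static int match_binary(const char *str, size_t len,
                        const char *kw, size_t kwlen)
{
    return len == kwlen && memcmp(str, kw, len) == 0;
}

/* Small self-check showing how the modes differ. */
void demo_nul_handling(void)
{
    /* Four bytes, deliberately without a trailing NUL. */
    const char unterminated[4] = { 'e', 'l', 's', 'e' };
    assert(match_strncmp(unterminated, 4, "else"));
    assert(!match_strncmp(unterminated, 3, "else"));

    /* A keyword with an embedded NUL: 'a', NUL, 'b'. */
    static const char binary_kw[] = "a\0b";
    const size_t binary_len = sizeof binary_kw - 1;   /* 3 bytes */

    /* Bytewise comparison sees all three bytes. */
    assert(match_binary("a\0b", 3, binary_kw, binary_len));
    assert(!match_binary("a\0c", 3, binary_kw, binary_len));

    /* NUL-terminated comparison stops at the first NUL, so it cannot
       tell the binary keyword apart from plain "a". */
    assert(match_default(binary_kw, binary_len, "a"));
}
```

The last assertion shows why keywords containing NUL need the length-based comparison: a terminator-based comparison sees only the bytes before the first NUL.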