diff options
Diffstat (limited to 'doc/gperf_8.html')
-rw-r--r-- | doc/gperf_8.html | 60 |
1 files changed, 24 insertions, 36 deletions
diff --git a/doc/gperf_8.html b/doc/gperf_8.html index 862b5ec..977e2cb 100644 --- a/doc/gperf_8.html +++ b/doc/gperf_8.html @@ -1,63 +1,51 @@ <HTML> <HEAD> <!-- This HTML file has been created by texi2html 1.51 - from gperf.texi on 15 April 1998 --> + from gperf.texi on 20 August 2000 --> -<TITLE>User's Guide to gperf - 5 Known Bugs and Limitations with gperf</TITLE> +<TITLE>Perfect Hash Function Generator - 6 Things Still Left to Do</TITLE> </HEAD> <BODY> Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_7.html">previous</A>, <A HREF="gperf_9.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. <P><HR><P> -<H1><A NAME="SEC20" HREF="gperf_toc.html#TOC20">5 Known Bugs and Limitations with <CODE>gperf</CODE></A></H1> +<H1><A NAME="SEC21" HREF="gperf_toc.html#TOC21">6 Things Still Left to Do</A></H1> <P> -The following are some limitations with the current release of -<CODE>gperf</CODE>: +It should be "relatively" easy to replace the current perfect hash +function algorithm with a more exhaustive approach; the perfect hash +module is essential independent from other program modules. Additional +worthwhile improvements include: </P> <UL> <LI> -The <CODE>gperf</CODE> utility is tuned to execute quickly, and works quickly -for small to medium size data sets (around 1000 keywords). It is -extremely useful for maintaining perfect hash functions for compiler -keyword sets. Several recent enhancements now enable <CODE>gperf</CODE> to -work efficiently on much larger keyword sets (over 15,000 keywords). -When processing large keyword sets it helps greatly to have over 8 megs -of RAM. - -However, since <CODE>gperf</CODE> does not backtrack no guaranteed solution -occurs on every run. On the other hand, it is usually easy to obtain a -solution by varying the option parameters. In particular, try the -<SAMP>`-r'</SAMP> option, and also try changing the default arguments to the -<SAMP>`-s'</SAMP> and <SAMP>`-j'</SAMP> options. To <EM>guarantee</EM> a solution, use -the <SAMP>`-D'</SAMP> and <SAMP>`-S'</SAMP> options, although the final results are not -likely to be a <EM>perfect</EM> hash function anymore! Finally, use the -<SAMP>`-f'</SAMP> option if you want <CODE>gperf</CODE> to generate the perfect hash -function <EM>fast</EM>, with less emphasis on making it minimal. +Make the algorithm more robust. At present, the program halts with an +error diagnostic if it can't find a direct solution and the <SAMP>`-D'</SAMP> +option is not enabled. A more comprehensive, albeit computationally +expensive, approach would employ backtracking or enable alternative +options and retry. It's not clear how helpful this would be, in +general, since most search sets are rather small in practice. <LI> -The size of the generate static keyword array can get <EM>extremely</EM> -large if the input keyword file is large or if the keywords are quite -similar. This tends to slow down the compilation of the generated C -code, and <EM>greatly</EM> inflates the object code size. If this -situation occurs, consider using the <SAMP>`-S'</SAMP> option to reduce data -size, potentially increasing keyword recognition time a negligible -amount. Since many C compilers cannot correctly generated code for -large switch statements it is important to qualify the <VAR>-S</VAR> option -with an appropriate numerical argument that controls the number of -switch statements generated. +Another useful extension involves modifying the program to generate +"minimal" perfect hash functions (under certain circumstances, the +current version can be rather extravagant in the generated table size). +Again, this is mostly of theoretical interest, since a sparse table +often produces faster lookups, and use of the <SAMP>`-S'</SAMP> <CODE>switch</CODE> +option can minimize the data size, at the expense of slightly longer +lookups (note that the gcc compiler generally produces good code for +<CODE>switch</CODE> statements, reducing the need for more complex schemes). <LI> -The maximum number of key positions selected for a given key has an -arbitrary limit of 126. This restriction should be removed, and if -anyone considers this a problem write me and let me know so I can remove -the constraint. +In addition to improving the algorithm, it would also be useful to +generate a C++ class or Ada package as the code output, in addition to +the current C routines. </UL> <P><HR><P> |