Diffstat (limited to 'ghc/runtime/regex/ChangeLog')
-rw-r--r-- ghc/runtime/regex/ChangeLog 3041
1 files changed, 3041 insertions, 0 deletions
diff --git a/ghc/runtime/regex/ChangeLog b/ghc/runtime/regex/ChangeLog
new file mode 100644
index 0000000000..c16096a838
--- /dev/null
+++ b/ghc/runtime/regex/ChangeLog
@@ -0,0 +1,3041 @@
+Tue Apr 25 10:51:27 1995 Sigbjorn Finne <sof@dcs.gla.ac.uk>
+
+ * Merged in the regex.c and regex.h of gawk-2.15.6, following a
+ suggestion on gnu.utils.bugs
+
+ * regex.h: Added defines for Perl syntax, RE_PERL_MULTILINE_SYNTAX
+ and RE_PERL_SINGLELINE_SYNTAX
+
+ * regex.c (regex_compile): Added handling of Perl operators,
+ nothing exciting - just different syntax for common operators.
+
+Fri Apr 2 17:31:59 1993 Jim Blandy (jimb@totoro.cs.oberlin.edu)
+
+ * Released version 0.12.
+
+ * regex.c (regerror): If errcode is zero, that's not a valid
+ error code, according to POSIX, but return "Success."
+
+ * regex.c (regerror): Remember to actually fetch the message
+ from re_error_msg.
+
+ * regex.c (regex_compile): Don't use the trick for ".*\n" on
+ ".+\n". Since the latter involves laying an extra choice
+ point, the backward jump isn't adjusted properly.
+
+Thu Mar 25 21:35:18 1993 Jim Blandy (jimb@totoro.cs.oberlin.edu)
+
+ * regex.c (regex_compile): In the handle_open and handle_close
+ sections, clear pending_exact to zero.
+
+Tue Mar 9 12:03:07 1993 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)
+
+ * regex.c (re_search_2): In the loop which searches forward
+ using fastmap, don't forget to cast the character from the
+ string to an unsigned before using it as an index into the
+ translate map.
+
+Thu Jan 14 15:41:46 1993 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu)
+
+ * regex.h: Never define const; let the callers do it.
+ configure.in: Don't define USING_AUTOCONF.
+
+Wed Jan 6 20:49:29 1993 Jim Blandy (jimb@geech.gnu.ai.mit.edu)
+
+ * regex.c (regerror): Abort if ERRCODE is out of range.
+
+Sun Dec 20 16:19:10 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
+
+ * configure.in: Arrange to #define USING_AUTOCONF.
+ * regex.h: If USING_AUTOCONF is #defined, don't mess with
+ `const' at all; autoconf has taken care of it.
+
+Mon Dec 14 21:40:39 1992 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu)
+
+ * regex.h (RE_SYNTAX_AWK): Fix typo. From Arnold Robbins.
+
+Sun Dec 13 20:35:39 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
+
+ * regex.c (compile_range): Fetch the range start and end by
+ casting the pattern pointer to an `unsigned char *' before
+ fetching through it.
+
+Sat Dec 12 09:41:01 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
+
+ * regex.c: Undo change of 12/7/92; it's better for Emacs to
+ #define HAVE_CONFIG_H.
+
+Fri Dec 11 22:00:34 1992 Jim Meyering (meyering@hal.gnu.ai.mit.edu)
+
+ * regex.c: Define and use isascii-protected ctype.h macros.
+
+Fri Dec 11 05:10:38 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)
+
+ * regex.c (re_match_2): Undo Karl's November 10th change; it
+ keeps the group in :\(.*\) from matching :/ properly.
+
+Mon Dec 7 19:44:56 1992 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)
+
+ * regex.c: #include config.h if either HAVE_CONFIG_H or emacs
+ is #defined.
+
+Tue Dec 1 13:33:17 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)
+
+ * regex.c [HAVE_CONFIG_H]: Include config.h.
+
+Wed Nov 25 23:46:02 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)
+
+ * regex.c (regcomp): Add parens around bitwise & for clarity.
+ Initialize preg->allocated to prevent segv.
+
+Tue Nov 24 09:22:29 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)
+
+ * regex.c: Use HAVE_STRING_H, not USG.
+ * configure.in: Check for string.h, not USG.
+
+Fri Nov 20 06:33:24 1992 Karl Berry (karl@cs.umb.edu)
+
+ * regex.c (SIGN_EXTEND_CHAR) [VMS]: Back out of this change,
+ since Roland Roberts now says it was a localism.
+
+Mon Nov 16 07:01:36 1992 Karl Berry (karl@cs.umb.edu)
+
+ * regex.h (const) [!HAVE_CONST]: Test another cpp symbol (from
+ Autoconf) before zapping const.
+
+Sun Nov 15 05:36:42 1992 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)
+
+ * regex.c, regex.h: Changes for VMS from Roland B Roberts
+ <roberts@nsrl31.nsrl.rochester.edu>.
+
+Thu Nov 12 11:31:15 1992 Karl Berry (karl@cs.umb.edu)
+
+ * Makefile.in (distfiles): Include INSTALL.
+
+Tue Nov 10 09:29:23 1992 Karl Berry (karl@cs.umb.edu)
+
+ * regex.c (re_match_2): At maybe_pop_jump, if at end of string
+ and pattern, just quit the matching loop.
+
+ * regex.c (LETTER_P): Rename to `WORDCHAR_P'.
+
+ * regex.c (AT_STRINGS_{BEG,END}): Take `d' as an arg; change
+ callers.
+
+ * regex.c (re_match_2) [!emacs]: In wordchar and notwordchar
+ cases, advance d.
+
+Wed Nov 4 15:43:58 1992 Karl Berry (karl@hal.gnu.ai.mit.edu)
+
+ * regex.h (const) [!__STDC__]: Don't define if it's already defined.
+
+Sat Oct 17 19:28:19 1992 Karl Berry (karl@cs.umb.edu)
+
+ * regex.c (bcmp, bcopy, bzero): Only #define if they are not
+ already #defined.
+
+ * configure.in: Use AC_CONST.
+
+Thu Oct 15 08:39:06 1992 Karl Berry (karl@cs.umb.edu)
+
+ * regex.h (const) [!const]: Conditionalize.
+
+Fri Oct 2 13:31:42 1992 Karl Berry (karl@cs.umb.edu)
+
+ * regex.h (RE_SYNTAX_ED): New definition.
+
+Sun Sep 20 12:53:39 1992 Karl Berry (karl@cs.umb.edu)
+
+ * regex.[ch]: remove traces of `longest_p' -- dumb idea to put
+ this into the pattern buffer, as it means parallelism loses.
+
+ * Makefile.in (config.status): use sh to run configure --no-create.
+
+ * Makefile.in (realclean): OK, don't remove configure.
+
+Sat Sep 19 09:05:08 1992 Karl Berry (karl@hayley)
+
+ * regex.c (PUSH_FAILURE_POINT, POP_FAILURE_POINT) [DEBUG]: keep
+ track of how many failure points we push and pop.
+ (re_match_2) [DEBUG]: declare variables for that, and print results.
+ (DEBUG_PRINT4): new macro.
+
+ * regex.h (re_pattern_buffer): new field `longest_p' (to
+ eliminate backtracking if the user doesn't need it).
+ * regex.c (re_compile_pattern): initialize it (to 1).
+ (re_search_2): set it to zero if register information is not needed.
+ (re_match_2): if it's set, don't backtrack.
+
+ * regex.c (re_search_2): update fastmap only after checking that
+ the pattern is anchored.
+
+ * regex.c (re_match_2): do more debugging at maybe_pop_jump.
+
+ * regex.c (re_search_2): cast result of TRANSLATE for use in
+ array subscript.
+
+Thu Sep 17 19:47:16 1992 Karl Berry (karl@geech.gnu.ai.mit.edu)
+
+ * Version 0.11.
+
+Wed Sep 16 08:17:10 1992 Karl Berry (karl@hayley)
+
+ * regex.c (INIT_FAIL_STACK): rewrite as statements instead of a
+ complicated comma expr, to avoid compiler warnings (and also
+ simplify).
+ (re_compile_fastmap, re_match_2): change callers.
+
+ * regex.c (POP_FAILURE_POINT): cast pop of regstart and regend
+ to avoid compiler warnings.
+
+ * regex.h (RE_NEWLINE_ORDINARY): remove this syntax bit, and
+ remove uses.
+ * regex.c (at_{beg,end}line_loc_p): go the last mile: remove
+ the RE_NEWLINE_ORDINARY case which made the ^ in \n^ be an anchor.
+
+Tue Sep 15 09:55:29 1992 Karl Berry (karl@hayley)
+
+ * regex.c (at_begline_loc_p): new fn.
+ (at_endline_loc_p): simplify at_endline_op_p.
+ (regex_compile): in ^/$ cases, call the above.
+
+ * regex.c (POP_FAILURE_POINT): rewrite the fn as a macro again,
+ as lord's profiling indicates the function is 20% of the time.
+ (re_match_2): callers changed.
+
+ * configure.in (AC_MEMORY_H): remove, since we never use memcpy et al.
+
+Mon Sep 14 17:49:27 1992 Karl Berry (karl@hayley)
+
+ * Makefile.in (makeargs): include MFLAGS.
+
+Sun Sep 13 07:41:45 1992 Karl Berry (karl@hayley)
+
+ * regex.c (regex_compile): in \1..\9 case, make it always
+ invalid to use \<digit> if there is no preceding <digit>th subexpr.
+ * regex.h (RE_NO_MISSING_BK_REF): remove this syntax bit.
+
+ * regex.c (regex_compile): remove support for invalid empty groups.
+ * regex.h (RE_NO_EMPTY_GROUPS): remove this syntax bit.
+
+ * regex.c (FREE_VARIABLES) [!REGEX_MALLOC]: define as alloca (0),
+ to reclaim memory.
+
+ * regex.h (RE_SYNTAX_POSIX_SED): don't bother with this.
+
+Sat Sep 12 13:37:21 1992 Karl Berry (karl@hayley)
+
+ * README: incorporate emacs.diff.
+
+ * regex.h (_RE_ARGS) [!__STDC__]: define as empty parens.
+
+ * configure.in: add AC_ALLOCA.
+
+ * Put test files in subdir test, documentation in subdir doc.
+ Adjust Makefile.in and configure.in accordingly.
+
+Thu Sep 10 10:29:11 1992 Karl Berry (karl@hayley)
+
+ * regex.h (RE_SYNTAX_{POSIX_,}SED): new definitions.
+
+Wed Sep 9 06:27:09 1992 Karl Berry (karl@hayley)
+
+ * Version 0.10.
+
+Tue Sep 8 07:32:30 1992 Karl Berry (karl@hayley)
+
+ * xregex.texinfo: put the day of month into the date.
+
+ * Makefile.in (realclean): remove Texinfo-generated files.
+ (distclean): remove empty sorted index files.
+ (clean): remove dvi files, etc.
+
+ * configure.in: test for more Unix variants.
+
+ * fileregex.c: new file.
+ Makefile.in (fileregex): new target.
+
+ * iregex.c (main): move variable decls to smallest scope.
+
+ * regex.c (FREE_VARIABLES): free reg_{,info_}dummy.
+ (re_match_2): check that the allocation for those two succeeded.
+
+ * regex.c (FREE_VAR): replace FREE_NONNULL with this.
+ (FREE_VARIABLES): call it.
+ (re_match_2) [REGEX_MALLOC]: initialize all our vars to NULL.
+
+ * tregress.c (do_match): generalize simple_match.
+ (SIMPLE_NONMATCH): new macro.
+ (SIMPLE_MATCH): change from routine.
+
+ * Makefile.in (regex.texinfo): make file readonly, so we don't
+ edit it by mistake.
+
+ * many files (re_default_syntax): rename to `re_syntax_options';
+ call re_set_syntax instead of assigning to the variable where
+ possible.
+
+Mon Sep 7 10:12:16 1992 Karl Berry (karl@hayley)
+
+ * syntax.skel: don't use prototypes.
+
+ * {configure,Makefile}.in: new files.
+
+ * regex.c: include <string.h> `#if USG || STDC_HEADERS'; remove
+ obsolete test for `POSIX', and test for BSTRING.
+ Include <strings.h> if we are not USG or STDC_HEADERS.
+ Do not include <unistd.h>. What did we ever need that for?
+
+ * regex.h (RE_NO_EMPTY_ALTS): remove this.
+ (RE_SYNTAX_AWK): remove from here, too.
+ * regex.c (regex_compile): remove the check.
+ * xregex.texinfo (Alternation Operator): update.
+ * other.c (test_others): remove tests for this.
+
+ * regex.h (RE_DUP_MAX): undefine if already defined.
+
+ * regex.h: (RE_SYNTAX_POSIX*): redo to allow more operators, and
+ define new syntaxes with the minimal set.
+
+ * syntax.skel (main): used sscanf instead of scanf.
+
+ * regex.h (RE_SYNTAX_*GREP): new definitions from mike.
+
+ * regex.c (regex_compile): initialize the upper bound of
+ intervals at the beginning of the interval, not the end.
+ (From pclink@qld.tne.oz.au.)
+
+ * regex.c (handle_bar): rename to `handle_alt', for consistency.
+
+ * regex.c ({store,insert}_{op1,op2}): new routines (except the last).
+ ({STORE,INSERT}_JUMP{,2}): macros to replace the old routines,
+ which took arguments in different orders, and were generally weird.
+
+ * regex.c (PAT_PUSH*): rename to `BUF_PUSH*' -- we're not
+ appending info to the pattern!
+
+Sun Sep 6 11:26:49 1992 Karl Berry (karl@hayley)
+
+ * regex.c (regex_compile): delete the variable
+ `following_left_brace', since we never use it.
+
+ * regex.c (print_compiled_pattern): don't print the fastmap if
+ it's null.
+
+ * regex.c (re_compile_fastmap): handle
+ `on_failure_keep_string_jump' like `on_failure_jump'.
+
+ * regex.c (re_match_2): in `charset{,_not}' case, cast the bit
+ count to unsigned, not unsigned char, in case we have a full
+ 32-byte bit list.
+
+ * tregress.c (simple_match): remove.
+ (simple_test): rename as `simple_match'.
+ (simple_compile): print the error string if the compile failed.
+
+ * regex.c (DO_RANGE): rewrite as a function, `compile_range', so
+ we can debug it. Change pattern characters to unsigned char
+ *'s, and change the range variable to an unsigned.
+ (regex_compile): change calls.
+
+Sat Sep 5 17:40:49 1992 Karl Berry (karl@hayley)
+
+ * regex.h (_RE_ARGS): new macro to put in argument lists (if
+ ANSI) or omit them (if K&R); don't declare routines twice.
+
+ * many files (obscure_syntax): rename to `re_default_syntax'.
+
+Fri Sep 4 09:06:53 1992 Karl Berry (karl@hayley)
+
+ * GNUmakefile (extraclean): new target.
+ (realclean): delete the info files.
+
+Wed Sep 2 08:14:42 1992 Karl Berry (karl@hayley)
+
+ * regex.h: doc fix.
+
+Sun Aug 23 06:53:15 1992 Karl Berry (karl@hayley)
+
+ * regex.[ch] (re_comp): no const in the return type (from djm).
+
+Fri Aug 14 07:25:46 1992 Karl Berry (karl@hayley)
+
+ * regex.c (DO_RANGE): declare variables as unsigned chars, not
+ signed chars (from jimb).
+
+Wed Jul 29 18:33:53 1992 Karl Berry (karl@claude.cs.umb.edu)
+
+ * Version 0.9.
+
+ * GNUmakefile (distclean): do not remove regex.texinfo.
+ (realclean): remove it here.
+
+ * tregress.c (simple_test): initialize buf.buffer.
+
+Sun Jul 26 08:59:38 1992 Karl Berry (karl@hayley)
+
+ * regex.c (push_dummy_failure): new opcode and corresponding
+ case in the various routines. Pushed at the end of
+ alternatives.
+
+ * regex.c (jump_past_next_alt): rename to `jump_past_alt', for
+ brevity.
+ (no_pop_jump): rename to `jump'.
+
+ * regex.c (regex_compile) [DEBUG]: terminate printing of pattern
+ with a newline.
+
+ * NEWS: new file.
+
+ * tregress.c (simple_{compile,match,test}): routines to simplify all
+ these little tests.
+
+ * tregress.c: test for matching as much as possible.
+
+Fri Jul 10 06:53:32 1992 Karl Berry (karl@hayley)
+
+ * Version 0.8.
+
+Wed Jul 8 06:39:31 1992 Karl Berry (karl@hayley)
+
+ * regex.c (SIGN_EXTEND_CHAR): #undef any previous definition, as
+ ours should always work properly.
+
+Mon Jul 6 07:10:50 1992 Karl Berry (karl@hayley)
+
+ * iregex.c (main) [DEBUG]: conditionalize the call to
+ print_compiled_pattern.
+
+ * iregex.c (main): initialize buf.buffer to NULL.
+ * tregress.c (test_regress): likewise.
+
+ * regex.c (alloca) [sparc]: #if on HAVE_ALLOCA_H instead.
+
+ * tregress.c (test_regress): didn't have jla's test quite right.
+
+Sat Jul 4 09:02:12 1992 Karl Berry (karl@hayley)
+
+ * regex.c (re_match_2): only REGEX_ALLOCATE all the register
+ vectors if the pattern actually has registers.
+ (match_end): new variable to avoid having to use best_regend[0].
+
+ * regex.c (IS_IN_FIRST_STRING): rename to FIRST_STRING_P.
+
+ * regex.c: doc fixes.
+
+ * tregress.c (test_regress): new fastmap test forwarded by rms.
+
+ * tregress.c (test_regress): initialize the fastmap field.
+
+ * tregress.c (test_regress): new test from jla that aborted
+ in re_search_2.
+
+Fri Jul 3 09:10:05 1992 Karl Berry (karl@hayley)
+
+ * tregress.c (test_regress): add tests for translating charsets,
+ from kaoru.
+
+ * GNUmakefile (common): add alloca.o.
+ * alloca.c: new file, copied from bison.
+
+ * other.c (test_others): remove var `buf', since it's no longer used.
+
+ * Below changes from ro@TechFak.Uni-Bielefeld.DE.
+
+ * tregress.c (test_regress): initialize buf.allocated.
+
+ * regex.c (re_compile_fastmap): initialize `succeed_n_p'.
+
+ * GNUmakefile (regex): depend on $(common).
+
+Wed Jul 1 07:12:46 1992 Karl Berry (karl@hayley)
+
+ * Version 0.7.
+
+ * regex.c: doc fixes.
+
+Mon Jun 29 08:09:47 1992 Karl Berry (karl@fosse)
+
+ * regex.c (pop_failure_point): change string vars to
+ `const char *' from `unsigned char *'.
+
+ * regex.c: consolidate debugging stuff.
+ (print_partial_compiled_pattern): avoid enum clash.
+
+Mon Jun 29 07:50:27 1992 Karl Berry (karl@hayley)
+
+ * xmalloc.c: new file.
+ * GNUmakefile (common): add it.
+
+ * iregex.c (print_regs): new routine (from jimb).
+ (main): call it.
+
+Sat Jun 27 10:50:59 1992 Jim Blandy (jimb@pogo.cs.oberlin.edu)
+
+ * xregex.c (re_match_2): When we have accepted a match and
+ restored d from best_regend[0], we need to set dend
+ appropriately as well.
+
+Sun Jun 28 08:48:41 1992 Karl Berry (karl@hayley)
+
+ * tregress.c: rename from regress.c.
+
+ * regex.c (print_compiled_pattern): improve charset case to ease
+ byte-counting.
+ Also, don't distinguish between Emacs and non-Emacs
+ {not,}wordchar opcodes.
+
+ * regex.c (print_fastmap): move here.
+ * test.c: from here.
+ * regex.c (print_{{partial,}compiled_pattern,double_string}):
+ rename from ..._printer. Change calls here and in test.c.
+
+ * regex.c: create from xregex.c and regexinc.c for once and for
+ all, and change the debug fns to be extern, instead of static.
+ * GNUmakefile: remove traces of xregex.c.
+ * test.c: put in externs, instead of including regexinc.c.
+
+ * xregex.c: move interactive main program and scanstring to iregex.c.
+ * iregex.c: new file.
+ * upcase.c, printchar.c: new files.
+
+ * various doc fixes and other cosmetic changes throughout.
+
+ * regexinc.c (compiled_pattern_printer): change variable name,
+ for consistency.
+ (partial_compiled_pattern_printer): print other info about the
+ compiled pattern, besides just the opcodes.
+ * xregex.c (regex_compile) [DEBUG]: print the compiled pattern
+ when we're done.
+
+ * xregex.c (re_compile_fastmap): in the duplicate case, set
+ `can_be_null' and return.
+ Also, set `bufp->can_be_null' according to a new variable,
+ `path_can_be_null'.
+ Also, rewrite main while loop to not test `p != NULL', since
+ we never set it that way.
+ Also, eliminate special `can_be_null' value for the endline case.
+ (re_search_2): don't test for the special value.
+ * regex.h (struct re_pattern_buffer): remove the definition.
+
+Sat Jun 27 15:00:40 1992 Karl Berry (karl@hayley)
+
+ * xregex.c (re_compile_fastmap): remove the `RE_' from
+ `REG_RE_MATCH_NULL_AT_END'.
+ Also, assert the fastmap in the pattern buffer is non-null.
+ Also, reset `succeed_n_p' after we've
+ paid attention to it, instead of every time through the loop.
+ Also, in the `anychar' case, only clear fastmap['\n'] if the
+ syntax says to, and don't return prematurely.
+ Also, rearrange cases in some semblance of a rational order.
+ * regex.h (REG_RE_MATCH_NULL_AT_END): remove the `RE_' from the name.
+
+ * other.c: take bug reports from here.
+ * regress.c: new file for them.
+ * GNUmakefile (test): add it.
+ * main.c (main): new possible test.
+ * test.h (test_type): new value in enum.
+
+Thu Jun 25 17:37:43 1992 Karl Berry (karl@hayley)
+
+ * xregex.c (scanstring) [test]: new function from jimb to allow some
+ escapes.
+ (main) [test]: call it (on the string, not the pattern).
+
+ * xregex.c (main): make return type `int'.
+
+Wed Jun 24 10:43:03 1992 Karl Berry (karl@hayley)
+
+ * xregex.c (pattern_offset_t): change to `int', for the benefit
+ of patterns which compile to more than 2^15 bytes.
+
+ * xregex.c (GET_BUFFER_SPACE): remove spurious braces.
+
+ * xregex.texinfo (Using Registers): put in a stub to ``document''
+ the new function.
+ * regex.h (re_set_registers) [!__STDC__]: declare.
+ * xregex.c (re_set_registers): declare K&R style (also move to a
+ different place in the file).
+
+Mon Jun 8 18:03:28 1992 Jim Blandy (jimb@pogo.cs.oberlin.edu)
+
+ * regex.h (RE_NREGS): Doc fix.
+
+ * xregex.c (re_set_registers): New function.
+ * regex.h (re_set_registers): Declaration for new function.
+
+Fri Jun 5 06:55:18 1992 Karl Berry (karl@hayley)
+
+ * main.c (main): `return 0' instead of `exit (0)'. (From Paul Eggert)
+
+ * regexinc.c (SIGN_EXTEND_CHAR): cast to unsigned char.
+ (extract_number, EXTRACT_NUMBER): don't bother to cast here.
+
+Tue Jun 2 07:37:53 1992 Karl Berry (karl@hayley)
+
+ * Version 0.6.
+
+ * Change copyrights to `1985, 89, ...'.
+
+ * regex.h (REG_RE_MATCH_NULL_AT_END): new macro.
+ * xregex.c (re_compile_fastmap): initialize `can_be_null' to
+ `p==pend', instead of in the test at the top of the loop (as
+ it was, it was always being set).
+ Also, set `can_be_null'=1 if we would jump to the end of the
+ pattern in the `on_failure_jump' cases.
+ (re_search_2): check if `can_be_null' is 1, not nonzero. This
+ was the original test in rms' regex; why did we change this?
+
+ * xregex.c (re_compile_fastmap): rename `is_a_succeed_n' to
+ `succeed_n_p'.
+
+Sat May 30 08:09:08 1992 Karl Berry (karl@hayley)
+
+ * xregex.c (re_compile_pattern): declare `regnum' as `unsigned',
+ not `regnum_t', for the benefit of those patterns with more
+ than 255 groups.
+
+ * xregex.c: rename `failure_stack' to `fail_stack', for brevity;
+ likewise for `match_nothing' to `match_null'.
+
+ * regexinc.c (REGEX_REALLOCATE): take both the new and old
+ sizes, and copy only the old bytes.
+ * xregex.c (DOUBLE_FAILURE_STACK): pass both old and new.
+ * This change from Thorsten Ohl.
+
+Fri May 29 11:45:22 1992 Karl Berry (karl@hayley)
+
+ * regexinc.c (SIGN_EXTEND_CHAR): define as `(signed char) c'
+ instead of relying on __CHAR_UNSIGNED__, to work with
+ compilers other than GCC. From Per Bothner.
+
+ * main.c (main): change return type to `int'.
+
+Mon May 18 06:37:08 1992 Karl Berry (karl@hayley)
+
+ * regex.h (RE_SYNTAX_AWK): typo in RE_RE_UNMATCHED...
+
+Fri May 15 10:44:46 1992 Karl Berry (karl@hayley)
+
+ * Version 0.5.
+
+Sun May 3 13:54:00 1992 Karl Berry (karl@hayley)
+
+ * regex.h (struct re_pattern_buffer): now it's just `regs_allocated'.
+ (REGS_UNALLOCATED, REGS_REALLOCATE, REGS_FIXED): new constants.
+ * xregex.c (regexec, re_compile_pattern): set the field appropriately.
+ (re_match_2): and use it. bufp can't be const any more.
+
+Fri May 1 15:43:09 1992 Karl Berry (karl@hayley)
+
+ * regexinc.c: unconditionally include <sys/types.h>, first.
+
+ * regex.h (struct re_pattern_buffer): rename
+ `caller_allocated_regs' to `regs_allocated_p'.
+ * xregex.c (re_compile_pattern): same change here.
+ (regexec): and here.
+ (re_match_2): reallocate registers if necessary.
+
+Fri Apr 10 07:46:50 1992 Karl Berry (karl@hayley)
+
+ * regex.h (RE_SYNTAX{_POSIX,}_AWK): new definitions from Arnold.
+
+Sun Mar 15 07:34:30 1992 Karl Berry (karl at hayley)
+
+ * GNUmakefile (dist): versionize regex.{c,h,texinfo}.
+
+Tue Mar 10 07:05:38 1992 Karl Berry (karl at hayley)
+
+ * Version 0.4.
+
+ * xregex.c (PUSH_FAILURE_POINT): always increment the failure id.
+ (DEBUG_STATEMENT) [DEBUG]: execute the statement even if `debug'==0.
+
+ * xregex.c (pop_failure_point): if the saved string location is
+ null, keep the current value.
+ (re_match_2): at fail, test for a dummy failure point by
+ checking the restored pattern value, not string value.
+ (re_match_2): new case, `on_failure_keep_string_jump'.
+ (regex_compile): output this opcode in the .*\n case.
+ * regexinc.c (re_opcode_t): define the opcode.
+ (partial_compiled_pattern_printer): add the new case.
+
+Mon Mar 9 09:09:27 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (regex_compile): optimize .*\n to output an
+ unconditional jump to the ., instead of pushing failure points
+ each time through the loop.
+
+ * xregex.c (DOUBLE_FAILURE_STACK): compute the maximum size
+ ourselves (and correctly); change callers.
+
+Sun Mar 8 17:07:46 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (failure_stack_elt_t): change to `const char *', to
+ avoid warnings.
+
+ * regex.h (re_set_syntax): declare this.
+
+ * xregex.c (pop_failure_point) [DEBUG]: conditionally pass the
+ original strings and sizes; change callers.
+
+Thu Mar 5 16:35:35 1992 Karl Berry (karl at claude.cs.umb.edu)
+
+ * xregex.c (regnum_t): new type for register/group numbers.
+ (compile_stack_elt_t, regex_compile): use it.
+
+ * xregex.c (regexec): declare len as `int' to match re_search.
+
+ * xregex.c (re_match_2): don't declare p1 twice.
+
+ * xregex.c: change `while (1)' to `for (;;)' to avoid silly
+ compiler warnings.
+
+ * regex.h [__STDC__]: use #if, not #ifdef.
+
+ * regexinc.c (REGEX_REALLOCATE): cast the result of alloca to
+ (char *), to avoid warnings.
+
+ * xregex.c (regerror): declare variable as const.
+
+ * xregex.c (re_compile_pattern, re_comp): define as returning a const
+ char *.
+ * regex.h (re_compile_pattern, re_comp): likewise.
+
+Thu Mar 5 15:57:56 1992 Karl Berry (karl@hal)
+
+ * xregex.c (regcomp): declare `syntax' as unsigned.
+
+ * xregex.c (re_match_2): try to avoid compiler warnings about
+ unsigned comparisons.
+
+ * GNUmakefile (test-xlc): new target.
+
+ * regex.h (reg_errcode_t): remove trailing comma from definition.
+ * regexinc.c (re_opcode_t): likewise.
+
+Thu Mar 5 06:56:07 1992 Karl Berry (karl at hayley)
+
+ * GNUmakefile (dist): add version numbers automatically.
+ (versionfiles): new variable.
+ (regex.{c,texinfo}): don't add version numbers here.
+ * regex.h: put in placeholder instead of the version number.
+
+Fri Feb 28 07:11:33 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (re_error_msg): declare const, since it is.
+
+Sun Feb 23 05:41:57 1992 Karl Berry (karl at fosse)
+
+ * xregex.c (PAT_PUSH{,_2,_3}, ...): cast args to avoid warnings.
+ (regex_compile, regexec): return REG_NOERROR, instead
+ of 0, on success.
+ (boolean): define as char, and #define false and true.
+ * regexinc.c (STREQ): cast the result.
+
+Sun Feb 23 07:45:38 1992 Karl Berry (karl at hayley)
+
+ * GNUmakefile (test-cc, test-hc, test-pcc): new targets.
+
+ * regexinc.c (extract_number, extract_number_and_incr) [DEBUG]:
+ only define if we are debugging.
+
+ * xregex.c [_AIX]: do #pragma alloca first if necessary.
+ * regexinc.c [_AIX]: remove the #pragma from here.
+
+ * regex.h (reg_syntax_t): declare as unsigned, and redo the enum
+ as #define's again. Some compilers do stupid things with enums.
+
+Thu Feb 20 07:19:47 1992 Karl Berry (karl at hayley)
+
+ * Version 0.3.
+
+ * xregex.c, regex.h (newline_anchor_match_p): rename to
+ `newline_anchor'; dumb idea to change the name.
+
+Tue Feb 18 07:09:02 1992 Karl Berry (karl at hayley)
+
+ * regexinc.c: go back to original, i.e., don't include
+ <string.h> or define strchr.
+ * xregex.c (regexec): don't bother with adding characters after
+ newlines to the fastmap; instead, just don't use a fastmap.
+ * xregex.c (regcomp): set the buffer and fastmap fields to zero.
+
+ * xregex.texinfo (GNU r.e. compiling): have to initialize more
+ than two fields.
+
+ * regex.h (struct re_pattern_buffer): rename `newline_anchor' to
+ `newline_anchor_match_p', as we're back to two cases.
+ * xregex.c (regcomp, re_compile_pattern, re_comp): change
+ accordingly.
+ (re_match_2): at begline and endline, POSIX is not a special
+ case anymore; just check newline_anchor_match_p.
+
+Thu Feb 13 16:29:33 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (*empty_string*): rename to *null_string*, for brevity.
+
+Wed Feb 12 06:36:22 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (re_compile_fastmap): at endline, don't set fastmap['\n'].
+ (re_match_2): rewrite the begline/endline cases to take account
+ of the new field newline_anchor.
+
+Tue Feb 11 14:34:55 1992 Karl Berry (karl at hayley)
+
+ * regexinc.c [!USG etc.]: include <strings.h> and define strchr
+ as index.
+
+ * xregex.c (re_search_2): when searching backwards, declare `c'
+ as a char and use casts when using it as an array subscript.
+
+ * xregex.c (regcomp): if REG_NEWLINE, set
+ RE_HAT_LISTS_NOT_NEWLINE. Set the `newline_anchor' field
+ appropriately.
+ (regex_compile): compile [^...] as matching a \n according to
+ the syntax bit.
+ (regexec): if doing REG_NEWLINE stuff, compile a fastmap and add
+ characters after any \n's to the newline.
+ * regex.h (RE_HAT_LISTS_NOT_NEWLINE): new syntax bit.
+ (struct re_pattern_buffer): rename `posix_newline' to
+ `newline_anchor', define constants for its values.
+
+Mon Feb 10 07:22:50 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (re_compile_fastmap): combine the code at the top and
+ bottom of the loop, as it's essentially identical.
+
+Sun Feb 9 10:02:19 1992 Karl Berry (karl at hayley)
+
+ * xregex.texinfo (POSIX Translate Tables): remove this, as it
+ doesn't match the spec.
+
+ * xregex.c (re_compile_fastmap): if we finish off a path, go
+ back to the top (to set can_be_null) instead of returning
+ immediately.
+
+ * xregex.texinfo: changes from bob.
+
+Sat Feb 1 07:03:25 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (re_search_2): doc fix (from rms).
+
+Fri Jan 31 09:52:04 1992 Karl Berry (karl at hayley)
+
+ * xregex.texinfo (GNU Searching): clarify the range arg.
+
+ * xregex.c (re_match_2, at_endline_op_p): add extra parens to
+ get rid of GCC 2's (silly, IMHO) warning about && within ||.
+
+ * xregex.c (common_op_match_empty_string_p): use
+ MATCH_NOTHING_UNSET_VALUE, not -1.
+
+Thu Jan 16 08:43:02 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (SET_REGS_MATCHED): only set the registers from
+ lowest to highest.
+
+ * regexinc.c (MIN): new macro.
+ * xregex.c (re_match_2): only check min (num_regs,
+ regs->num_regs) when we set the returned regs.
+
+ * xregex.c (re_match_2): set registers after the first
+ num_regs to -1 before we return.
+
+Tue Jan 14 16:01:42 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (re_match_2): initialize max (RE_NREGS, re_nsub + 1)
+ registers (from rms).
+
+ * xregex.c, regex.h: don't abbreviate `19xx' to `xx'.
+
+ * regexinc.c [!emacs]: include <sys/types.h> before <unistd.h>.
+ (from ro@thp.Uni-Koeln.DE).
+
+Thu Jan 9 07:23:00 1992 Karl Berry (karl at hayley)
+
+ * xregex.c (*unmatchable): rename to `match_empty_string_p'.
+ (CAN_MATCH_NOTHING): rename to `REG_MATCH_EMPTY_STRING_P'.
+
+ * regexinc.c (malloc, realloc): remove prototypes, as they can
+ cause clashes (from rms).
+
+Mon Jan 6 12:43:24 1992 Karl Berry (karl at claude.cs.umb.edu)
+
+ * Version 0.2.
+
+Sun Jan 5 10:50:38 1992 Karl Berry (karl at hayley)
+
+ * xregex.texinfo: bring more or less up-to-date.
+ * GNUmakefile (regex.texinfo): generate from regex.h and
+ xregex.texinfo.
+ * include.awk: new file.
+
+ * xregex.c: change all calls to the fn extract_number_and_incr
+ to the macro.
+
+ * xregex.c (re_match_2) [emacs]: in at_dot, use PTR_CHAR_POS + 1,
+ instead of bf_* and sl_*. Cast d to unsigned char *, to match
+ the declaration in Emacs' buffer.h.
+ [emacs19]: in before_dot, at_dot, and after_dot, likewise.
+
+ * regexinc.c: unconditionally include <sys/types.h>.
+
+ * regexinc.c (alloca) [!alloca]: Emacs config files sometimes
+ define this, so don't define it if it's already defined.
+
+Sun Jan 5 06:06:53 1992 Karl Berry (karl at fosse)
+
+ * xregex.c (re_comp): fix type conflicts with regex_compile (we
+ haven't been compiling this).
+
+ * regexinc.c (SIGN_EXTEND_CHAR): use `__CHAR_UNSIGNED__', not
+ `CHAR_UNSIGNED'.
+
+ * regexinc.c (NULL) [!NULL]: define it (as zero).
+
+ * regexinc.c (extract_number): remove the temporaries.
+
+Sun Jan 5 07:50:14 1992 Karl Berry (karl at hayley)
+
+ * regex.h (regerror) [!__STDC__]: return a size_t, not a size_t *.
+
+ * xregex.c (PUSH_FAILURE_POINT, ...): declare `destination' as
+ `char *' instead of `void *', to match alloca declaration.
+
+ * xregex.c (regerror): use `size_t' for the intermediate values
+ as well as the return type.
+
+ * xregex.c (regexec): cast the result of malloc.
+
+ * xregex.c (regexec): don't initialize `private_preg' in the
+ declaration, as old C compilers can't do that.
+
+ * xregex.c (main) [test]: declare printchar void.
+
+ * xregex.c (assert) [!DEBUG]: define this to do nothing, and
+ remove #ifdef DEBUG's from around asserts.
+
+ * xregex.c (re_match_2): remove error message when not debugging.
+
+Sat Jan 4 09:45:29 1992 Karl Berry (karl at hayley)
+
+ * other.c: test the bizarre duplicate case in re_compile_fastmap
+ that I just noticed.
+
+ * test.c (general_test): don't test registers beyond the end of
+ correct_regs, as well as regs.
+
+ * xregex.c (regex_compile): at handle_close, don't assign to
+ *inner_group_loc if we didn't push a start_memory (because the
+ group number was too big). In fact, don't push or pop the
+ inner_group_offset in that case.
+
+ * regex.c: rename to xregex.c, since it's not the whole thing.
+ * regex.texinfo: likewise.
+ * GNUmakefile: change to match.
+
+ * regex.c [DEBUG]: only include <stdio.h> if debugging.
+
+ * regexinc.c (SIGN_EXTEND_CHAR) [CHAR_UNSIGNED]: if it's already
+ defined, don't redefine it.
+
+ * regex.c: define _GNU_SOURCE at the beginning.
+ * regexinc.c (isblank) [!isblank]: define it.
+ (isgraph) [!isgraph]: change conditional to this, and remove the
+ sequent stuff.
+
+ * regex.c (regex_compile): add `blank' character class.
+
+ * regex.c (regex_compile): don't use a uchar variable to loop
+ through all characters.
+
+ * regex.c (regex_compile): at '[', improve logic for checking
+ that we have enough space for the charset.
+
+ * regex.h (struct re_pattern_buffer): declare translate as
+ `char *' again. We only use it as an array subscript once, I think.
+
+ * regex.c (TRANSLATE): new macro to cast the data character
+ before subscripting.
+ (num_internal_regs): rename to `num_regs'.
+
+Fri Jan 3 07:58:01 1992 Karl Berry (karl at hayley)
+
+ * regex.h (struct re_pattern_buffer): declare `allocated' and
+ `used' as unsigned long, since these are never negative.
+
+ * regex.c (compile_stack_element): rename to compile_stack_elt_t.
+ (failure_stack_element): similarly.
+
+ * regexinc.c (TALLOC, RETALLOC): new macros to simplify
+ allocation of arrays.
+
+ * regex.h (re_*) [__STDC__]: don't declare string args unsigned
+ char *; that makes them incompatible with string constants.
+ (struct re_pattern_buffer): declare the pattern and translate
+ table as unsigned char *.
+ * regex.c (most routines): use unsigned char vs. char consistently.
+
+ * regex.h (re_compile_pattern): do not declare the length arg as
+ const.
+ * regex.c (re_compile_pattern): likewise.
+
+ * regex.c (POINTER_TO_REG): rename to `POINTER_TO_OFFSET'.
+
+ * regex.h (re_registers): declare `start' and `end' as
+ `regoff_t', instead of `int'.
+
+ * regex.c (regexec): if either of the malloc's for the register
+ information fail, return failure.
+
+ * regex.h (RE_NREGS): define this again, as 30 (from jla).
+ (RE_ALLOCATE_REGISTERS): remove this.
+ (RE_SYNTAX_*): remove it from definitions.
+ (re_pattern_buffer): remove `return_default_num_regs', add
+ `caller_allocated_regs'.
+ * regex.c (re_compile_pattern): clear no_sub and
+ caller_allocated_regs in the pattern.
+ (regcomp): set caller_allocated_regs.
+ (re_match_2): do all register allocation at the end of the
+ match; implement new semantics.
+
+ * regex.c (MAX_REGNUM): new macro.
+ (regex_compile): at handle_open and handle_close, if the group
+ number is too large, don't push the start/stop memory.
+
+Thu Jan 2 07:56:10 1992 Karl Berry (karl at hayley)
+
+ * regex.c (re_match_2): if the back reference is to a group that
+ never matched, then goto fail, not really_fail. Also, don't
+ test if the pattern can match the empty string. Why did we
+ ever do that?
+ (really_fail): this label no longer needed.
+
+ * regexinc.c [STDC_HEADERS]: use only this to test if we should
+ include <stdlib.h>.
+
+ * regex.c (DO_RANGE, regex_compile): translate in all cases
+ except the single character after a \.
+
+ * regex.h (RE_AWK_CLASS_HACK): rename to
+ RE_BACKSLASH_ESCAPE_IN_LISTS.
+ * regex.c (regex_compile): change use.
+
+ * regex.c (re_compile_fastmap): do not translate the characters
+ again; we already translated them at compilation. (From ylo@ngs.fi.)
+
+ * regex.c (re_match_2): in case for at_dot, invert sense of
+ comparison and find the character number properly. (From
+ worley@compass.com.)
+ (re_match_2) [emacs]: remove the cases for before_dot and
+ after_dot, since there's no way to specify them, and the code
+ is wrong (judging from this change).
+
+Wed Jan 1 09:13:38 1992 Karl Berry (karl at hayley)
+
+ * psx-{interf,basic,extend}.c, other.c: set `t' as the first
+ thing, so that if we run them in succession, general_test's
+ kludge to see if we're doing POSIX tests works.
+
+ * test.h (test_type): add `all_test'.
+ * main.c: add case for `all_test'.
+
+ * regexinc.c (partial_compiled_pattern_printer,
+ double_string_printer): don't print anything if we're passed null.
+
+ * regex.c (PUSH_FAILURE_POINT): do not scan for the highest and
+ lowest active registers.
+ (re_match_2): compute lowest/highest active regs at start_memory and
+ stop_memory.
+ (NO_{LOW,HIGH}EST_ACTIVE_REG): new sentinel values.
+ (pop_failure_point): return the lowest/highest active reg values
+ popped; change calls.
+
+ * regex.c [DEBUG]: include <assert.h>.
+ (various routines) [DEBUG]: change conditionals to assertions.
+
+ * regex.c (DEBUG_STATEMENT): new macro.
+ (PUSH_FAILURE_POINT): use it to increment num_regs_pushed.
+ (re_match_2) [DEBUG]: only declare num_regs_pushed if DEBUG.
+
+ * regex.c (*can_match_nothing): rename to *unmatchable.
+
+ * regex.c (re_match_2): at stop_memory, adjust argument reading.
+
+ * regex.h (re_pattern_buffer): declare `can_be_null' as a 2-bit
+ bit field.
+
+ * regex.h (re_pattern_buffer): declare `buffer' unsigned char *;
+ no, dumb idea. The pattern can have signed numbers.
+
+ * regex.c (re_match_2): in maybe_pop_jump case, skip over the
+ right number of args to the group operators, and don't do
+ anything with endline if posix_newline is not set.
+
+ * regex.c, regexinc.c (all the things we just changed): go back
+ to putting the inner group count after the start_memory,
+ because we need it in the on_failure_jump case in re_match_2.
+ But leave it after the stop_memory also, since we need it
+ there in re_match_2, and we don't have any way of getting back
+ to the start_memory.
+
+ * regexinc.c (partial_compiled_pattern_printer): adjust argument
+ reading for start/stop_memory.
+ * regex.c (re_compile_fastmap, group_can_match_nothing): likewise.
+
+Tue Dec 31 10:15:08 1991 Karl Berry (karl at hayley)
+
+ * regex.c (bits list routines): remove these.
+ (re_match_2): get the number of inner groups from the pattern,
+ instead of keeping track of it at start and stop_memory.
+ Put the count after the stop_memory, not after the
+ start_memory.
+ (compile_stack_element): remove `fixup_inner_group' member,
+ since we now put it in when we can compute it.
+ (regex_compile): at handle_open, don't push the inner group
+ offset, and at handle_close, don't pop it.
+
+ * regex.c (level routines): remove these, and their uses in
+ regex_compile. This was another manifestation of having to find
+ $'s that were endlines.
+
+ * regex.c (regexec): this does searching, not matching (a
+ well-disguised part of the standard). So rewrite to use
+ `re_search' instead of `re_match'.
+ * psx-interf.c (test_regexec): add tests to, uh, match.
+
+ * regex.h (RE_TIGHT_ALT): remove this; nobody uses it.
+ * regex.c: remove the code that was supposed to implement it.
+
+ * other.c (test_others): ^ and $ never match newline characters;
+ RE_CONTEXT_INVALID_OPS doesn't affect anchors.
+
+ * psx-interf.c (test_regerror): update for new error messages.
+
+ * psx-extend.c: it's now ok to have an alternative be just a $,
+ so remove all the tests which supposed that was invalid.
+
+Wed Dec 25 09:00:05 1991 Karl Berry (karl at hayley)
+
+ * regex.c (regex_compile): in handle_open, don't skip over ^ and
+ $ when checking for an empty group. POSIX has changed the
+ grammar.
+ * psx-extend.c (test_posix_extended): thus, move (^$) tests to
+ valid section.
+
+ * regexinc.c (boolean): move from here to test.h and regex.c.
+ * test files: declare verbose, omit_register_tests, and
+ test_should_match as boolean.
+
+ * psx-interf.c (test_posix_c_interface): remove the `c_'.
+ * main.c: likewise.
+
+ * psx-basic.c (test_posix_basic): ^ ($) is an anchor after
+ (before) an open (close) group.
+
+ * regex.c (re_match_2): in endline, correct precedence of
+ posix_newline condition.
+
+Tue Dec 24 06:45:11 1991 Karl Berry (karl at hayley)
+
+ * test.h: incorporate private-tst.h.
+ * test files: include test.h, not private-tst.h.
+
+ * test.c (general_test): set posix_newline to zero if we are
+ doing POSIX tests (unfortunately, it's difficult to call
+ regcomp in this case, which is what we should really be doing).
+
+ * regex.h (reg_syntax_t): make this an enumeration type which
+ defines the syntax bits; renames re_syntax_t.
+
+ * regex.c (at_endline_op_p): don't preincrement p; then if it's
+ not an empty string op, we lose.
+
+ * regex.h (reg_errcode_t): new enumeration type of the error
+ codes.
+ * regex.c (regex_compile): return that type.
+
+ * regex.c (regex_compile): in [, initialize
+ just_had_a_char_class to false; somehow I had changed this to
+ true.
+
+ * regex.h (RE_NO_CONSECUTIVE_REPEATS): remove this, since we
+ don't use it, and POSIX doesn't require this behavior anymore.
+ * regex.c (regex_compile): remove it from here.
+
+ * regex.c (regex_compile): remove the no_op insertions for
+ verify_and_adjust_endlines, since that doesn't exist anymore.
+
+ * regex.c (regex_compile) [DEBUG]: use printchar to print the
+ pattern, so unprintable bytes will print properly.
+
+ * regex.c: move re_error_msg back.
+ * test.c (general_test): print the compile error if the pattern
+ was invalid.
+
+Mon Dec 23 08:54:53 1991 Karl Berry (karl at hayley)
+
+ * regexinc.c: move re_error_msg here.
+
+ * regex.c (re_error_msg): the ``message'' for success must be
+ NULL, to keep the interface to re_compile_pattern the same.
+ (regerror): if the msg is null, use "Success".
+
+ * rename most test files for consistency. Change Makefile
+ correspondingly.
+
+ * test.c (most routines): add casts to (unsigned char *) when we
+ call re_{match,search}{,_2}.
+
+Sun Dec 22 09:26:06 1991 Karl Berry (karl at hayley)
+
+ * regex.c (re_match_2): declare string args as unsigned char *
+ again; don't declare non-pointer args const; declare the
+ pattern buffer const.
+ (re_match): likewise.
+ (re_search_2, re_search): likewise, except don't declare the
+ pattern const, since we make a fastmap.
+ * regex.h [__STDC__]: change prototypes.
+
+ * regex.c (regex_compile): return an error code, not a string.
+ (re_err_list): new table to map from error codes to string.
+ (re_compile_pattern): return an element of re_err_list.
+ (regcomp): don't test all the strings.
+ (regerror): just use the list.
+ (put_in_buffer): remove this.
+
+ * regex.c (equivalent_failure_points): remove this.
+
+ * regex.c (re_match_2): don't copy the string arguments into
+ non-const pointers. We never alter the data.
+
+ * regex.c (re_match_2): move assignment to `is_a_jump_n' out of
+ the main loop. Just initialize it right before we do
+ something with it.
+
+ * regex.[ch] (re_match_2): don't declare the int parameters const.
+
+Sat Dec 21 08:52:20 1991 Karl Berry (karl at hayley)
+
+ * regex.h (re_syntax_t): new type; declare to be unsigned
+ (previously we used int, but since we do bit operations on
+ this, unsigned is better, according to H&S).
+ (obscure_syntax, re_pattern_buffer): use that type.
+ * regex.c (re_set_syntax, regex_compile): likewise.
+
+ * regex.h (re_pattern_buffer): new field `posix_newline'.
+ * regex.c (re_comp, re_compile_pattern): set to zero.
+ (regcomp): set to REG_NEWLINE.
+ * regex.h (RE_HAT_LISTS_NOT_NEWLINE): remove this (we can just
+ check `posix_newline' instead.)
+
+ * regex.c (op_list_type, op_list, add_op): remove these.
+ (verify_and_adjust_endlines): remove this.
+ (pattern_offset_list_type, *pattern_offset* routines): and these.
+ These things all implemented the nonleading/nontrailing position
+ code, which was very long, had a few remaining problems, and
+ is no longer needed. So...
+
+ * regexinc.c (STREQ): new macro to abbreviate strcmp(,)==0, for
+ brevity. Change various places in regex.c to use it.
+
+ * regex{,inc}.c (enum regexpcode): change to a typedef
+ re_opcode_t, for brevity.
+
+ * regex.h (re_syntax_table) [SYNTAX_TABLE]: remove this; it
+ should only be in regex.c, I think, since we don't define it
+ in this case. Maybe it should be conditional on !SYNTAX_TABLE?
+
+ * regexinc.c (partial_compiled_pattern_printer): simplify and
+ distinguish the emacs/not-emacs (not)wordchar cases.
+
+Fri Dec 20 08:11:38 1991 Karl Berry (karl at hayley)
+
+ * regexinc.c (regexpcode) [emacs]: only define the Emacs opcodes
+ if we are ifdef emacs.
+
+ * regex.c (BUF_PUSH*): rename to PAT_PUSH*.
+
+ * regex.c (regex_compile): in $ case, go back to essentially the
+ original code for deciding endline op vs. normal char.
+ (at_endline_op_p): new routine.
+ * regex.h (RE_ANCHORS_ONLY_AT_ENDS, RE_CONTEXT_INVALID_ANCHORS,
+ RE_REPEATED_ANCHORS_AWAY, RE_NO_ANCHOR_AT_NEWLINE): remove
+ these. POSIX has simplified the rules for anchors in draft
+ 11.2.
+ (RE_NEWLINE_ORDINARY): new syntax bit.
+ (RE_CONTEXT_INDEP_ANCHORS): change description to be compatible
+ with POSIX.
+ * regex.texinfo (Syntax Bits): remove the descriptions.
+
+Mon Dec 16 08:12:40 1991 Karl Berry (karl at hayley)
+
+ * regex.c (re_match_2): in jump_past_next_alt, unconditionally
+ goto no_pop. The only register we were finding was one which
+ enclosed the whole alternative expression, not one around an
+ individual alternative. So we were never doing what we
+ thought we were doing, and this way makes (|a) against the
+ empty string fail.
+
+ * regex.c (regex_compile): remove `highest_ever_regnum', and
+ don't restore regnum from the stack; just put it into a
+ temporary to put into the stop_memory. Otherwise, groups
+ aren't numbered consecutively.
+
+ * regex.c (is_in_compile_stack): rename to
+ `group_in_compile_stack'; remove unnecessary test for the
+ stack being empty.
+
+ * regex.c (re_match_2): in on_failure_jump, skip no_op's before
+ checking for the start_memory, in case we were called from
+ succeed_n.
+
+Sun Dec 15 16:20:48 1991 Karl Berry (karl at hayley)
+
+ * regex.c (regex_compile): in duplicate case, use
+ highest_ever_regnum instead of regnum, since the latter is
+ reverted at stop_memory.
+
+ * regex.c (re_match_2): in on_failure_jump, if the * applied to
+ a group, save the information for that group and all inner
+ groups (by making it active), even though we're not inside it
+ yet.
+
+Sat Dec 14 09:50:59 1991 Karl Berry (karl at hayley)
+
+ * regex.c (PUSH_FAILURE_ITEM, POP_FAILURE_ITEM): new macros.
+ Use them instead of copying the stack manipulation code a zillion
+ times.
+
+ * regex.c (PUSH_FAILURE_POINT, pop_failure_point) [DEBUG]: save
+ and restore a unique identification value for each failure point.
+
+ * regexinc.c (partial_compiled_pattern_printer): don't print an
+ extra / after duplicate commands.
+
+ * regex.c (regex_compile): in back-reference case, allow a back
+ reference to register `regnum'. Otherwise, even `\(\)\1'
+ fails, since regnum is 1 at the back-reference.
+
+ * regex.c (re_match_2): in fail, don't examine the pattern if we
+ restored to pend.
+
+ * test_private.h: rename to private_tst.h. Change includes.
+
+ * regex.c (extend_bits_list): compute existing size for realloc
+ in bytes, not blocks.
+
+ * regex.c (re_match_2): in jump_past_next_alt, the for loop was
+ missing its (empty) statement. Even so, some register tests
+ still fail, although in a different way than in the previous change.
+
+Fri Dec 13 15:55:08 1991 Karl Berry (karl at hayley)
+
+ * regex.c (re_match_2): in jump_past_next_alt, unconditionally
+ goto no_pop, since we weren't properly detecting if the
+ alternative matched something anyway. No, we need to not jump
+ to keep the register values correct; just change to not look at
+ register zero and not test RE_NO_EMPTY_ALTS (which is a
+ compile-time thing).
+
+ * regex.c (SET_REGS_MATCHED): start the loop at 1, since we never
+ care about register zero until the very end. (I think.)
+
+ * regex.c (PUSH_FAILURE_POINT, pop_failure_point): go back to
+ pushing and popping the active registers, instead of only doing
+ the registers before a group: (fooq|fo|o)*qbar against fooqbar
+ fails, since we restore back into the middle of group 1, yet it
+ isn't active, because the previous restore clobbered the active flag.
+
+Thu Dec 12 17:25:36 1991 Karl Berry (karl at hayley)
+
+ * regex.c (PUSH_FAILURE_POINT): do not call
+ `equivalent_failure_points' after all; it causes the registers
+ to be ``wrong'' (according to POSIX), and an infinite loop on
+ `((a*)*)*' against `ab'.
+
+ * regex.c (re_compile_fastmap): don't push `pend' on the failure
+ stack.
+
+Tue Dec 10 10:30:03 1991 Karl Berry (karl at hayley)
+
+ * regex.c (PUSH_FAILURE_POINT): if pushing same failure point that
+ is on the top of the stack, fail.
+ (equivalent_failure_points): new routine.
+
+ * regex.c (re_match_2): add debug statements for every opcode we
+ execute.
+
+ * regex.c (regex_compile/handle_close): restore
+ `fixup_inner_group_count' and `regnum' from the stack.
+
+Mon Dec 9 13:51:15 1991 Karl Berry (karl at hayley)
+
+ * regex.c (PUSH_FAILURE_POINT): declare `this_reg' as int, so
+ unsigned arithmetic doesn't happen when we don't want to save
+ the registers.
+
+Tue Dec 3 08:11:10 1991 Karl Berry (karl at hayley)
+
+ * regex.c (extend_bits_list): divide size by bits/block.
+
+ * regex.c (init_bits_list): remove redundant assignment to
+ `bits_list_ptr'.
+
+ * regexinc.c (partial_compiled_pattern_printer): don't do *p++
+ twice in the same expr.
+
+ * regex.c (re_match_2): at on_failure_jump, use the correct
+ pattern positions for getting the stuff following the start_memory.
+
+ * regex.c (struct register_info): remove the bits_list for the
+ inner groups; make that a separate variable.
+
+Mon Dec 2 10:42:07 1991 Karl Berry (karl at hayley)
+
+ * regex.c (PUSH_FAILURE_POINT): don't pass `failure_stack' as an
+ arg; change callers.
+
+ * regex.c (PUSH_FAILURE_POINT): print items in order they are
+ pushed.
+ (pop_failure_point): likewise.
+
+ * regex.c (main): prompt for the pattern and string.
+
+ * regex.c (FREE_VARIABLES) [!REGEX_MALLOC]: declare as nothing;
+ remove #ifdefs from around calls.
+
+ * regex.c (extract_number, extract_number_and_incr): declare static.
+
+ * regex.c: remove the canned main program.
+ * main.c: new file.
+ * Makefile (COMMON): add main.o.
+
+Tue Sep 24 06:26:51 1991 Kathy Hargreaves (kathy at fosse)
+
+ * regex.c (re_match_2): Made `pend' and `dend' not register variables.
+ Only set string2 to string1 if string1 isn't null.
+ Send address of p, d, regstart, regend, and reg_info to
+ pop_failure_point.
+ Put in more debug statements.
+
+ * regex.c [debug]: Added global variable.
+ (DEBUG_*PRINT*): Only print if `debug' is true.
+ (DEBUG_DOUBLE_STRING_PRINTER): Changed DEBUG_STRING_PRINTER's
+ name to this.
+ Changed some comments.
+ (PUSH_FAILURE_POINT): Moved and added some debugging statements.
+ Was saving regstart on the stack twice instead of saving both
+ regstart and regend; remedied this.
+ [NUM_REGS_ITEMS]: Changed from 3 to 4, as now save lowest and
+ highest active registers instead of highest used one.
+ [NUM_NON_REG_ITEMS]: Changed name of NUM_OTHER_ITEMS to this.
+ (NUM_FAILURE_ITEMS): Use active registers instead of number 0
+ through highest used one.
+ (re_match_2): Have pop_failure_point put things in the variables.
+ (pop_failure_point): Have it do what the fail case in re_match_2
+ did with the failure stack, instead of throwing away the stuff
+ popped off. re_match_2 can ignore results when it doesn't
+ need them.
+
+
+Thu Sep 5 13:23:28 1991 Kathy Hargreaves (kathy at fosse)
+
+ * regex.c (banner): Changed copyright years to be separate.
+
+ * regex.c [CHAR_UNSIGNED]: Put __ at both ends of this name.
+ [DEBUG, debug_count, *debug_p, DEBUG_PRINT_1, DEBUG_PRINT_2,
+ DEBUG_COMPILED_PATTERN_PRINTER, DEBUG_STRING_PRINTER]:
+ defined these for debugging.
+ (extract_number): Added this (debuggable) routine version of
+ the macro EXTRACT_NUMBER. Ditto for EXTRACT_NUMBER_AND_INCR.
+ (re_compile_pattern): Set return_default_num_regs if the
+ syntax bit RE_ALLOCATE_REGISTERS is set.
+ [REGEX_MALLOC]: Renamed USE_ALLOCA to this.
+ (BUF_POP): Got rid of this, as don't ever use it.
+ (regex_compile): Made the type of `pattern' not be register.
+ If DEBUG, print the pattern to compile.
+ (re_match_2): If had a `$' in the pattern before a `^' then
+ don't record the `^' as an anchor.
+ Put (enum regexpcode) before references to b, as suggested
+ [RE_NO_BK_BRACES]: Changed RE_NO_BK_CURLY_BRACES to this.
+ (remove_pattern_offset): Removed this unused routine.
+ (PUSH_FAILURE_POINT): Changed to only save active registers.
+ Put in debugging statements.
+ (re_compile_fastmap): Made `pattern' not a register variable.
+ Use routine for extracting numbers instead of macro.
+ (re_match_2): Made `p', `mcnt' and `mcnt2' not register variables.
+ Added `num_regs_pushed' for debugging.
+ Only malloc registers if the syntax bit RE_ALLOCATE_REGISTERS is set.
+ Put in debug statements.
+ Put the macro NOTE_INNER_GROUP's code inline, as it was only
+ called in one place.
+ For debugging, extract numbers using routines instead of macros.
+ In case fail: only restore pushed active registers, and added
+ debugging statements.
+ (pop_failure_point): Test for underfull stack.
+ (group_can_match_nothing, common_op_can_match_nothing): For
+ debugging, extract numbers using routines instead of macros.
+ (regexec): Changed formal parameters to not be prototypes.
+ Don't initialize `regs' or `private_preg' in their declarations.
+
+Tue Jul 23 18:38:36 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h [RE_CONTEXT_INDEP_OPS]: Moved the anchor stuff out of
+ this bit.
+ [RE_UNMATCHED_RIGHT_PAREN_ORD]: Defined this bit.
+ [RE_CONTEXT_INVALID_ANCHORS]: Defined this bit.
+ [RE_CONTEXT_INDEP_ANCHORS]: Defined this bit.
+ Added RE_CONTEXT_INDEP_ANCHORS to all syntaxes which had
+ RE_CONTEXT_INDEP_OPS.
+ Took RE_ANCHORS_ONLY_AT_ENDS out of the POSIX basic syntax.
+ Added RE_UNMATCHED_RIGHT_PAREN_ORD to the POSIX extended
+ syntax.
+ Took RE_REPEATED_ANCHORS_AWAY out of the POSIX extended syntax.
+ Defined REG_NOERROR (which will probably have to go away again).
+ Changed the type `off_t' to `regoff_t'.
+
+ * regex.c: Changed some comments.
+ (regex_compile): Added variable `had_an_endline' to keep track
+ of if hit a `$' since the beginning of the pattern or the last
+ alternative (if any).
+ Changed RE_CONTEXT_INVALID_OPS and RE_CONTEXT_INDEP_OPS to
+ RE_CONTEXT_INVALID_ANCHORS and RE_CONTEXT_INDEP_ANCHORS where
+ appropriate.
+ Put a `no_op' in the pattern if a repeat is only zero or one
+ times; in this case and if it is many times (whereupon a jump
+ backwards is pushed instead), keep track of the operator for
+ verify_and_adjust_endlines.
+ If RE_UNMATCHED_RIGHT_PAREN_ORD is set, make an unmatched
+ close-group operator match `)'.
+ Changed all error exits to exit (1).
+ (remove_pattern_offset): Added this routine, but don't use it.
+ (verify_and_adjust_endlines): At top of routine, if initialize
+ routines run out of memory, return true after setting
+ enough_memory false.
+ At end of endline, et al. case, don't set *p to no_op.
+ Repetition operators also set the level and active groups'
+ match statuses, unless RE_REPEATED_ANCHORS_AWAY is set.
+ (get_group_match_status): Put a return in front of call to get_bit.
+ (re_compile_fastmap): Changed is_a_succeed_n to a boolean.
+ If at end of pattern, then if the failure stack isn't empty,
+ go back to the failure point.
+ In *jump* case, only pop the stack if what's on top of it is
+ where we've just jumped to.
+ (re_search_2): Return -2 instead of val if val is -2.
+ (group_can_match_nothing, alternative_can_match_nothing,
+ common_op_can_match_nothing): Now pass in reg_info for the
+ `duplicate' case.
+ (re_match_2): Don't skip over the next alternative also if
+ empty alternatives aren't allowed.
+ In fail case, if failed to a backwards jump that's part of a
+ repetition loop, pop the current failure point and use the
+ next one.
+ (pop_failure_point): Check that there's as many register items
+ on the failure stack as the stack says there are.
+ (common_op_can_match_nothing): Added variables `ret' and
+ `reg_no' so can set reg_info for the group encountered.
+ Also break without doing anything if hit a no_op or the other
+ kinds of `endline's.
+ If not done already, set reg_info in start_memory case.
+ Put in no_pop_jump for an optimized succeed_n of zero repetitions.
+ In succeed_n case, if the number isn't zero, then return false.
+ Added `duplicate' case.
+
+Sat Jul 13 11:27:38 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (REG_NOERROR): Added this error code definition.
+
+ * regex.c: Took some redundant parens out of macros.
+ (enum regexpcode): Added jump_past_next_alt.
+ Wrapped some macros in `do..while (0)'.
+ Changed some comments.
+ (regex_compile): Use `fixup_alt_jump' instead of `fixup_jump'.
+ Use `maybe_pop_jump' instead of `maybe_pop_failure_jump'.
+ Use `jump_past_next_alt' instead of `no_pop_jump' when at the
+ end of an alternative.
+ (re_match_2): Used REGEX_ALLOCATE for the registers stuff.
+ In stop_memory case: Add more boolean tests to see if the
+ group is in a loop.
+ Added jump_past_next_alt case, which doesn't jump over the
+ next alternative if the last one didn't match anything.
+ Unfortunately, to make this work with, e.g., `(a+?*|b)*'
+ against `bb', I also had to pop the alternative's failure
+ point, which in turn broke backtracking!
+ In fail case: Detect a dummy failure point by looking at
+ failure_stack.avail - 2, not stack[-2].
+ (pop_failure_point): Only pop if the stack isn't empty; don't
+ give an error if it is. (Not sure yet this is correct.)
+ (group_can_match_nothing): Make it return a boolean instead of int.
+ Make it take an argument indicating the end of where it should look.
+ If find a group that can match nothing, set the pointer
+ argument to past the group in the pattern.
+ Took out cases which can share with alternative_can_match_nothing
+ and call common_op_can_match_nothing.
+ Took ++ out of switch, so could call common_op_can_match_nothing.
+ Wrote lots more for on_failure_jump case to handle alternatives.
+ Main loop now doesn't look for matching stop_memory, but
+ rather the argument END; return true if hit the matching
+ stop_memory; this way can call itself for inner groups.
+ (alternative_can_match_nothing): Added for alternatives.
+ (common_op_can_match_nothing): Added for previous two routines'
+ common operators.
+ (regerror): Returns a message saying there's no error if gets
+ sent REG_NOERROR.
+
+Wed Jul 3 10:43:15 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c: Removed unnecessary enclosing parens from several macros.
+ Put `do..while (0)' around a few.
+ Corrected some comments.
+ (INIT_FAILURE_STACK_SIZE): Deleted in favor of using
+ INIT_FAILURE_ALLOC.
+ (INIT_FAILURE_STACK, DOUBLE_FAILURE_STACK, PUSH_PATTERN_OP,
+ PUSH_FAILURE_POINT): Made routines of the same name (but with all
+ lowercase letters) into these macros, so could use `alloca'
+ when USE_ALLOCA is defined. The reason is stated below for
+ bits lists. Deleted analogous routines.
+ (re_compile_fastmap): Added variable void *destination for
+ PUSH_PATTERN_OP.
+ (re_match_2): Added variable void *destination for REGEX_REALLOCATE.
+ Used the failure stack macros in place of the routines.
+ Detected a dummy failure point by inspecting the failure stack's
+ (avail - 2)th element, not failure_stack.stack[-2]. This bug
+ arose when used the failure stack macros instead of the routines.
+
+ * regex.c [USE_ALLOCA]: Put this conditional around previous
+ alloca stuff and defined these to work differently depending
+ on whether or not USE_ALLOCA is defined:
+ (REGEX_ALLOCATE): Uses either `alloca' or `malloc'.
+ (REGEX_REALLOCATE): Uses either `alloca' or `realloc'.
+ (INIT_BITS_LIST, EXTEND_BITS_LIST, SET_BIT_TO_VALUE): Defined
+ macro versions of routines with the same name (only with all
+ lowercase letters) so could use `alloca' in re_match_2. This
+ is to prevent core leaks when C-g is used in Emacs and to make
+ things faster and avoid storage fragmentation. These things
+ have to be macros because the results of `alloca' go away with
+ the routine by which it's called.
+ (BITS_BLOCK_SIZE, BITS_BLOCK, BITS_MASK): Moved to above the
+ above-mentioned macros instead of before the routines defined
+ below regex_compile.
+ (set_bit_to_value): Compacted some code.
+ (reg_info_type): Changed inner_groups field to be bits_list_type
+ so could be arbitrarily long and thus handle arbitrary nesting.
+ (NOTE_INNER_GROUP): Put `do...while (0)' around it so could
+ use as a statement.
+ Changed code to use bits lists.
+ Added variable void *destination for REGEX_REALLOCATE (whose call
+ is several levels in).
+ Changed variable name of `this_bit' to `this_reg'.
+ (FREE_VARIABLES): Only define and use if USE_ALLOCA is defined.
+ (re_match_2): Use REGEX_ALLOCATE instead of malloc.
+ Instead of setting INNER_GROUPS of reg_info to zero, have to
+ use INIT_BITS_LIST and return -2 (and free variables if
+ USE_ALLOCA isn't defined) if it fails.
+
+Fri Jun 28 13:45:07 1991 Karl Berry (karl at hayley)
+
+ * regex.c (re_match_2): set value of `dend' when we restore `d'.
+
+ * regex.c: remove declaration of alloca.
+
+ * regex.c (MISSING_ISGRAPH): rename to `ISGRAPH_MISSING'.
+
+ * regex.h [_POSIX_SOURCE]: remove these conditionals; always
+ define POSIX stuff.
+ * regex.c (_POSIX_SOURCE): change conditionals to use `POSIX'
+ instead.
+
+Sat Jun 1 16:56:50 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.*: Changed RE_CONTEXTUAL_* to RE_CONTEXT_*,
+ RE_TIGHT_VBAR to RE_TIGHT_ALT, RE_NEWLINE_OR to
+ RE_NEWLINE_ALT, and RE_DOT_MATCHES_NEWLINE to RE_DOT_NEWLINE.
+
+Wed May 29 09:24:11 1991 Karl Berry (karl at hayley)
+
+ * regex.texinfo (POSIX Pattern Buffers): cross-reference the
+ correct node name (Match-beginning-of-line, not ..._line).
+ (Syntax Bits): put @code around all syntax bits.
+
+Sat May 18 16:29:58 1991 Karl Berry (karl at hayley)
+
+ * regex.c (global): add casts to keep broken compilers from
+ complaining about malloc and realloc calls.
+
+ * regex.c (isgraph) [MISSING_ISGRAPH]: change test to this,
+ instead of `#ifndef isgraph', since broken compilers can't
+ have both a macro and a symbol by the same name.
+
+ * regex.c (re_comp, re_exec) [_POSIX_SOURCE]: do not define.
+ (regcomp, regfree, regexec, regerror) [_POSIX_SOURCE && !emacs]:
+ only define in this case.
+
+Mon May 6 17:37:04 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (re_search, re_search_2): Changed BUFFER to not be const.
+
+ * regex.c (re_compile_pattern): `^' is in a leading position if
+ it precedes a newline.
+ (various routines): Added or changed header comments.
+ (double_pattern_offsets_list): Changed name from
+ `extend_pattern_offsets_list'.
+ (adjust_pattern_offsets_list): Changed return value from
+ unsigned to void.
+ (verify_and_adjust_endlines): Now returns `true' and `false'
+ instead of 1 and 0.
+ `$' is in a leading position if it follows a newline.
+ (set_bit_to_value, get_bit_value): Exit with error if POSITION < 0
+ so now calling routines don't have to.
+ (init_failure_stack, inspect_failure_stack_top,
+ pop_failure_stack_top, push_pattern_op, double_failure_stack):
+ Now return value unsigned instead of boolean.
+ (re_search, re_search_2): Changed BUFP to not be const.
+ (re_search_2): Added variable const `private_bufp' to send to
+ re_match_2.
+ (push_failure_point): Made return value unsigned instead of boolean.
+
+Sat May 4 15:32:22 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (re_compile_fastmap): Added extern for this.
+ Changed some comments.
+
+ * regex.c (re_compile_pattern): In case handle_bar: put invalid
+ pattern test before levels matching stuff.
+ Changed some comments.
+ Added optimizing test for detecting an empty alternative that
+ ends with a trailing '$' at the end of the pattern.
+ (re_compile_fastmap): Moved failure_stack stuff to before this
+ so could use it. Made its stack dynamic.
+ Made it return an int so that it could return -2 if its stack
+ couldn't be allocated.
+ Added to header comment (about the return values).
+ (init_failure_stack): Wrote so both re_match_2 and
+ re_compile_fastmap could use similar stacks.
+ (double_failure_stack): Added for above reasons.
+ (push_pattern_op): Wrote for re_compile_fastmap.
+ (re_search_2): Now return -2 if re_compile_fastmap does.
+ (re_match_2): Made regstart and regend type failure_stack_element*.
+ (push_failure_point): Made pattern_place and string_place type
+ failure_stack_element*.
+ Call double_failure_stack now.
+ Return true instead of 1.
+
+Wed May 1 12:57:21 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (remove_intervening_anchors): Avoid erroneously making
+ ops into no_op's by making them no_op only when they're beglines.
+ (verify_and_adjust_endlines): Don't make '$' a normal character
+ if it's before a newline.
+ Look for the endline op in *p, not p[1].
+ (failure_stack_element): Added this declaration.
+ (failure_stack_type): Added this declaration.
+ (INIT_FAILURE_STACK_SIZE, FAILURE_STACK_EMPTY,
+ FAILURE_STACK_PTR_EMPTY, REMAINING_AVAIL_SLOTS): Added for
+ failure stack.
+ (FAILURE_ITEM_SIZE, PUSH_FAILURE_POINT): Deleted.
+ (FREE_VARIABLES): Now free failure_stack.stack instead of stackb.
+ (re_match_2): Deleted variables `initial_stack', `stackb',
+ `stackp', and `stacke' and added `failure_stack' to replace them.
+ Replaced calls to PUSH_FAILURE_POINT with those to
+ push_failure_point.
+ (push_failure_point): Added for re_match_2.
+ (pop_failure_point): Rewrote to use a failure_stack_type of stack.
+ (can_match_nothing): Moved definition to below re_match_2.
+ (bcmp_translate): Moved definition to below re_match_2.
+
+Mon Apr 29 14:20:54 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (enum regexpcode): Added codes endline_before_newline
+ and repeated_endline_before_newline so could detect these
+ types of endlines in the intermediate stages of a compiled
+ pattern.
+ (INIT_FAILURE_ALLOC): Renamed NFAILURES to this and set it to 5.
+ (BUF_PUSH): Put `do {...} while 0' around this.
+ (BUF_PUSH_2): Defined this to cut down on expansion of EXTEND_BUFFER.
+ (regex_compile): Changed some comments.
+ Now push endline_before_newline if find a `$' before a newline
+ in the pattern.
+ If a `$' might turn into an ordinary character, set laststart
+ to point to it.
+ In '^' case, if syntax bit RE_TIGHT_VBAR is set, then for `^'
+ to be in a leading position, it must be first in the pattern.
+ Don't have to check in one of the else clauses that it's not set.
+ If RE_CONTEXTUAL_INDEP_OPS isn't set but RE_ANCHORS_ONLY_AT_ENDS
+ is, make '^' a normal character if it isn't first in the pattern.
+ Can only detect at the end if a '$' after an alternation op is a
+ trailing one, so can't immediately detect empty alternatives
+ if a '$' follows a vbar.
+ Added a picture of the ``success jumps'' in alternatives.
+ Have to set bufp->used before calling verify_and_adjust_endlines.
+ Also do it before returning all error strings.
+ (remove_intervening_anchors): Now replaces the anchor with
+ repeated_endline_before_newline if it's an endline_before_newline.
+ (verify_and_adjust_endlines): Deleted SYNTAX parameter (could
+ use bufp's) and added GROUP_FORWARD_MATCH_STATUS so could
+ detect back references referring to empty groups.
+ Added variable `bend' to point past the end of the pattern buffer.
+ Added variable `previous_p' so wouldn't have to reinspect the
+ pattern buffer to see what op we just looked at.
+ Added endline_before_newline and repeated_endline_before_newline
+ cases.
+ When checking if in a trailing position, added case where '$'
+ has to be at the pattern's end if either of the syntax bits
+ RE_ANCHORS_ONLY_AT_ENDS or RE_TIGHT_VBAR are set.
+ Since `endline' can have the intermediate form `endline_in_repeat',
+ have to change it to `endline' if RE_REPEATED_ANCHORS_AWAY
+ isn't set.
+ Now disallow empty alternatives with trailing endlines in them
+ if RE_NO_EMPTY_ALTS is set.
+ Now don't make '$' an ordinary character if it precedes a newline.
+ Don't make it an ordinary character if it's before a newline.
+ Back references now affect the level matching something only if
+ they refer to nonempty groups.
+ (can_match_nothing): Now increment p1 in the switch, which
+ changes many of the cases, but makes the code more like what
+ it was derived from.
+ Adjust the return statement to reflect above.
+ (struct register_info): Made `can_match_nothing' field an int
+ instead of a bit so could have -1 in it if never set.
+ (MAX_FAILURE_ITEMS): Changed name from MAX_NUM_FAILURE_ITEMS.
+ (FAILURE_ITEM_SIZE): Defined how much space a failure item uses.
+ (PUSH_FAILURE_POINT): Changed variable `last_used_reg's name
+ to `highest_used_reg'.
+ Added variable `num_stack_items' and changed `len's name to
+ `stack_length'.
+ Test failure stack limit in terms of number of items in it, not
+ in terms of its length. rms' fix tested length against number
+ of items, which was a misunderstanding.
+ Use `realloc' instead of `alloca' to extend the failure stack.
+ Use shifts instead of multiplying by 2.
+ (FREE_VARIABLES): Free `stackb' instead of `initial_stack', as
+ it may have been reallocated.
+ (re_match_2): When mallocing `initial_stack', now multiply
+ the number of items wanted (what was there before) by
+ FAILURE_ITEM_SIZE.
+ (pop_failure_point): Need this procedure form of the macro of
+ the same name for debugging, so left it in and deleted the
+ macro.
+ (recomp): Don't free the pattern buffer's translate field.
+
+Mon Apr 15 09:47:47 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_DUP_MAX): Moved to outside of #ifdef _POSIX_SOURCE.
+ * regex.c (#include <sys/types.h>): Removed #ifdef _POSIX_SOURCE
+ condition.
+ (malloc, realloc): Made return type void* #ifdef __STDC__.
+ (enum regexpcode): Added endline_in_repeat for the compiler's
+ use; this never ends up on the final compiled pattern.
+ (INIT_PATTERN_OFFSETS_LIST_SIZE): Initial size for
+ pattern_offsets_list_type.
+ (pattern_offset_type): Type for pattern offsets.
+ (pattern_offsets_list_type): Type for keeping a list of
+ pattern offsets.
+ (anchor_list_type): Changed to above type.
+ (PATTERN_OFFSETS_LIST_PTR_FULL): Tests if a pattern offsets
+ list is full.
+ (ANCHOR_LIST_PTR_FULL): Changed to above.
+ (BIT_BLOCK_SIZE): Changed to BITS_BLOCK_SIZE and moved to
+ above bits list routines below regex_compile.
+ (op_list_type): Defined to be pattern_offsets_list_type.
+ (compile_stack_type): Changed offsets to be
+ pattern_offset_type instead of unsigned.
+ (pointer): Changed the name of all structure fields from this
+ to `avail'.
+ (COMPILE_STACK_FULL): Changed so the stack is full if `avail'
+ is equal to `size' instead of `size' - 1.
+ (GET_BUFFER_SPACE): Changed `>=' to `>' in the while statement.
+ (regex_compile): Added variable `enough_memory' so could check
+ that routine that verifies '$' positions could return an
+ allocation error.
+ (group_count): Deleted this variable, as `regnum' already does
+ this work.
+ (op_list): Added this variable to keep track of operations
+ needed for verifying '$' positions.
+ (anchor_list): Now initialize using routine
+ `init_pattern_offsets_list'.
+ Consolidated the three bits_list initializations.
+ In case '$': Instead of trying to go past constructs which can
+ follow '$', merely detect the special case where it has to be
+ at the pattern's end, fix up any fixup jumps if necessary,
+ record the anchor if necessary and add an `endline' (and
+ possibly two `no-op's) to the pattern; will call a routine at
+ the end to verify if it's in a valid position or not.
+ (init_pattern_offsets_list): Added to initialize pattern
+ offsets lists.
+ (extend_anchor_list): Renamed this extend_pattern_offsets_list
+ and renamed parameters and internal variables appropriately.
+ (add_pattern_offset): Added this routine which both
+ record_anchor_position and add_op call.
+ (adjust_pattern_offsets_list): Add this routine to adjust by
+ some increment all the pattern offsets in a list of such after a
+ given position.
+ (record_anchor_position): Now send in offset instead of
+ calculating it and just call add_pattern_offset.
+ (adjust_anchor_list): Replaced by above routine.
+ (remove_intervening_anchors): If the anchor is an `endline'
+ then replace it with `endline_in_repeat' instead of `no_op'.
+ (add_op): Added this routine to call in regex_compile
+ wherever push something relevant to verifying '$' positions.
+ (verify_and_adjust_endlines): Added routine to (1) verify that
+ '$'s in a pattern buffer (represented by `endline') were in
+ valid positions and (2) whether or not they were anchors.
+ (BITS_BLOCK_SIZE): Renamed BIT_BLOCK_SIZE and moved to right
+ above bits list routines.
+ (BITS_BLOCK): Defines which array element of a bits list the
+ bit corresponding to a given position is in.
+ (BITS_MASK): Has a 1 where the bit (in a bit list array element)
+ for a given position is.
+
+Mon Apr 1 12:09:06 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (BIT_BLOCK_SIZE): Defined this for using with
+ bits_list_type, abstracted from level_list_type so could use
+ for more things than just the level match status.
+ (regex_compile): Renamed `level_list' variable to
+ `level_match_status'.
+ Added variable `group_match_status' of type bits_list_type.
+ Kept track of whether or not for all groups any of them
+ matched other than the empty string, so detect if a back
+ reference in front of a '^' made it nonleading or not.
+ Do this by setting a match status bit for all active groups
+ whenever leave a group that matches other than the empty string.
+ Could detect which groups are active by going through the
+ stack each time, but or-ing a bits list of active groups with
+ a bits list of group match status is faster, so make a bits
+ list of active groups instead.
+ Have to check that '^' isn't in a leading position before
+ going to normal_char.
+ Whenever set level match status of the current level, also set
+ the match status of all active groups.
+ Increase the group count and make that group active whenever
+ open a group.
+ When close a group, only set the next level down if the
+ current level matches other than the empty string, and make
+ the current group inactive.
+ At a back reference, only set a level's match status if the
+ group to which the back reference refers matches other than
+ the empty string.
+ (init_bits_list): Added to initialize a bits list.
+ (get_level_value): Deleted this. (Made into
+ get_level_match_status.)
+ (extend_bits_list): Added to extend a bits list. (Made this
+ from deleted routine `extend_level_list'.)
+ (get_bit): Added to get a bit value from a bits list. (Made
+ this from deleted routine `get_level_value'.)
+ (set_bit_to_value): Added to set a bit in a bits list. (Made
+ this from deleted routine `set_level_value'.)
+ (get_level_match_status): Added this to get the match status
+ of a given level. (Made from get_level_value.)
+ (set_this_level, set_next_lower_level): Made all routines
+ which set bits extend the bits list if necessary, thus they
+ now return an unsigned value to indicate whether or not the
+ reallocation failed.
+ (increase_level): No longer extends the level list.
+ (make_group_active): Added to mark as active a given group in
+ an active groups list.
+ (make_group_inactive): Added to mark as inactive a given group
+ in an active groups list.
+ (set_match_status_of_active_groups): Added to set the match
+ status of all currently active groups.
+ (get_group_match_status): Added to get a given group's match status.
+ (no_levels_match_anything): Removed the parameter LEVEL.
+ (PUSH_FAILURE_POINT): Added rms' bug fix and changed RE_NREGS
+ to num_internal_regs.
+
+Sun Mar 31 09:04:30 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_ANCHORS_ONLY_AT_ENDS): Added syntax so could
+ constrain '^' and '$' to only be anchors if at the beginning
+ and end of the pattern.
+ (RE_SYNTAX_POSIX_BASIC): Added the above bit.
+
+ * regex.c (enum regexpcode): Changed `unused' to `no_op'.
+ (this_and_lower_levels_match_nothing): Deleted forward reference.
+ (regex_compile): case '^': if the syntax bit RE_ANCHORS_ONLY_AT_ENDS
+ is set, then '^' is only an anchor if at the beginning of the
+ pattern; only record anchor position if the syntax bit
+ RE_REPEATED_ANCHORS_AWAY is set; the '^' is a normal char if
+ the syntax bit RE_ANCHORS_ONLY_AT_ENDS is set and we're not at
+ the beginning of the pattern (and neither RE_CONTEXTUAL_INDEP_OPS
+ nor RE_CONTEXTUAL_INDEP_OPS syntax bits are set).
+ Only adjust the anchor list if the syntax bit
+ RE_REPEATED_ANCHORS_AWAY is set.
+
+ * regex.c (level_list_type): Use to detect when '^' is
+ in a leading position.
+ (regex_compile): Added level_list_type level_list variable in
+ which we keep track of whether or not a grouping level (in its
+ current or most recent incarnation) matches anything besides the
+ empty string. Set the bit for the i-th level when detect it
+ should match something other than the empty string and the bit
+ for the (i-1)-th level when leave the i-th group. Clear all
+ bits for the i-th and higher levels if none of 0--(i - 1)-th's
+ bits are set when encounter an alternation operator on that
+ level. If no levels are set when hit a '^', then it is in a
+ leading position. We keep track of which level we're at by
+ increasing a variable current_level whenever we encounter an
+ open-group operator and decreasing it whenever we encounter a
+ close-group operator.
+ Have to adjust the anchor list contents whenever insert
+ something ahead of them (such as on_failure_jump's) in the
+ pattern.
+ (adjust_anchor_list): Adjusts the offsets in an anchor list by
+ a given increment starting at a given start position.
+ (get_level_value): Returns the bit setting of a given level.
+ (set_level_value): Sets the bit of a given level to a given value.
+ (set_this_level): Sets (to 1) the bit of a given level.
+ (set_next_lower_level): Sets (to 1) the bit of (LEVEL - 1) for a
+ given LEVEL.
+ (clear_this_and_higher_levels): Clears the bits for a given
+ level and any higher levels.
+ (extend_level_list): Adds sizeof(unsigned) more bits to a level list.
+ (increase_level): Increases by 1 the value of a given level variable.
+ (decrease_level): Decreases by 1 the value of a given level variable.
+ (lower_levels_match_nothing): Checks if any levels lower than
+ the given one match anything.
+ (no_levels_match_anything): Checks if any levels match anything.
+ (re_match_2): At case wordbeg: before looking at d-1, check that
+ we're not at the string's beginning.
+ At case wordend: Added some illuminating parentheses.
+
+Mon Mar 25 13:58:51 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_NO_ANCHOR_AT_NEWLINE): Changed syntax bit name
+ from RE_ANCHOR_NOT_NEWLINE because an anchor never matches the
+ newline itself, just the empty string either before or after it.
+ (RE_REPEATED_ANCHORS_AWAY): Added this syntax bit for ignoring
+ anchors inside groups which are operated on by repetition
+ operators.
+ (RE_DOT_MATCHES_NEWLINE): Added this bit so the match-any-character
+ operator could match a newline when it's set.
+ (RE_SYNTAX_POSIX_BASIC): Set RE_DOT_MATCHES_NEWLINE in this.
+ (RE_SYNTAX_POSIX_EXTENDED): Set RE_DOT_MATCHES_NEWLINE and
+ RE_REPEATED_ANCHORS_AWAY in this.
+ (regerror): Changed prototypes to new POSIX spec.
+
+ * regex.c (anchor_list_type): Added so could null out anchors inside
+ repeated groups.
+ (ANCHOR_LIST_PTR_FULL): Added for above type.
+ (compile_stack_element): Changed name from stack_element.
+ (compile_stack_type): Changed name from compile_stack.
+ (INIT_COMPILE_STACK_SIZE): Changed name from INIT_STACK_SIZE.
+ (COMPILE_STACK_EMPTY): Changed name from STACK_EMPTY.
+ (COMPILE_STACK_FULL): Changed name from STACK_FULL.
+ (regex_compile): Changed SYNTAX parameter to non-const.
+ Changed variable name `stack' to `compile_stack'.
+ If syntax bit RE_REPEATED_ANCHORS_AWAY is set, then naively put
+ anchors in a list when encounter them and then set them to
+ `unused' when detect they are within a group operated on by a
+ repetition operator. Need something more sophisticated than
+ this, as they should only get set to `unused' if they are in
+ positions where they would be anchors. Also need a better way to
+ detect contextually invalid anchors.
+ Changed some comments.
+ (is_in_compile_stack): Changed name from `is_in_stack'.
+ (extend_anchor_list): Added to do anchor stuff.
+ (record_anchor_position): Added to do anchor stuff.
+ (remove_intervening_anchors): Added to do anchor stuff.
+ (re_match_2): Now match a newline with the match-any-character
+ operator if RE_DOT_MATCHES_NEWLINE is set.
+ Compacted some code.
+ (regcomp): Added new POSIX newline information to the header
+ comment.
+ If REG_NEWLINE cflag is set, then now unset RE_DOT_MATCHES_NEWLINE
+ in syntax.
+ (put_in_buffer): Added to do new POSIX regerror spec. Called
+ by regerror.
+ (regerror): Changed to take a pattern buffer, error buffer and
+ its size, and return type `size_t', the size of the full error
+ message, and the first ERRBUF_SIZE - 1 characters of the full
+ error message in the error buffer.
+
+Wed Feb 27 16:38:33 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (#include <sys/types.h>): Removed this as new POSIX
+ standard has the user include it.
+ (RE_SYNTAX_POSIX_BASIC and RE_SYNTAX_POSIX_EXTENDED): Removed
+ RE_HAT_LISTS_NOT_NEWLINE as new POSIX standard has the cflag
+ REG_NEWLINE now set this. Similarly, added syntax bit
+ RE_ANCHOR_NOT_NEWLINE as this is now unset by REG_NEWLINE.
+ (RE_SYNTAX_POSIX_BASIC): Removed syntax bit
+ RE_NO_CONSECUTIVE_REPEATS as POSIX now allows them.
+
+ * regex.c (#include <sys/types.h>): Added this as new POSIX
+ standard has the user include it instead of us putting it in
+ regex.h.
+ (extern char *re_syntax_table): Made into an extern so the
+ user could allocate it.
+ (DO_RANGE): If don't find a range end, now goto invalid_range_end
+ instead of unmatched_left_bracket.
+ (regex_compile): Made variable SYNTAX non-const.????
+ Reformatted some code.
+ (re_compile_fastmap): Moved is_a_succeed_n's declaration to
+ inner braces.
+ Compacted some code.
+ (SET_NEWLINE_FLAG): Removed and put inline.
+ (regcomp): Made variable `syntax' non-const so can unset
+ RE_ANCHOR_NOT_NEWLINE syntax bit if cflag REG_NEWLINE is set.
+ If cflag REG_NEWLINE is set, set the RE_HAT_LISTS_NOT_NEWLINE
+ syntax bit and unset RE_ANCHOR_NOT_NEWLINE one of `syntax'.
+
+Wed Feb 20 16:33:38 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_NO_CONSECUTIVE_REPEATS): Changed name from
+ RE_NO_CONSEC_REPEATS.
+ (REG_ENESTING): Deleted this POSIX return value, as the stack
+ is now unbounded.
+ (struct re_pattern_buffer): Changed some comments.
+ (re_compile_pattern): Changed a comment.
+ Deleted check on stack upper bound and corresponding error.
+ Now when there's no interval contents and it's the end of the
+ pattern, go to unmatched_left_curly_brace instead of end_of_pattern.
+ Removed nesting_too_deep error, as the stack is now unbounded.
+ (regcomp): Removed REG_ENESTING case, as the stack is now unbounded.
+ (regerror): Removed REG_ENESTING case, as the stack is now unbounded.
+
+ * regex.c (MAX_STACK_SIZE): Deleted because don't need upper
+ bound on array indexed with an unsigned number.
+
+Sun Feb 17 15:50:24 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: Changed and added some comments.
+
+ * regex.c (init_syntax_once): Made `_' a word character.
+ (re_compile_pattern): Added a comment.
+ (re_match_2): Redid header comment.
+ (regexec): With header comment about PMATCH, corrected and
+ removed details found in regex.h, adding a reference.
+
+Fri Feb 15 09:21:31 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (DO_RANGE): Removed argument parentheses.
+ Now get untranslated range start and end characters and set
+ list bits for the translated (if at all) versions of them and
+ all characters between them.
+ (re_match_2): Now use regs->num_regs instead of num_regs_wanted
+ wherever possible.
+ (regcomp): Now build case-fold translate table using isupper
+ and tolower facilities so will work on foreign language characters.
+
+Sat Feb 9 16:40:03 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_HAT_LISTS_NOT_NEWLINE): Changed syntax bit name
+ from RE_LISTS_NOT_NEWLINE as it only affects nonmatching lists.
+ Changed all references to the match-beginning-of-string
+ operator to match-beginning-of-line operator, as this is what
+ it does.
+ (RE_NO_CONSEC_REPEATS): Added this syntax bit.
+ (RE_SYNTAX_POSIX_BASIC): Added above bit to this.
+ (REG_PREMATURE_END): Changed name to REG_EEND.
+ (REG_EXCESS_NESTING): Changed name to REG_ENESTING.
+ (REG_TOO_BIG): Changed name to REG_ESIZE.
+ (REG_INVALID_PREV_RE): Deleted this return POSIX value.
+ Added and changed some comments.
+
+ * regex.c (re_compile_pattern): Now sets the pattern buffer's
+ `return_default_num_regs' field.
+ (typedef struct stack_element, stack_type, INIT_STACK_SIZE,
+ MAX_STACK_SIZE, STACK_EMPTY, STACK_FULL): Added for regex_compile.
+ (INIT_BUF_SIZE): Changed value from 28 to 32.
+ (BUF_PUSH): Changed name from BUFPUSH.
+ (MAX_BUF_SIZE): Added so could use in many places.
+ (IS_CHAR_CLASS_STRING): Replaced is_char_class with this.
+ (regex_compile): Added a stack which could grow dynamically
+ and which has struct elements.
+ Go back to initializing `zero_times_ok' and `many_times_ok' to
+ 0 and |=ing them inside the loop.
+ Now disallow consecutive repetition operators if the syntax
+ bit RE_NO_CONSEC_REPEATS is set.
+ Now detect trailing backslash when the compiler is expecting a
+ `?' or a `+'.
+ Changed calls to GET_BUFFER_SPACE which asked for 6 to ask for
+ 3, as that's all they needed.
+ Now check for trailing backslash inside lists.
+ Now disallow an empty alternative right before an end-of-line
+ operator.
+ Now get buffer space before leaving space for a fixup jump.
+ Now check if at pattern end when at open-interval operator.
+ Added some comments.
+ Now check if non-interval repetition operators follow an
+ interval one if the syntax bit RE_NO_CONSEC_REPEATS is set.
+ Now only check if what precedes an interval repetition
+ operator isn't a regular expression which matches one
+ character if the syntax bit RE_NO_CONSEC_REPEATS is set.
+ Now return "Unmatched [ or [^" instead of "Unmatched [".
+ (is_in_stack): Added to check if a given register number is in
+ the stack.
+ (re_match_2): If initial variable allocations fail, return -2,
+ instead of -1.
+ Now set reg's `num_regs' field when allocating regs.
+ Now before allocating them, free regs->start and end if they
+ aren't NULL and return -2 if either allocation fails.
+ Now use regs->num_regs instead of num_regs_wanted to control
+ regs loops.
+ Now increment past the newline when matching it with an
+ end-of-line operator.
+ (regcomp): Added to the header comment.
+ Now return REG_ESUBREG if regex_compile returns "Unmatched [
+ or [^" instead of doing so if it returns "Unmatched [".
+ Now return REG_BADRPT if in addition to returning "Missing
+ preceding regular expression", regex_compile returns "Invalid
+ preceding regular expression".
+ Now return new return value names (see regex.h changes).
+ (regexec): Added to header comment.
+ Initialize regs structure.
+ Now match whole string.
+ Now always free regs.start and regs.end instead of just when
+ the string matched.
+ (regerror): Now return "Regex error: Unmatched [ or [^.\n"
+ instead of "Regex error: Unmatched [.\n".
+ Now return "Regex error: Preceding regular expression either
+ missing or not simple.\n" instead of "Regex error: Missing
+ preceding regular expression.\n".
+ Removed REG_INVALID_PREV_RE case (it got subsumed into the
+ REG_BADRPT case).
+
+Thu Jan 17 09:52:35 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: Changed a comment.
+
+ * regex.c: Changed and added large header comments.
+ (re_compile_pattern): Now if detect that `laststart' for an
+ interval points to a byte code for a regular expression which
+ matches more than one character, make it an internal error.
+ (regerror): Return error message, don't print it.
+
+Tue Jan 15 15:32:49 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (regcomp return codes): Added GNU ones.
+ Updated some comments.
+
+ * regex.c (DO_RANGE): Changed `obscure_syntax' to `syntax'.
+ (regex_compile): Added `following_left_brace' to keep track of
+ where pseudo interval following a valid interval starts.
+ Changed some instances that returned "Invalid regular
+ expression" to instead return error strings coinciding with
+ POSIX error codes.
+ Changed some comments.
+ Now consider only things between `[:' and `:]' to be possible
+ character class names.
+ Now a character class expression can't end a pattern; at
+ least a `]' must close the list.
+ Now if the syntax bit RE_NO_BK_CURLY_BRACES is set, then a
+ valid interval must be followed by yet another to get an error
+ for preceding an interval (in this case, the second one) with
+ a regular expression that matches more than one character.
+ Now if what follows a valid interval begins with an open
+ interval operator but doesn't begin a valid interval, then set
+ following_left_brace to it, put it in C and go to
+ normal_char label.
+ Added some comments.
+ Return "Invalid character class name" instead of "Invalid
+ character class".
+ (regerror): Return messages for all POSIX error codes except
+ REG_ECOLLATE and REG_ENEWLINE, along with all GNU error codes.
+ Added `break's after all cases.
+ (main): Call re_set_syntax instead of setting `obscure_syntax'
+ directly.
+
+Sat Jan 12 13:37:59 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (Copyright): Updated date.
+ (#include <sys/types.h>): Include unconditionally.
+ (RE_CANNOT_MATCH_NEWLINE): Deleted this syntax bit.
+ (RE_SYNTAX_POSIX_BASIC, RE_SYNTAX_POSIX_EXTENDED): Removed
+ setting the RE_ANCHOR_NOT_NEWLINE syntax bit from these.
+ Changed and added some comments.
+ (struct re_pattern_buffer): Changed some flags from chars to bits.
+ Added field `syntax'; holds which syntax pattern was compiled with.
+ Added bit flag `return_default_num_regs'.
+ (externs for GNU and Berkeley UNIX routines): Added `const's to
+ parameter types to be compatible with POSIX.
+ (#define const): Added to support old C compilers.
+
+ * regex.c (Copyright): Updated date.
+ (enum regexpcode): Deleted `newline'.
+ (regex_compile): Renamed re_compile_pattern to this, added a
+ syntax parameter so it can set the pattern buffer's `syntax'
+ field.
+ Made `pattern', and `size' `const's so could pass to POSIX
+ interface routines; also made `const' whatever interval
+ variables had to be to make this work.
+ Changed references to `obscure_syntax' to new parameter `syntax'.
+ Deleted putting `newline' in buffer when see `\n'.
+ Consider invalid character classes which have nothing wrong
+ except the character class name; if so, return character-class error.
+ (is_char_class): Added routine for regex_compile.
+ (re_compile_pattern): added a new one which calls
+ regex_compile with `obscure_syntax' as the actual parameter
+ for the formal `syntax'.
+ Gave this the old routine's header comments.
+ Made `pattern', and `size' `const's so could use POSIX interface
+ routine parameters.
+ (re_search, re_search_2, re_match, re_match_2): Changed
+ `pbufp' to `bufp'.
+ (re_search_2, re_match_2): Changed `mstop' to `stop'.
+ (re_search, re_search_2): Made all parameters except `regs'
+ `const's so could use POSIX interface routines parameters.
+ (re_search_2): Added private copies of `const' parameters so
+ could change their values.
+ (re_match_2): Made all parameters except `regs' `const's so
+ could use POSIX interface routines parameters.
+ Changed `size1' and `size2' parameters to `size1_arg' and
+ `size2_arg' and so could change; added local `size1' and
+ `size2' and set to these.
+ Added some comments.
+ Deleted `newline' case.
+ `begline' can also possibly match if `d' contains a newline;
+ if it does, we have to increment d to point past the newline.
+ Replaced references to `obscure_syntax' with `bufp->syntax'.
+ (re_comp, re_exec): Made parameter `s' a `const' so could use POSIX
+ interface routines parameters.
+ Now call regex_compile, passing `obscure_syntax' via the
+ `syntax' parameter.
+ (re_exec): Made local `len' a `const' so could pass to re_search.
+ (regcomp): Added header comment.
+ Added local `syntax' to set and pass to regex_compile rather
+ than setting global `obscure_syntax' and passing it.
+ Call regex_compile with its `syntax' parameter rather than
+ re_compile_pattern.
+ Return REG_ECTYPE if character-class error.
+ (regexec): Don't initialize `regs' to anything.
+ Made `private_preg' a nonpointer so could set to what the
+ constant `preg' points.
+ Initialize `private_preg's `return_default_num_regs' field to
+ zero because want to return `nmatch' registers, not however
+ many there are subexpressions in the pattern.
+ Also test if `nmatch' > 0 to see if should pass re_match `regs'.
+
+Tue Jan 8 15:57:17 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (struct re_pattern_buffer): Reworded comment.
+
+ * regex.c (EXTEND_BUFFER): Also reset beg_interval.
+ (re_search_2): Return val if val = -2.
+ (NUM_REG_ITEMS): Listed items in comment.
+ (NUM_OTHER_ITEMS): Defined this for using in > 1 definition.
+ (MAX_NUM_FAILURE_ITEMS): Replaced `+ 2' with NUM_OTHER_ITEMS.
+ (NUM_FAILURE_ITEMS): As with definition above and added to
+ comment.
+ (PUSH_FAILURE_POINT): Replaced `* 2's with `<< 1's.
+ (re_match_2): Test for equality with 1 to see if pbufp->bol and
+ pbufp->eol are set.
+
+Fri Jan 4 15:07:22 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (struct re_pattern_buffer): Reordered some fields.
+ Updated some comments.
+ Added not_bol and not_eol fields.
+ (extern regcomp, regexec, regerror): Added return types.
+ (extern regfree): Added `extern'.
+
+ * regex.c (min): Deleted unused macro.
+ (re_match_2): Compacted some code.
+ Removed call to macro `min' from `for' loop.
+ Fixed so unused registers get filled with -1's.
+ Fail if the pattern buffer's `not_bol' field is set and
+ encounter a `begline'.
+ Fail if the pattern buffer's `not_eol' field is set and
+ encounter an `endline'.
+ Deleted redundant check for empty stack in fail case.
+ Don't free pattern buffer's components in re_comp.
+ (regexec): Initialize variable regs.
+ Added `private_preg' pattern buffer so could set `not_bol' and
+ `not_eol' fields and hand to re_match.
+ Deleted naive attempt to detect anchors.
+ Set private pattern buffer's `not_bol' and `not_eol' fields
+ according to eflags value.
+ `nmatch' must also be > 0 for us to bother allocating
+ registers to send to re_match and filling pmatch
+ with their results after the call to re_match.
+ Send private pattern buffer instead of argument to re_match.
+ If use the registers, always free them and then set them to NULL.
+ (regerror): Added this Posix routine.
+ (regfree): Added this Posix routine.
+
+Tue Jan 1 15:02:45 1991 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_NREGS): Deleted this definition, as now the user
+ can choose how many registers to have.
+ (REG_NOTBOL, REG_NOTEOL): Defined these Posix eflag bits.
+ (REG_NOMATCH, REG_BADPAT, REG_ECOLLATE, REG_ECTYPE,
+ REG_EESCAPE, REG_ESUBREG, REG_EBRACK, REG_EPAREN, REG_EBRACE,
+ REG_BADBR, REG_ERANGE, REG_ESPACE, REG_BADRPT, REG_ENEWLINE):
+ Defined these return values for Posix's regcomp and regexec.
+ Updated some comments.
+ (struct re_pattern_buffer): Now typedef this as regex_t
+ instead of the other way around.
+ (struct re_registers): Added num_regs field. Made start and
+ end fields pointers to char instead of fixed size arrays.
+ (regmatch_t): Added this Posix register type.
+ (regcomp, regexec, regerror, regfree): Added externs for these
+ Posix routines.
+
+ * regex.c (enum boolean): Typedefed this.
+ (re_pattern_buffer): Reformatted some comments.
+ (re_compile_pattern): Updated some comments.
+ Always push start_memory and its attendant number whenever
+ encounter a group, not just when its number is less than the
+ previous maximum number of registers; same for stop_memory.
+ Get 4 bytes of buffer space instead of 2 when pushing a
+ set_number_at.
+ (can_match_nothing): Added this to elaborate on and replace
+ code in re_match_2.
+ (reg_info_type): Made can_match_nothing field a bit instead of int.
+ (MIN): Added for re_match_2.
+ (re_match_2 macros): Changed all `for' loops which used
+ RE_NREGS to now use num_internal_regs as upper bounds.
+ (MAX_NUM_FAILURE_ITEMS): Use num_internal_regs instead of RE_NREGS.
+ (POP_FAILURE_POINT): Added check for empty stack.
+ (FREE_VARIABLES): Added this to free (and set to NULL)
+ variables allocated in re_match_2.
+ (re_match_2): Rearranged parameters to be in order.
+ Added variables num_regs_wanted (how many registers the user wants)
+ and num_internal_regs (how many groups there are).
+ Allocated initial_stack, regstart, regend, old_regstart,
+ old_regend, reginfo, best_regstart, and best_regend---all of
+ which used to be fixed size arrays. Free them all and return
+ -1 if any fail.
+ Free above variables if starting position pos isn't valid.
+ Changed all `for' loops which used RE_NREGS to now use
+ num_internal_regs as upper bounds---except for the loops which
+ fill regs; then use num_regs_wanted.
+ Allocate regs if the user has passed it and wants more than 0
+ registers filled.
+ Set regs->start[i] and regs->end[i] to -1 if either
+ regstart[i] or regend[i] equals -1, not just the first.
+ Free allocated variables before returning.
+ Updated some comments.
+ (regcomp): Return REG_ESPACE, REG_BADPAT, REG_EPAREN when
+ appropriate.
+ Free translate array.
+ (regexec): Added this Posix interface routine.
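As a rough illustration of how the four Posix routines named in this entry fit together, the sketch below compiles a pattern, runs it once, and reports the offsets of the whole match. The wrapper name `match_once` and its parameter list are hypothetical; only `regcomp`, `regexec`, `regerror`, `regfree`, `regmatch_t`, and the `REG_*` constants come from the standard `<regex.h>` interface, and the code assumes a present-day Posix system rather than the 1991 sources.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: compile PATTERN as an extended regex, run it
   on TEXT with EFLAGS (e.g. REG_NOTBOL), and store the offsets of the
   whole match in *WHOLE.  Returns 0 on a match, REG_NOMATCH if there
   is none, or the regcomp error code if the pattern is invalid; in
   that last case a message is written to ERRBUF.  */
static int
match_once (const char *pattern, const char *text, int eflags,
            regmatch_t *whole, char *errbuf, size_t errbuf_size)
{
  regex_t preg;
  int status;

  assert (pattern != NULL && text != NULL && whole != NULL);

  status = regcomp (&preg, pattern, REG_EXTENDED);
  if (status != 0)
    {
      /* Compilation failed, so there is nothing to free.  */
      regerror (status, &preg, errbuf, errbuf_size);
      return status;
    }

  /* nmatch is 1: we only ask for register 0, the whole match.  */
  status = regexec (&preg, text, 1, whole, eflags);
  regfree (&preg);
  return status;
}
```

Calling `match_once ("^ab+", "abbc", REG_NOTBOL, ...)` shows the effect of the eflag bits: with REG_NOTBOL set, the start of the string no longer counts as the beginning of a line, so the anchored pattern does not match.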
+
+Mon Dec 24 14:21:13 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: If _POSIX_SOURCE is defined then #include <sys/types.h>.
+ Added syntax bit RE_CANNOT_MATCH_NEWLINE.
+ Defined Posix cflags: REG_EXTENDED, REG_NEWLINE, REG_ICASE, and
+ REG_NOSUB.
+ Added fields re_nsub and no_sub to struct re_pattern_buffer.
+ Typedefed regex_t to be `struct re_pattern_buffer'.
+
+ * regex.c (CHAR_SET_SIZE): Defined this to be 256 and replaced
+ instances of this value with this constant.
+ (re_compile_pattern): Added switch case for `\n' and put
+ `newline' into the pattern buffer when encounter this.
+ Increment the pattern_buffer's `re_nsub' field whenever open a
+ group.
+ (re_match_2): Match a newline with `newline'---provided the
+ syntax bit RE_CANNOT_MATCH_NEWLINE isn't set.
+ (regcomp): Added this Posix interface routine.
+ (enum test_type): Added interface_test tag.
+ (main): Added Posix interface test.
+
+Tue Dec 18 12:58:12 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (struct re_pattern_buffer): reformatted so would fit
+ in texinfo documentation.
+
+Thu Nov 29 15:49:16 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_NO_EMPTY_ALTS): Added this bit.
+ (RE_SYNTAX_POSIX_EXTENDED): Added above bit.
+
+ * regex.c (re_compile_pattern): Disallow empty alternatives only
+ when RE_NO_EMPTY_ALTS is set, not when RE_CONTEXTUAL_INVALID_OPS is.
+ Changed RE_NO_BK_CURLY_BRACES to RE_NO_BK_PARENS when testing
+ for empty groups at label handle_open.
+ At label handle_bar: disallow empty alternatives if RE_NO_EMPTY_ALTS
+ is set.
+ Rewrote some comments.
+
+ (re_compile_fastmap): cleaned up code.
+
+ (re_search_2): Rewrote comment.
+
+ (struct register_info): Added field `inner_groups'; it records
+ which groups are inside of the current one.
+ Added field can_match_nothing; it's set if the current group
+ can match nothing.
+ Added field ever_matched_something; it's set if current group
+ ever matched something.
+
+ (INNER_GROUPS): Added macro to access inner_groups field of
+ struct register_info.
+
+ (CAN_MATCH_NOTHING): Added macro to access can_match_nothing
+ field of struct register_info.
+
+ (EVER_MATCHED_SOMETHING): Added macro to access
+ ever_matched_something field of struct register_info.
+
+ (NOTE_INNER_GROUP): Defined macro to record that a given group
+ is inside of all currently active groups.
+
+ (re_match_2): Added variables *p1 and mcnt2 (multipurpose).
+ Added old_regstart and old_regend arrays to hold previous
+ register values if they need be restored.
+ Initialize added fields and variables.
+ case start_memory: Find out if the group can match nothing.
+ Save previous register values in old_regstart and old_regend.
+ Record that current group is inside of all currently active
+ groups.
+ If the group is inside a loop and it ever matched anything,
+ restore its registers to values before the last failed match.
+ Restore the registers for the inner groups, too.
+ case duplicate: Can back reference to a group that never
+ matched if it can match nothing.
+
+Thu Nov 29 11:12:54 1990 Karl Berry (karl at hayley)
+
+ * regex.c (bcopy, ...): define these if either _POSIX_SOURCE or
+ STDC_HEADERS is defined; same for including <stdlib.h>.
+
+Sat Oct 6 16:04:55 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (struct re_pattern_buffer): Changed field comments.
+
+ * regex.c (re_compile_pattern): Allow a `$' to precede an
+ alternation operator (`|' or `\|').
+ Disallow `^' and/or `$' in empty groups if the syntax bit
+ RE_NO_EMPTY_GROUPS is set.
+ Wait until have parsed a valid `\{...\}' interval expression
+ before testing RE_CONTEXTUAL_INVALID_OPS to see if it's
+ invalidated by that.
+ Don't use RE_NO_BK_CURLY_BRACES to test whether or not a validly
+ parsed interval expression is invalid if it has no preceding re;
+ rather, use RE_CONTEXTUAL_INVALID_OPS.
+ If an interval parses, but there is no preceding regular
+ expression, yet the syntax bit RE_CONTEXTUAL_INDEP_OPS is set,
+ then that interval can match the empty regular expression; if
+ the bit isn't set, then the characters in the interval
+ expression are parsed as themselves (sans the backslashes).
+ In unfetch_interval case: Moved PATFETCH to above the test for
+ RE_NO_BK_CURLY_BRACES being set, which would force a goto
+ normal_backslash; the code at both normal_backsl and normal_char
+ expect a character in `c.'
+
+Sun Sep 30 11:13:48 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: Changed some comments to use the terms used in the
+ documentation.
+ (RE_CONTEXTUAL_INDEP_OPS): Changed name from `RE_CONTEXT_INDEP_OPS'.
+ (RE_LISTS_NOT_NEWLINE): Changed name from `RE_HAT_NOT_NEWLINE.'
+ (RE_ANCHOR_NOT_NEWLINE): Added this syntax bit.
+ (RE_NO_EMPTY_GROUPS): Added this syntax bit.
+ (RE_NO_HYPHEN_RANGE_END): Deleted this syntax bit.
+ (RE_SYNTAX_...): Reformatted.
+ (RE_SYNTAX_POSIX_BASIC, RE_SYNTAX_EXTENDED): Added syntax bits
+ RE_ANCHOR_NOT_NEWLINE and RE_NO_EMPTY_GROUPS, and deleted
+ RE_NO_HYPHEN_RANGE_END.
+ (RE_SYNTAX_POSIX_EXTENDED): Added syntax bit RE_DOT_NOT_NULL.
+
+ * regex.c (bcopy, bcmp, bzero): Define if _POSIX_SOURCE is defined.
+ (_POSIX_SOURCE): ifdef this, #include <stdlib.h>
+ (#ifdef emacs): Changed comment of the #endif for its #else
+ clause to be `not emacs', not `emacs.'
+ (no_pop_jump): Changed name from `jump'.
+ (pop_failure_jump): Changed name from `finalize_jump.'
+ (maybe_pop_failure_jump): Changed name from `maybe_finalize_jump'.
+ (no_pop_jump_n): Changed name from `jump_n.'
+ (EXTEND_BUFFER): Use shift instead of multiplication to double
+ buf->allocated.
+ (DO_RANGE, recompile_pattern): Added macro to set the list bits
+ for a range.
+ (re_compile_pattern): Fixed grammar problems in some comments.
+ Checked that RE_NO_BK_VBAR is set to make `$' valid before a `|'
+ and not set to make it valid before a `\|'.
+ Checked that RE_NO_BK_PARENS is set to make `$' valid before a ')'
+ and not set to make it valid before a `\)'.
+ Disallow ranges starting with `-', unless the range is the
+ first item in a list, rather than disallowing ranges which end
+ with `-'.
+ Disallow empty groups if the syntax bit RE_NO_EMPTY_GROUPS is set.
+ Disallow nothing preceding `{' and `\{' if they represent the
+ open-interval operator and RE_CONTEXTUAL_INVALID_OPS is set.
+ (register_info_type): typedef-ed this using `struct register_info.'
+ (SET_REGS_MATCHED): Compacted the code.
+ (re_match_2): Made it fail if back reference a group which we've
+ never matched.
+ Made `^' not match a newline if the syntax bit
+ RE_ANCHOR_NOT_NEWLINE is set.
+ (really_fail): Added this label so could force a final fail that
+ would not try to use the failure stack to recover.
+
+Sat Aug 25 14:23:01 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_CONTEXTUAL_OPS): Changed name from RE_CONTEXT_OPS.
+ (global): Rewrote comments and rebroke some syntax #define lines.
+
+ * regex.c (isgraph): Added definition for sequents.
+ (global): Now refer to character set lists as ``lists.''
+ Rewrote comments containing ``\('' or ``\)'' to now refer to
+ ``groups.''
+ (RE_CONTEXTUAL_OPS): Changed name from RE_CONTEXT_OPS.
+
+ (re_compile_pattern): Expanded header comment.
+
+Sun Jul 15 14:50:25 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_CONTEXT_INDEP_OPS): the comment's sense got turned
+ around when we changed how it read; changed it to be correct.
+
+Sat Jul 14 16:38:06 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_NO_EMPTY_BK_REF): changed name to
+ RE_NO_MISSING_BK_REF, as this describes it better.
+
+ * regex.c (re_compile_pattern): changed RE_NO_EMPTY_BK_REF
+ to RE_NO_MISSING_BK_REF, as above.
+
+Thu Jul 12 11:45:05 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h (RE_NO_EMPTY_BRACKETS): removed this syntax bit, as
+ bracket expressions should *never* be empty regardless of the
+ syntax. Removed this bit from RE_SYNTAX_POSIX_BASIC and
+ RE_SYNTAX_POSIX_EXTENDED.
+
+ * regex.c (SET_LIST_BIT): in the comment, now refer to character
+ sets as (non)matching sets, as bracket expressions can now match
+ other things in addition to characters.
+ (re_compile_pattern): refer to groups as such instead of `\(...\)'
+ or somesuch, because groups can now be enclosed in either plain
+ parens or backslashed ones, depending on the syntax.
+ In the '[' case, added a boolean just_had_a_char_class to detect
+ whether or not a character class begins a range (which is invalid).
+ Restore way of breaking out of a bracket expression to original way.
+ Add way to detect a range if the last thing in a bracket
+ expression was a character class.
+ Took out check for c != ']' at the end of a character class in
+ the else clause, as it had already been checked in the if part
+ that also checked the validity of the string.
+ Set or clear just_had_a_char_class as appropriate.
+ Added some comments. Changed references to character sets to
+ ``(non)matching lists.''
+
+Sun Jul 1 12:11:29 1990 Karl Berry (karl at hayley)
+
+ * regex.h (BYTEWIDTH): moved back to regex.c.
+
+ * regex.h (re_compile_fastmap): removed declaration; this
+ shouldn't be advertised.
+
+Mon May 28 15:27:53 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (ifndef Sword): Made comments more specific.
+ (global): include <stdio.h> so can write fatal messages on
+ standard error. Replaced calls to assert with fprintfs to
+ stderr and exit (1)'s.
+ (PREFETCH): Reformatted to make more readable.
+ (AT_STRINGS_BEG): Defined to test if we're at the beginning of
+ the virtual concatenation of string1 and string2.
+ (AT_STRINGS_END): Defined to test if at the end of the virtual
+ concatenation of string1 and string2.
+ (AT_WORD_BOUNDARY): Defined to test if are at a word boundary.
+ (IS_A_LETTER(d)): Defined to test if the contents of the pointer D
+ is a letter.
+ (re_match_2): Rewrote the wordbound, notwordbound, wordbeg, wordend,
+ begbuf, and endbuf cases in terms of the above four new macros.
+ Called SET_REGS_MATCHED in the matchsyntax, matchnotsyntax,
+ wordchar, and notwordchar cases.
+
+Mon May 14 14:49:13 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (re_search_2): Fixed RANGE to not ever take STARTPOS
+ outside of virtual concatenation of STRING1 and STRING2.
+ Updated header comment as to this.
+ (re_match_2): Clarified comment about MSTOP in header.
+
+Sat May 12 15:39:00 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (re_search_2): Checked for out-of-range STARTPOS.
+ Added comments.
+ When searching backwards, not only get the character with which
+ to compare to the fastmap from string2 if the starting position
+ >= size1, but also if size1 is zero; this is so won't get a
+ segmentation fault if string1 is null.
+ Reformatted code at label advance.
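The "virtual concatenation" of STRING1 and STRING2 mentioned in this entry can be pictured with a small sketch. The helper name `char_at` is hypothetical and is not taken from regex.c; it only shows how a position is resolved into one of the two strings, and why an empty (possibly null) first string must never be dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the character at position POS of the
   virtual concatenation of STRING1 (SIZE1 bytes) and STRING2 (SIZE2
   bytes).  When SIZE1 is zero the first branch is never taken, so a
   null STRING1 is never read.  */
static unsigned char
char_at (const char *string1, size_t size1,
         const char *string2, size_t size2, size_t pos)
{
  /* POS must lie inside the concatenation.  */
  assert (pos < size1 + size2);

  if (pos < size1)
    return (unsigned char) string1[pos];
  return (unsigned char) string2[pos - size1];
}
```

The cast to `unsigned char` keeps the result usable as an index into a fastmap or translate table, whatever the signedness of plain `char`.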
+
+Thu Apr 12 20:26:21 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: Added #pragma once and #ifdef...endif __REGEXP_LIBRARY.
+ (RE_EXACTN_VALUE): Added for search.c to use.
+ Reworded some comments.
+
+ regex.c: Punctuated some comments correctly.
+ (NULL): Removed this.
+ (RE_EXACTN_VALUE): Added for search.c to use.
+ (<ctype.h>): Moved this include to top of file.
+ (<assert.h>): Added this include.
+ (struct regexpcode): Assigned 0 to unused and 1 to exactn
+ because of RE_EXACTN_VALUE.
+ Added comment.
+ (various macros): Lined up backslashes near end of line.
+ (insert_jump): Cleaned up the header comment.
+ (re_search): Corrected the header comment.
+ (re_search_2): Cleaned up and completed the header comment.
+ (re_max_failures): Updated comment.
+ (struct register_info): Constructed as bits so as to save space
+ on the stack when pushing register information.
+ (IS_ACTIVE): Macro for struct register_info.
+ (MATCHED_SOMETHING): Macro for struct register_info.
+ (NUM_REG_ITEMS): How many register information items for each
+ register we have to push on the stack at each failure.
+ (MAX_NUM_FAILURE_ITEMS): If push all the registers on failure,
+ this is how many items we push on the stack.
+ (PUSH_FAILURE_POINT): Now pushes whether or not the register is
+ currently active, and whether or not it matched something.
+ Checks that there's enough space allocated to accommodate all the
+ items we currently want to push. (Before, a test for an empty
+ stack sufficed because we always pushed and popped the same
+ number of items).
+ Replaced ``2'' with MAX_NUM_FAILURE_ITEMS when ``2'' refers
+ to how many things get pushed on the stack each time.
+ When copy the stack into the newly allocated storage, now only copy
+ the area in use.
+ Clarified comment.
+ (POP_FAILURE_POINT): Defined to use in places where put number
+ of registers on the stack into a variable before using it to
+ decrement the stack, so as to not confuse the compiler.
+ (IS_IN_FIRST_STRING): Defined to check if a pointer points into
+ the first string.
+ (SET_REGS_MATCHED): Changed to use the struct register_info
+ bits; also set the matched-something bit to false if the
+ register isn't currently active. (This is a redundant setting.)
+ (re_match_2): Cleaned up and completed the header comment.
+ Updated the failure stack comment.
+ Replaced the ``2'' with MAX_NUM_FAILURE_ITEMS in the static
+ allocation of initial_stack, because now more than two (now up
+ to MAX_NUM_FAILURE_ITEMS) items get pushed on the failure stack each
+ time.
+ Ditto for stackb.
+ Trashed regstart_seg1, regend_seg1, best_regstart_seg1, and
+ best_regend_seg1 because they could have erroneous information
+ in them, such as when matching ``a'' (in string1) and ``ab'' (in
+ string2) with ``(a)*ab''; before using IS_IN_FIRST_STRING to see
+ whether or not the register starts or ends in string1,
+ regstart[1] pointed past the end of string1, yet regstart_seg1
+ was 0!
+ Added variable reg_info of type struct register_info to keep
+ track of currently active registers and whether or not they
+ currently match anything.
+ Commented best_regs_set.
+ Trashed reg_active and reg_matched_something and put the
+ information they held into reg_info; saves space on the stack.
+ Replaced NULL with '\000'.
+ In begline case, compacted the code.
+ Used assert to exit if had an internal error.
+ In begbuf case, because now force the string we're working on
+ into string2 if there aren't two strings, now allow d == string2
+ if there is no string1 (and the check for that is size1 == 0!);
+ also now succeeds if there aren't any strings at all.
+ (main, ifdef canned): Put test type into a variable so could
+ change it while debugging.
+
+Sat Mar 24 12:24:13 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (GET_UNSIGNED_NUMBER): Deleted references to num_fetches.
+ (re_compile_pattern): Deleted num_fetches because could keep
+ track of the number of fetches done by saving a pointer into the
+ pattern.
+ Added variable beg_interval to be used as a pointer, as above.
+ Assert that beg_interval points to something when it's used as above.
+ Initialize succeed_n's to lower_bound because re_compile_fastmap
+ needs to know it.
+ (re_compile_fastmap): Deleted unnecessary variable is_a_jump_n.
+ Added comment.
+ (re_match_2): Put number of registers on the stack into a
+ variable before using it to decrement the stack, so as to not
+ confuse the compiler.
+ Updated comments.
+ Used error routine instead of printf and exit.
+ In exactn case, restored longer code from ``original'' regex.c
+ which doesn't test translate inside a loop.
+
+ * regex.h: Moved #define NULL and the enum regexpcode definition
+ to regex.c. Changed some comments.
+
+ regex.c (global): Updated comments about compiling and for the
+ re_compile_pattern jump routines.
+ Added #define NULL and the enum regexpcode definition (from
+ regex.h).
+ (enum regexpcode): Added set_number_at to reset the n's of
+ succeed_n's and jump_n's.
+ (re_set_syntax): Updated its comment.
+ (re_compile_pattern): Moved its heading comment to after its macros.
+ Moved its include statement to the top of the file.
+ Commented or added to comments of its macros.
+ In start_memory case: Push laststart value before adding
+ start_memory and its register number to the buffer, as they
+ might not get added.
+ Added code to put a set_number_at before each succeed_n and one
+ after each jump_n; rewrote code in what seemed a more
+ straightforward manner to put all these things in the pattern so
+ the succeed_n's would correctly jump to the set_number_at's of
+ the matching jump_n's, and so the jump_n's would correctly jump
+ to after the set_number_at's of the matching succeed_n's.
+ Initialize succeed_n n's to -1.
+ (insert_op_2): Added this to insert an operation followed by
+ two integers.
+ (re_compile_fastmap): Added set_number_at case.
+ (re_match_2): Moved heading comment to after macros.
+ Added mention of REGS to heading comment.
+ No longer turn a succeed_n with n = 0 into an on_failure_jump,
+ because n needs to be reset each time through a loop.
+ Check to see if a succeed_n's n is set by its set_number_at.
+ Added set_number_at case.
+ Updated some comments.
+ (main): Added another main to run posix tests, which is compiled
+ ifdef both test and canned. (Old main is still compiled ifdef
+ test only).
+
+Tue Mar 19 09:22:55 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.[hc]: Change all instances of the word ``legal'' to
+ ``valid'' and all instances of ``illegal'' to ``invalid.''
+
+Sun Mar 4 12:11:31 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: Added syntax bit RE_NO_EMPTY_RANGES which is set if
+ an ending range point has to collate higher or equal to the
+ starting range point.
+ Added syntax bit RE_NO_HYPHEN_RANGE_END which is set if a hyphen
+ can't be an ending range point.
+ Set the two above bits in RE_SYNTAX_POSIX_BASIC and
+ RE_SYNTAX_POSIX_EXTENDED.
+
+ regex.c: (re_compile_pattern): Don't allow empty ranges if the
+ RE_NO_EMPTY_RANGES syntax bit is set.
+ Don't let a hyphen be a range end if the RE_NO_HYPHEN_RANGE_END
+ syntax bit is set.
+ (ESTACK_PUSH_2): renamed this PUSH_FAILURE_POINT and made it
+ push all the used registers on the stack, as well as the number
+ of the highest numbered register used, and (as before) the two
+ failure points.
+ (re_match_2): Fixed up comments.
+ Added arrays best_regstart[], best_regstart_seg1[], best_regend[],
+ and best_regend_seg1[] to keep track of the best match so far
+ whenever reach the end of the pattern but not the end of the
+ string, and there are still failure points on the stack with
+ which to backtrack; if so, do the saving and force a fail.
+ If reach the end of the pattern but not the end of the string,
+ but there are no more failure points to try, restore the best
+ match so far, set the registers and return.
+ Compacted some code.
+ In stop_memory case, if the subexpression we've just left is in
+ a loop, push onto the stack the loop's on_failure_jump failure
+ point along with the current pointer into the string (d).
+ In finalize_jump case, in addition to popping the failure
+ points, pop the saved registers.
+ In the fail case, restore the registers, as well as the failure
+ points.
+
+Sun Feb 18 15:08:10 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c: (global): Defined a macro GET_BUFFER_SPACE which
+ makes sure you have a specified number of buffer bytes
+ allocated.
+ Redefined the macro BUFPUSH to use this.
+ Added comments.
+
+ (re_compile_pattern): Call GET_BUFFER_SPACE before storing or
+ inserting any jumps.
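The idea behind GET_BUFFER_SPACE and BUFPUSH can be sketched as follows. This is not the macro code from regex.c: the structure `byte_buffer` and the functions `get_buffer_space` and `buf_push` are hypothetical stand-ins, and the sketch assumes growth by doubling (with a shift) and ignores size overflow.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer.  */
struct byte_buffer
{
  unsigned char *data;
  size_t used;
  size_t allocated;
};

/* Make sure at least N more bytes fit in BUF, doubling the allocation
   until they do.  Returns 0 on success, -1 if realloc fails (BUF is
   then left unchanged).  */
static int
get_buffer_space (struct byte_buffer *buf, size_t n)
{
  size_t wanted = buf->allocated ? buf->allocated : 1;
  unsigned char *grown;

  assert (buf->used <= buf->allocated);
  if (buf->allocated - buf->used >= n)
    return 0;

  while (wanted - buf->used < n)
    wanted <<= 1;               /* Double with a shift.  */

  grown = realloc (buf->data, wanted);
  if (grown == NULL)
    return -1;
  buf->data = grown;
  buf->allocated = wanted;
  return 0;
}

/* Append one byte, extending the buffer first if it is full.  */
static int
buf_push (struct byte_buffer *buf, unsigned char c)
{
  if (get_buffer_space (buf, 1) != 0)
    return -1;
  buf->data[buf->used++] = c;
  return 0;
}
```

Checking for space before every store, rather than once at the top of a loop, is what lets callers reserve room for a multi-byte jump before inserting it.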
+
+ (re_match_2): Set d to string1 + pos and dend to end_match_1
+ only if string1 isn't null.
+ Force exit from a loop if it's around empty parentheses.
+ In stop_memory case, if found some jumps, increment p2 before
+ extracting address to which to jump. Also, don't need to know
+ how many more times can jump_n.
+ In begline case, d must equal string1 or string2, in that order,
+ only if they are not null.
+ In maybe_finalize_jump case, skip over start_memorys' and
+ stop_memorys' register numbers, too.
+
+Thu Feb 15 15:53:55 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (BUFPUSH): off by one goof in deciding whether to
+ EXTEND_BUFFER.
+
+Wed Jan 24 17:07:46 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: Moved definition of NULL to here.
+ Got rid of ``In other words...'' comment.
+ Added to some comments.
+
+ regex.c: (re_compile_pattern): Tried to bulletproof some code,
+ i.e., checked if backward references (e.g., p[-1]) were within
+ the range of pattern.
+
+ (re_compile_fastmap): Fixed a bug in succeed_n part where was
+ getting the amount to jump instead of how many times to jump.
+
+ (re_search_2): Changed the name of the variable ``total'' to
+ ``total_size.''
+ Condensed some code.
+
+ (re_match_2): Moved the comment about duplicate from above the
+ start_memory case to above duplicate case.
+
+ (global): Rewrote some comments.
+ Added commandline arguments to testing.
+
+Wed Jan 17 11:47:27 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c: (global): Defined a macro STORE_NUMBER which stores a
+ number into two contiguous bytes. Also defined STORE_NUMBER_AND_INCR
+ which does the same thing and then increments the pointer to the
+ storage place to point after the number.
+ Defined a macro EXTRACT_NUMBER which extracts a number from two
+ contiguous bytes. Also defined EXTRACT_NUMBER_AND_INCR which
+ does the same thing and then increments the pointer to the
+ source to point to after where the number was.
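A minimal sketch of the four number macros described above, written as functions. The names, the low-byte-first layout, and the signed 16-bit range are assumptions made for illustration; the entry itself only says that a number occupies two contiguous bytes.

```c
#include <assert.h>

/* Hypothetical equivalent of STORE_NUMBER: store NUMBER in the two
   bytes at DEST, low byte first (the byte order is an assumption).  */
static void
store_number (unsigned char *dest, int number)
{
  unsigned int bits;

  assert (number >= -32768 && number <= 32767);
  bits = (unsigned int) number & 0xffffu;
  dest[0] = (unsigned char) (bits & 0xffu);
  dest[1] = (unsigned char) ((bits >> 8) & 0xffu);
}

/* Like store_number, then advance *DEST past the number.  */
static void
store_number_and_incr (unsigned char **dest, int number)
{
  store_number (*dest, number);
  *dest += 2;
}

/* Hypothetical equivalent of EXTRACT_NUMBER: read back a signed
   16-bit value from the two bytes at SOURCE.  */
static int
extract_number (const unsigned char *source)
{
  int value = source[0] | (source[1] << 8);   /* 0 .. 65535 */

  if (value >= 0x8000)
    value -= 0x10000;                         /* Restore the sign.  */
  return value;
}

/* Like extract_number, then advance *SOURCE past the number.  */
static int
extract_number_and_incr (const unsigned char **source)
{
  int value = extract_number (*source);

  *source += 2;
  return value;
}
```

The `_and_incr` variants suit code that walks a compiled pattern, where each jump opcode is followed by its two-byte operand.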
+
+Tue Jan 16 12:09:19 1990 Kathy Hargreaves (kathy at hayley)
+
+ * regex.h: Incorporated rms' changes.
+ Defined RE_NO_BK_REFS syntax bit which is set when want to
+ interpret back reference patterns as literals.
+ Defined RE_NO_EMPTY_BRACKETS syntax bit which is set when want
+ empty bracket expressions to be illegal.
+ Defined RE_CONTEXTUAL_ILLEGAL_OPS syntax bit which is set when want
+ it to be illegal for *, +, ? and { to be first in an re or come
+ immediately after a | or a (, and for ^ not to appear in a
+ nonleading position and $ in a nontrailing position (outside of
+ bracket expressions, that is).
+ Defined RE_LIMITED_OPS syntax bit which is set when want +, ?
+ and | to always be literals instead of ops.
+ Fixed up the Posix syntax.
+ Changed the syntax bit comments from saying, e.g., ``0 means...''
+ to ``If this bit is set, it means...''.
+ Changed the syntax bit defines to use shifts instead of integers.
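Defining syntax bits with shifts, as this entry describes, might look like the sketch below. The `EX_` names and the helper `syntax_bit_is_set` are hypothetical and do not reproduce the definitions in regex.h; the point is only that each bit is derived from the previous one, so adding or reordering a bit cannot produce a duplicate value.

```c
#include <assert.h>

/* Hypothetical syntax bits, each defined by shifting the previous one
   instead of writing out 1, 2, 4, 8 by hand.  */
#define EX_NO_BK_PARENS  1L
#define EX_NO_BK_VBAR    (EX_NO_BK_PARENS << 1)
#define EX_INTERVALS     (EX_NO_BK_VBAR << 1)
#define EX_LIMITED_OPS   (EX_INTERVALS << 1)

/* A hypothetical predefined syntax built by OR-ing bits together.  */
#define EX_SYNTAX_DEMO   (EX_NO_BK_PARENS | EX_INTERVALS)

/* Return nonzero if BIT is set in SYNTAX.  BIT must be a single bit.  */
static int
syntax_bit_is_set (long syntax, long bit)
{
  assert (bit > 0 && (bit & (bit - 1)) == 0);
  return (syntax & bit) != 0;
}
```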
+
+ * regex.c: (global): Incorporated rms' changes.
+
+ (re_compile_pattern): Incorporated rms' changes.
+ Made it illegal for a $ to appear anywhere but inside a bracket
+ expression or at the end of an re when RE_CONTEXTUAL_ILLEGAL_OPS
+ is set. Made the same hold for ^ except it has to be at the
+ beginning of an re instead of the end.
+ Made the re "[]" illegal if RE_NO_EMPTY_BRACKETS is set.
+ Made it illegal for | to be first or last in an re, or immediately
+ follow another | or a (.
+ Added and embellished some comments.
+ Allowed \{ to be interpreted as a literal if RE_NO_BK_CURLY_BRACES
+ is set.
+ Made it illegal for *, +, ?, and { to appear first in an re, or
+ immediately follow a | or a ( when RE_CONTEXTUAL_ILLEGAL_OPS is set.
+ Made back references interpreted as literals if RE_NO_BK_REFS is set.
+ Made recursive intervals either illegal (if RE_NO_BK_CURLY_BRACES
+ isn't set) or interpreted as literals (if is set), if RE_INTERVALS
+ is set.
+ Made it treat +, ? and | as literals if RE_LIMITED_OPS is set.
+ Cleaned up some code.
+
+Thu Dec 21 15:31:32 1989 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c: (global): Moved RE_DUP_MAX to regex.h and made it
+ equal 2^15 - 1 instead of 1000.
+ Defined NULL to be zero.
+ Moved the definition of BYTEWIDTH to regex.h.
+ Made the global variable obscure_syntax nonstatic so the tests in
+ another file could use it.
+
+ (re_compile_pattern): Defined a maximum length (CHAR_CLASS_MAX_LENGTH)
+ for character class strings (i.e., what's between the [: and the
+ :]'s).
+ Defined a macro SET_LIST_BIT(c) which sets the bit for C in a
+ character set list.
+ Took out comments that EXTEND_BUFFER clobbers C.
+ Made the string "^" match itself, if not RE_CONTEXT_INDEP_OPS.
+ Added character classes to bracket expressions.
+ Change the laststart pointer saved with the start of each
+ subexpression to point to start_memory instead of after the
+ following register number. This is because the subexpression
+ might be in a loop.
+ Added comments and compacted some code.
+ Made intervals only work if preceded by an re matching a single
+ character or a subexpression.
+ Made back references to nonexistent subexpressions illegal if
+ using POSIX syntax.
+ Made intervals work on the last preceding character of a
+ concatenation of characters, e.g., ab{0,} matches abbb, not abab.
+ Moved macro PREFETCH to outside the routine.
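The interval rule stated above (an interval applies to the last preceding character, so ab{0,} matches abbb, not abab) can be checked with the standard Posix interface on a present-day system. This is only an illustrative sketch: the helper name `matches` is hypothetical, and the Posix routines it calls were added to this library later than this entry.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if the extended regex PATTERN matches
   somewhere in TEXT, 0 if it does not or if PATTERN fails to compile.
   Anchor the pattern with ^ and $ to test the whole string.  */
static int
matches (const char *pattern, const char *text)
{
  regex_t preg;
  int status;

  assert (pattern != NULL && text != NULL);

  /* REG_NOSUB: we only want a yes/no answer, not registers.  */
  if (regcomp (&preg, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  status = regexec (&preg, text, 0, NULL, 0);
  regfree (&preg);
  return status == 0;
}
```

With this helper, `matches ("^ab{0,}$", "abbb")` succeeds and `matches ("^ab{0,}$", "abab")` fails, while grouping the two characters, as in `"^(ab){0,}$"`, makes the interval apply to the subexpression and accepts abab.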
+
+ (re_compile_fastmap): Added succeed_n to work analogously to
+ on_failure_jump if n is zero and jump_n to work analogously to
+ the other backward jumps.
+
+ (re_match_2): Defined macro SET_REGS_MATCHED to set which
+ current subexpressions had matches within them.
+ Changed some comments.
+ Added reg_active and reg_matched_something arrays to keep track
+ of which subexpressions currently have matched something.
+ Defined MATCHING_IN_FIRST_STRING and replaced ``dend == end_match_1''
+ with it to make code easier to understand.
+ Fixed so can apply * and intervals to arbitrarily nested
+ subexpressions. (Lots of previous bugs here.)
+ Changed so won't match a newline if syntax bit RE_DOT_NOT_NULL is set.
+ Made the upcase array nonstatic so the testing file could use it also.
+
+ (main.c): Moved the tests out to another file.
+
+ (tests.c): Moved all the testing stuff here.
+
+Sat Nov 18 19:30:30 1989 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c: (re_compile_pattern): Defined RE_DUP_MAX, the maximum
+ number of times an interval can match a pattern.
+ Added macro GET_UNSIGNED_NUMBER (used to get below):
+ Added variables lower_bound and upper_bound for upper and lower
+ bounds of intervals.
+ Added variable num_fetches so intervals could do backtracking.
+ Added code to handle '{' and "\{" and intervals.
+ Added to comments.
+
+ (store_jump_n): (Added) Stores a jump with a number following the
+ relative address (for intervals).
+
+ (insert_jump_n): (Added) Inserts a jump_n.
+
+ (re_match_2): Defined a macro ESTACK_PUSH_2 for the error stack;
+ it checks for overflow and reallocates if necessary.
+
+ * regex.h: Added bits (RE_INTERVALS and RE_NO_BK_CURLY_BRACES)
+ to obscure syntax to indicate whether or not
+ a syntax handles intervals and recognizes either \{ and
+ \} or { and } as operators. Also added two syntaxes
+ RE_SYNTAX_POSIX_BASIC and RE_SYNTAX_POSIX_EXTENDED and two command codes
+ to the enumeration regexpcode; they are succeed_n and jump_n.
+
+Sat Nov 18 19:30:30 1989 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c: (re_compile_pattern): Defined INIT_BUFF_SIZE to get rid
+ of repeated constants in code. Tested with value 1.
+ Renamed PATPUSH as BUFPUSH, since it pushes things onto the
+ buffer, not the pattern. Also made this macro extend the buffer
+ if it's full (so could do the following):
+ Took out code at top of loop that checks to see if buffer is going
+ to be full after 10 additions (and reallocates if necessary).
+
+ (insert_jump): Rearranged declaration lines so comments would read
+ better.
+
+ (re_match_2): Compacted exactn code and added more comments.
+
+ (main): Defined macros TEST_MATCH and MATCH_SELF to do
+ testing; took out loop so could use these instead.
+
+Tue Oct 24 20:57:18 1989 Kathy Hargreaves (kathy at hayley)
+
+ * regex.c (re_set_syntax): Gave argument `syntax' a type.
+ (store_jump, insert_jump): made them void functions.
+
+Local Variables:
+mode: indented-text
+left-margin: 8
+version-control: never
+End: