path: root/src
Commit message | Author | Age | Files | Lines
...
* dfa: remove some useless casts | Jim Meyering | 2011-06-21 | 2 | -5/+5
    * src/dfa.c (icatalloc): Change type of "old" parameter from "char const *" to "char *". Don't cast-away const on realloc argument. Remove now-unnecessary const-discarding cast. Don't (void)-cast strcpy result.
    * src/dosbuf.c (undossify_input): Remove anachronistic cast-to-"char *" of realloc argument.
* dfa: more heap-allocation-related overflow protection | Jim Meyering | 2011-06-21 | 1 | -6/+2
    * src/dfa.c (enlist): Use xnrealloc, not realloc. Also, remove unnecessary cast-to-(char *).
    (dfamust): Use xnmalloc, not malloc. Before, this code would return upon malloc failure (xnmalloc exits upon failure), but later, via the *ALLOC macros, it could already exit, so this new potential exit point is nothing new. The same applies to enlist, since it is called only through dfamust.
* maint: tighten up superfluous code | Jim Meyering | 2011-06-19 | 1 | -4/+1
    * src/main.c (parse_grep_colors): Use xstrdup in place of xmalloc, a useless test, strlen, and strcpy.
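The simplification described above can be pictured with a minimal sketch. The helper `dup_or_die` is a hypothetical stand-in for gnulib's xstrdup, written with standard C calls only; it is not the code from src/main.c.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gnulib's xstrdup: return a heap copy of S,
   exiting rather than returning NULL when memory runs out.  */
static char *
dup_or_die (const char *s)
{
  size_t size = strlen (s) + 1;   /* include the terminating NUL */
  char *copy = malloc (size);
  if (copy == NULL)
    {
      fprintf (stderr, "memory exhausted\n");
      exit (EXIT_FAILURE);
    }
  memcpy (copy, s, size);
  return copy;
}
```

A caller that previously measured the string, allocated, tested the result, and copied can make a single call and never needs to check for NULL.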
* dfa: avoid possibility of overflow | Paul Eggert | 2011-06-19 | 3 | -15/+14
    * src/dfa.c (REALLOC_IF_NECESSARY, CALLOC, MALLOC, REALLOC): Use functions from xalloc.h to avoid overflow.
    * src/dfasearch.c (GEAcompile): Use xnrealloc rather than realloc.
    * src/pcresearch.c (Pcompile): Use xnmalloc, not xmalloc.
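The overflow in question is the wraparound of the count-times-size product passed to malloc or realloc. The following is a minimal sketch of that check, assuming nothing about gnulib's internals: `mul_overflows` and `checked_nmalloc` are hypothetical names, with `checked_nmalloc` standing in for xnmalloc.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if N * SIZE cannot be represented in a size_t.  */
static int
mul_overflows (size_t n, size_t size)
{
  return size != 0 && n > SIZE_MAX / size;
}

/* Hypothetical stand-in for gnulib's xnmalloc: allocate an array of N
   objects of SIZE bytes each, exiting instead of returning NULL.  */
static void *
checked_nmalloc (size_t n, size_t size)
{
  if (mul_overflows (n, size))
    {
      /* A plain malloc (n * size) would silently allocate too little.  */
      fprintf (stderr, "memory exhausted\n");
      exit (EXIT_FAILURE);
    }
  void *p = malloc (n * size);
  if (p == NULL && n * size != 0)
    {
      fprintf (stderr, "memory exhausted\n");
      exit (EXIT_FAILURE);
    }
  return p;
}
```

Keeping the count and the element size as separate arguments lets the helper test the product before it is computed, which a caller-side `malloc (n * size)` cannot do.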
* dfa: correct two uses of btowc | Jim Meyering | 2011-06-18 | 1 | -2/+2
    * src/dfa.c (setbit_c, setbit_case_fold_c): Compare the btowc return value against WEOF, not EOF. Suggested by Eli Zaretskii. On a system like MinGW with unsigned wint_t, comparing a btowc return value against EOF (-1) would always be false.
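A minimal sketch of the corrected comparison follows. The function name `byte_has_wide_form` is hypothetical and the code is not taken from src/dfa.c; it only shows the btowc result being tested against WEOF.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* btowc, wint_t, WEOF */

/* Return 1 if C has a wide-character form in the current locale,
   0 if C is EOF or is not a valid single-byte character.  */
static int
byte_has_wide_form (int c)
{
  wint_t wc = btowc (c);
  /* btowc reports failure as WEOF.  Testing "wc == EOF" instead can
     never be true when wint_t is an unsigned type.  */
  return wc != WEOF;
}
```

In the default "C" locale, `byte_has_wide_form ('a')` is 1 and `byte_has_wide_form (EOF)` is 0.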
* dfa: don't overrun a malloc'd buffer for certain regexps | Jim Meyering | 2011-06-17 | 1 | -1/+1
    * src/dfa.c (dfaanalyze): Allocate space for twice as many positions as there are leaves. Before this change, for some regular expressions, DFA analysis would have inserted far more "positions" than dfa->nleaves (up to double). Reported by Raymond Russell in http://savannah.gnu.org/bugs/?33547
    * tests/dfa-heap-overrun: Trigger the overrun.
    * tests/Makefile.am (TESTS): Add it.
    * NEWS (Bug fixes): Mention it.
* dfa: optimize wide characters in a bracket expression | Paolo Bonzini | 2011-06-07 | 1 | -5/+32
    * src/dfa.c (addtok): Compile characters to an alternation. Handle the case when nothing else remains in the MBCSET.
* dfa: refactor to prepare for upcoming optimizations | Paolo Bonzini | 2011-06-07 | 1 | -9/+21
    * src/dfa.c (parse_bracket_exp): Move optimization of MBCSET from here...
    (addtok): ... to here.
* dfa: correct handling of single-byte character ranges | Paolo Bonzini | 2011-06-07 | 1 | -49/+55
    This provides a better fix for the unibyte-bracket-expr and high-bit-range testcases, and fixes the latent bug tested by bogus-wctob.
    * src/dfa.c (setbit_case_fold): Remove, replace with...
    (setbit_wc, setbit_c, setbit_case_fold_c): ... these.
    (parse_bracket_exp): Use setbit_case_fold_c when iterating over single-byte sequences. Use setbit_wc for multi-byte character sets, and setbit_case_fold_c for single-byte character sets.
    (lex): Use setbit_case_fold_c for single-byte character sets.
* fix the [...] bug also for relatively unusual uni-byte encodings | Jim Meyering | 2011-06-07 | 1 | -3/+7
    * src/dfa.c (setbit_case_fold): Also handle uni-byte locales like the one mentioned in the original report: see 2011-05-07 commit d98338eb. Re-reported by Santiago Ruano Rincón. Note that most uni-byte locales are not affected.
    * NEWS (Bug fixes): Mention it.
* grep -P: don't abort upon exceeding PCRE's backtracking limit | Jim Meyering | 2011-05-21 | 1 | -0/+4
    * src/pcresearch.c (Pexecute): Handle PCRE_ERROR_MATCHLIMIT.
    * tests/Makefile.am (XFAIL_TESTS): Remove pcre-abort.
    * tests/pcre-abort: Expect failure, no output, and increase the length of the input string, in case the backtracking limit is ever raised. Adjust comment.
    * NEWS (Bug fixes): Mention it.
* maint: remove syntax-checking sc_tight_scope rule | Jim Meyering | 2011-05-13 | 1 | -40/+0
    * src/Makefile.am (sc_tight_scope): Remove rule. Now it's provided via gnulib's maint.mk.
    * cfg.mk (sc_tight_scope): Likewise.
* maint: use consistent declaration syntax | Jim Meyering | 2011-05-08 | 1 | -2/+3
    * src/grep.h (matchers): Declare consistently, so the sc_tight_scope rule detects this as an extern-marked variable.
* fix a bug whereby echo c|grep '[c]' would fail for any c in 0x80..0xff | Jim Meyering | 2011-05-07 | 1 | -1/+2
    * src/dfa.c (setbit_case_fold) [MBS_SUPPORT]: Set the bit also when wctob returns EOF.
    * NEWS (Bug fixes): Mention it.
* build: move add_utf8_anychar into MBS ifdef | Arnold D. Robbins | 2011-05-02 | 1 | -1/+1
* maint: remove GAWK ifndef; no longer needed | Arnold D. Robbins | 2011-05-02 | 1 | -2/+0
* maint: add the tight_scope syntax-checking rule | Jim Meyering | 2011-04-28 | 1 | -0/+40
    This ensures that the only externally scoped symbols are ones that are explicitly marked as "extern" or white-listed like "main".
    * src/Makefile.am (sc_tight_scope): New rule, copied from coreutils.
    * cfg.mk (sc_tight_scope): Define, to hook to it from the top level.
* maint: mark some function declarations as extern | Jim Meyering | 2011-04-28 | 1 | -10/+9
    * src/search.h: Add "extern" keyword to each function declaration.
* maint: fix doubled-word typos in comments | Jim Meyering | 2011-04-23 | 1 | -2/+2
    * src/dfa.c (SUCCEEDS_IN_CONTEXT): Remove doubled "a".
    * src/dfa.c (BACKREF): s/it it/it is/
* maint: fix typos in comments: s/can not/cannot/ | Jim Meyering | 2011-04-09 | 1 | -2/+2
    * src/dfa.c (check_matching_with_multibyte_ops, dfastate): As above.
* maint: remove unneeded #include directives | Jim Meyering | 2011-01-26 | 1 | -1/+0
    * lib/savedir.c: Don't include <stddef.h>. Not needed.
    * src/dfa.c: Likewise.
* maint: update copyright year ranges to include 2011 | Jim Meyering | 2011-01-03 | 15 | -15/+15
    Run "make update-copyright", so "make syntax-check" works in 2011.
* main: fix exit status on xmalloc failures | Paolo Bonzini | 2010-12-20 | 1 | -0/+1
    * NEWS: Update.
    * src/main.c (main): Set exit_failure. Reported by Guy Shaw.
* grep: add include guards | Paolo Bonzini | 2010-11-14 | 2 | -0/+9
    * src/system.h: Add multiple inclusion guards.
    * src/grep.h: Likewise.
* dfa: process range expressions consistently with system regex | Paolo Bonzini | 2010-09-23 | 1 | -11/+16
    The actual meaning of range expressions in glibc is not exactly strcoll, which makes the behavior of grep hard to predict when compiled with the system regex. Leave to the system regex matcher the decision of which single-byte characters are matched by a range expression.
    This partially reverts a change made in commit 0d38a8bb (which made sense at the time, but not now that src/dfa.c is not doing multibyte character set matching anymore).
    * src/dfa.c (in_coll_range): Remove.
    (parse_bracket_exp): Use system regex to find which single-char bytes match a range expression.
* build: fix link error on systems that have libiconv but not libintl | Bruno Haible | 2010-09-23 | 1 | -1/+3
    * src/Makefile.am (LDADD): Add $(LIBICONV).
* build: avoid compilation failure on the Hurd | Jim Meyering | 2010-09-21 | 1 | -4/+4
    * src/dfasearch.c (dfawarn): Rename enum symbols to use DW_ prefix, so as not to collide with "GNU", which is defined by the Hurd. Reported by Matthias Lanzinger in http://savannah.gnu.org/bugs/?31096
* dfa: fix compilation when not using MBS | Aharon Robbins | 2010-09-20 | 1 | -2/+2
    * src/dfa.c (prepare_wc_buf) [!MBS_SUPPORT]: Do not compile this function.
* dfa: fall back to glibc matcher if a MBCSET is found | Paolo Bonzini | 2010-09-14 | 1 | -0/+13
    This patch enables full support of equivalence classes and multicharacter collation symbols. It can also improve performance problems in some cases for multibyte grep. Both of these changes however depend on the glibc version installed in the system.
    For UTF-8 it will trigger only in the presence of MBCSET, e.g. [a-z]. For other character sets all brackets and `.` as well will trigger it.
    * NEWS: Document this.
    * src/dfa.c (dfaexec): Fall back to glibc for multibyte matches, if possible.
* dfa: reduce stack usage | Patrick Boyd | 2010-09-08 | 1 | -2/+7
    * src/dfa.c (dfaanalyze): Allocate GRPS and LABELS arrays from heap, not on the stack.
    With this change, grep can now run in these UEFI simulators:
    http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=EDK
    http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=EDK2
* grep: diagnose and exit-2 for bogus REs like [:space:], [:digit:], etc. | Jim Meyering | 2010-09-01 | 3 | -66/+23
    When I make a mistake like this: grep '[:lower:]' ... be it in a script or on the command line, I want to know about it as soon as possible. I don't want grep to print a mere warning that it is interpreting this suspicious and almost guaranteed-wrong regular expression as a set of just 6 bytes. And I certainly don't want grep to silently do the wrong thing, even if that would be officially standards-conforming. It's obvious that I intended [[:lower:]], and I want my error to be diagnosed in a way that is most likely to get my attention.

    Thus, with this change, grep now prints a diagnostic and exits with status 2 the moment it encounters an offending [:char_class:] construct.

    This changes the way grep works by default, rather than putting this new behavior on an option. A new option would seldom be used in scripts (not portable), and would probably be used only rarely by those who need it the most. This new functionality provides a valuable safety measure and incurs truly negligible risk.

    For strict POSIX compliance, set POSIXLY_CORRECT in your environment. That disables this new feature.

    Revert the changes from commit 2cd3bcea, "grep: add --warnings={always,never,auto}.", and then do the following:
    * src/dfasearch.c (dfawarn): Call getenv("POSIXLY_CORRECT") here; Remove "warning: " from the diagnostic, now that it's more than a warning, and exit with status 2.
    * NEWS (New features): Describe the new semantics.
    * tests/warn-char-classes: Adjust one test to accommodate this change.
    * doc/grep.texi (Character Classes and Bracket Expressions): Document.
    (Environment Variables): Cross-reference it. Remove reference to obsolete getopt illegal vs. invalid difference.
    Thanks to Paul Eggert for suggestions and an initial prod.
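The difference between the mistaken and the intended pattern can be observed with the POSIX regex API, which accepts both without complaint. This is a minimal sketch, not grep's own code: `bre_matches` is a hypothetical helper, and it assumes the default "C" locale.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX basic regular expression PATTERN matches
   somewhere in TEXT, 0 if it does not, -1 if it fails to compile.  */
static int
bre_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, `bre_matches ("[:lower:]", ":")` is 1 and `bre_matches ("[:lower:]", "a")` is 0, because the pattern is just the set of the characters ':', 'l', 'o', 'w', 'e' and 'r'. The intended `bre_matches ("[[:lower:]]", "a")` is 1.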
* maint: use gnulib's standard --version-printing code | Jim Meyering | 2010-08-30 | 1 | -9/+9
    This includes author names and keeps the copyright year up to date.
    * bootstrap.conf (gnulib_modules): Add propername and version-etc-fsf.
    * src/main.c (AUTHORS): Define.
    (main): Use version_etc, rather than hard-coding the copyright text.
    Prompted by a patch from Paolo Bonzini.
* dfa: warn on [:space:] and similar | Paolo Bonzini | 2010-08-27 | 3 | -0/+29
    * src/dfa.c (parse_bracket_exp): Warn on regular expressions such as [:space:].
    * src/dfa.h (dfawarn): New prototype.
    * src/dfasearch.c (dfawarn): New.
    * NEWS: Document.
* grep: add --warnings={always,never,auto}. | Paolo Bonzini | 2010-08-27 | 2 | -10/+59
    * src/grep.h (no_warnings): New declaration.
    * src/main.c (no_warnings): New.
    (WARNINGS_OPTION): Add to enum.
    (main): Add --warnings. Handle color_option == 2 together with it.
* search: fix "grep -Fif /dev/null" | Paolo Bonzini | 2010-08-27 | 2 | -5/+7
    * bootstrap.conf: Include gnulib module minmax.
    * src/searchutils.c (mbtolower): Handle *N == 0 case.
    * src/system.h: Include minmax.h from gnulib.
* Remove declaration after statement in dfa.c | Adam Katz | 2010-08-27 | 1 | -1/+2
    * dfa.c (dfaexec): Declare saved_end at the beginning of the function.
* make --include=FILE work once again | Jim Meyering | 2010-08-13 | 1 | -4/+4
    The semantics of excluded_file_name changed (when operating on an "included" file name list).
    * src/main.c (main): Adjust for changed semantics of excluded_file_name simply by removing a negation.
    * NEWS (Bug fixes): Mention this fix.
    * tests/include-exclude: Add a test for this.
    Reported by Joe Perches in http://savannah.gnu.org/bugs/?29876.
* maint: don't emit an extra newline in each of two diagnostics | Jim Meyering | 2010-05-26 | 1 | -2/+2
    * src/main.c (context_length_arg, grepdir): Remove a stray \n in each of two diagnostics.
* search: Avoid out-of-bounds access. | Bruno Haible | 2010-05-24 | 1 | -1/+1
    * src/dfasearch.c (EGexecute): Avoid access beyond end of buffer that could happen if start != beg - buf.
* dfa: fix signedness warnings | Aharon Robbins | 2010-05-23 | 1 | -2/+2
    * src/dfa.c (dfaexec): Cast p when passing it to prepare_wc_buf.
* dfa: speed up [[:digit:]] and [[:xdigit:]] | Paolo Bonzini | 2010-05-06 | 1 | -27/+29
    There's no "multibyte pain" in these two classes, since POSIX and ISO C99 mandate their contents.
    Time for "./grep -x '[[:digit:]]' /usr/share/dict/linux.words"
    Before: 1.5s, after: 0.07s. (sed manages only 0.5s).
    * src/dfa.c (predicates): Declare struct dfa_ctype separately from definition. Add sb_only.
    (find_pred): Return const struct dfa_ctype *.
    (parse_bracket_exp): Return const struct dfa_ctype *. Do not fill MBCSET for sb_only character types.
* dfa: avoid segfault when processing an invalid multi-byte sequence | Jim Meyering | 2010-05-05 | 1 | -0/+2
    * src/dfa.c (dfaexec): Handle the cases in which mbrtowc returns (size_t)-1 or (size_t)-2, rather than setting mblen_buf[i] to an outrageously large value.
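The two special return values matter because, taken as lengths, they are the largest values a size_t can hold. Below is a minimal sketch of one way to guard against them, assuming a policy of treating a bad sequence as a single byte; `safe_mblen` is a hypothetical name and this is not the code in dfaexec.

```c
#include <string.h>
#include <wchar.h>

/* Return how many bytes to advance past the character starting at S,
   where N bytes remain and ST holds the conversion state.  An invalid
   or incomplete sequence is treated as one byte.  */
static size_t
safe_mblen (const char *s, size_t n, mbstate_t *st)
{
  wchar_t wc;
  size_t len = mbrtowc (&wc, s, n, st);
  if (len == (size_t) -1 || len == (size_t) -2)
    {
      /* (size_t) -1 signals an invalid sequence and (size_t) -2 an
         incomplete one.  Reset the state before continuing.  */
      memset (st, 0, sizeof *st);
      return 1;
    }
  /* A return of 0 means a NUL character, which still occupies a byte.  */
  return len == 0 ? 1 : len;
}
```

Storing the unchecked return value as a length would record a huge count for the byte, which is the failure mode the commit message describes.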
* grep: remove redundant syntax bit | Paolo Bonzini | 2010-05-05 | 1 | -4/+1
    * grep.c (Gcompile): Remove RE_HAT_LISTS_NOT_NEWLINE.
* dfa: convert to wide character line-by-line | Paolo Bonzini | 2010-05-05 | 1 | -39/+58
    This provides a nice speedup for -m in general, but especially it avoids quadratic complexity in case we have to go to glibc.
    * NEWS: Document change.
    * src/dfa.c (prepare_wc_buf): Extract out of dfaexec. Convert only up to the next newline.
    (dfaexec): Exit multibyte processing loop if past buf_end. Call prepare_wc_buf again after processing a newline.
* maint: remove useless #if HAVE_STDLIB_H | Jim Meyering | 2010-05-01 | 1 | -2/+0
    * src/mbsupport.h: Don't test HAVE_STDLIB_H.
* dfa: don't #ifdef-out member declarations | Jim Meyering | 2010-04-20 | 1 | -4/+0
    * src/dfa.c (struct dfa): Remove "#if MBS_SUPPORT" guard that made several member declarations conditional on this cpp definition.
    (token): Likewise.
    Reported by Anders Wallin.
* dfa: honor RE_DOT_NEWLINE and RE_DOT_NOT_NULL in UTF-8 period optimization | Paolo Bonzini | 2010-04-20 | 1 | -1/+12
    * src/dfa.c (add_utf8_anychar): Check for RE_DOT_NEWLINE and RE_DOT_NOT_NULL.
* grep: fix --mmap not being ignored | Paolo Bonzini | 2010-04-20 | 1 | -0/+1
    * NEWS: Document bugfix.
    * main.c (main): Ignore MMAP_OPTION.
* maint: avoid syntax-check failure due to indentation via TABs | Jim Meyering | 2010-04-19 | 1 | -9/+9
    * src/dfa.c (atom): Expand TABs in indentation.
* maint: restrict scope of two globals to dfasearch.c | Jim Meyering | 2010-04-19 | 1 | -2/+2
    * src/dfasearch.c (patterns, pcount): Declare these file-scoped globals to be static.