| Commit message | Author | Age | Files | Lines |
|
|
|
|
| |
Problem reported by Philippe Schnoebelen (Bug#34078).
* doc/grep.in.1: Add missing “more”.
|
|
|
|
| |
* doc/grep.in.1: Mention [:blank:] (Bug#33291).
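
For illustration only, a minimal sketch of the character class the man
page now mentions. The sample input is invented for this note. The class
name must itself sit inside a bracket expression, hence the doubled
brackets.

```shell
# [:blank:] matches a space or a tab.  Only the first two input
# lines contain one; the third ("ab") has neither.
blank_count=$(printf 'a\tb\na b\nab\n' | grep -c '[[:blank:]]')
echo "$blank_count"
```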
|
|
|
|
|
|
|
|
| |
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
* doc/grep.in.1: Use "-" in copyright year ranges, not \en.
|
|
|
|
|
| |
* tests/mb-non-UTF8-perf-Fw: Run twice, to avoid first-read penalty.
Reported by Nelson H.F. Beebe.
|
| |
|
|
|
|
| |
* cfg.mk (sc_prohibit_backref): New rule.
|
|
|
|
| |
* AUTHORS: Remove URL that’s too long.
|
|
|
|
| |
* AUTHORS: Update to better reflect current authorship.
|
|
|
|
| |
* cfg.mk (old_NEWS_hash): When updating old news, we must also update this.
|
|
|
|
|
|
|
|
| |
* doc/grep.texi (Usage): Remove palindrome question. Bondioni’s
RE makes grep issue a ‘grep: stack overflow’ diagnostic, and we
shouldn’t be encouraging fancy back-references anyway, due to all
the bugs in this area (Bug#26864). Plus, the allusion to
“GNU extensions” doesn’t seem to be correct here.
|
|
|
|
|
| |
Prompted by suggestions from Stephane Chazelas (Bug#38792#20).
* doc/grep.texi (Usage): Make examples more robust.
|
| |
|
| |
|
|
|
|
|
|
| |
Inspired by Bug#26864.
* doc/grep.texi (Known Bugs): New section.
Mention back-reference issues.
|
|
|
|
|
| |
Suggested by Stephane Chazelas (Bug#38792).
* doc/grep.in.1, doc/grep.texi: Add ‘--’ to recently-added example.
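
A hedged sketch of why ‘--’ helps when file names come from a glob. The
scratch directory and the file names (-v, plain) are made up for this
note and are not the example from the manual.

```shell
# Build a scratch directory holding a file whose name looks like an option.
dir=$(mktemp -d)
printf 'foo\n' > "$dir/-v"
printf 'foo\n' > "$dir/plain"
# '--' ends option parsing, so the '-v' produced by the glob is read
# as a file name rather than as the invert-match option.
match_count=$(cd "$dir" && grep -- 'foo' * | wc -l)
echo "$match_count"
rm -rf "$dir"
```

Without the ‘--’, the same glob would hand grep ‘-v’ as an option and
the result would depend on which files happen to exist.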
|
|
|
|
| |
* doc/grep.in.1: Rename "Matcher Selection" to "Pattern Syntax".
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Problem reported by Martin Simons (Bug#38792).
* doc/grep.texi: Fix quoting used in examples. Say that patterns
should be quoted, use quoting more consistently in examples, and
give an example illustrating the difference between patterns and
globbing. Don’t assume zgrep expertise in example.
* doc/grep.in.1: Likewise. Also, reorder sections
to match GNU/Linux man-pages style.
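
To illustrate the pattern-versus-glob distinction described above, a
small sketch with invented input (not the example added to the manual).
Quoting keeps the shell from expanding the pattern, and the same
characters mean different things to grep than they would to the shell.

```shell
# As a regular expression, '.*\.c$' selects lines ending in ".c".
# Both input lines qualify.
regex_hits=$(printf 'main.c\n*.c\n' | grep '.*\.c$')
# '*.c' looks like a glob, but to grep a leading '*' in a basic
# regular expression is an ordinary character, so only the line
# containing a literal '*' is selected.
star_hits=$(printf 'main.c\n*.c\n' | grep '*.c')
echo "$regex_hits"
echo "$star_hits"
```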
|
|
|
|
| |
* NEWS: Minor wording change.
|
|
|
|
|
| |
* gnulib: update
* tests/init.sh: Sync from gnulib (this removes the LC_ALL=C setting).
|
|
|
|
|
|
|
| |
* tests/grep-dev-null-out: Use a 10-second timeout, rather than
a 1-second one. This avoids false failure on slow systems.
Reported by Assaf Gordon in
https://lists.gnu.org/r/grep-devel/2019-12/msg00018.html
|
| |
|
|
|
|
|
|
| |
* tests/surrogate-pair: Adjust to match fixed behavior
on AIX 7.2, where wchar_t is 16 bits and cannot represent
the test case data.
|
|
|
|
|
|
| |
* tests/backslash-s-vs-invalid-multitype: Rename to...
* tests/backslash-s-vs-invalid-multibyte: ...this.
* tests/Makefile.am (TESTS): Reflect renaming.
|
|
|
|
| |
* cfg.mk (sc_timeout_prereq): New syntax-check rule.
|
|
|
|
|
|
|
|
| |
* tests/mb-non-UTF8-perf-Fw: This test uses "timeout",
so must first call require_timeout_.
This avoids a spurious test failure when no timeout program
is available. Reported by Bruno Haible in
https://lists.gnu.org/r/grep-devel/2019-12/msg00008.html
|
|
|
|
|
|
|
|
|
|
| |
AIX 7.2 /bin/sh’s printf command mishandles octal escapes
in multibyte locales: it treats them as characters, not bytes.
* tests/backslash-s-vs-invalid-multitype, tests/encoding-error:
Use the C locale when employing the printf command with an octal
escape that AIX 7.2 sh might mishandle.
* tests/init.sh (setup_): Use the C locale for tests.
This has the side benefit of making them more reproducible.
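
A minimal sketch of the workaround, assuming od and tr are available.
The two octal escapes are an arbitrary choice for this note, not bytes
taken from the tests.

```shell
# With LC_ALL=C, printf emits each octal escape as a single byte.
# od dumps the bytes in hex; tr strips od's spacing.
bytes=$(LC_ALL=C printf '\303\251' | od -An -tx1 | tr -d ' \n')
echo "$bytes"
```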
|
|
|
|
|
| |
* src/dfasearch.c (possible_backrefs_in_pattern): Remove a
duplicate "a", insert a "be" and a comma, and reformat.
|
|
|
|
|
|
| |
* gnulib: Update submodule to latest.
* bootstrap: Copy from gnulib.
* tests/init.sh: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes some bugs in the previous commit,
and should finish the fix for Bug#33249.
* NEWS: Mention fix for Bug#33249.
* src/dfasearch.c (possible_backrefs_in_pattern, regex_compile)
(GEAcompile): In new code, prefer ptrdiff_t to size_t when either
will do, since ptrdiff_t has better error checking. At some point
we should adjust the old code too.
(possible_backrefs_in_pattern): Rename from
find_backref_in_pattern. New arg BS_SAFE. All uses changed.
Fix false negative if a multibyte character ends in a single
'\\' byte, followed by the two bytes '\\', '1'.
(regex_compile): Simplify.
(GEAcompile): Avoid quadratic behavior when reallocating growing
buffers. Fix a couple of bugs in copying pattern data involving
backreferences. Fix another bug in copying pattern metadata
involving backreferences, by removing the need to copy it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When grep uses regex, it splits a multi-line pattern at newline
characters into fragments, then compiles and executes each fragment
separately, which is slow. With this change, fragments are grouped
by whether they include back references: each fragment with back
references constitutes a group, and all fragments that lack back
references together constitute a single group.
This change greatly speeds up the following case:
$ seq -f '%040g' 0 9999 | sed '1s/$/\\(0\\)\\1/' >pat
$ yes 00000000000000000000000000000000000000000x | head -10000 >in
$ time -p env LC_ALL=C src/grep -f pat in
* src/dfasearch.c (find_backref_in_pattern, regex_compile):
New functions.
(GEAcompile): Use the new functions to group fragments
as mentioned above.
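
As a rough illustration of the classification step, and not the code in
src/dfasearch.c, the sketch below uses grep itself to flag pattern lines
that might contain a back-reference. The three-line pattern file is
invented for this note.

```shell
pat=$(mktemp)
# Line 1 has no backslash, line 2 uses a real back-reference, and
# line 3 is an escaped backslash followed by the digit 1.
printf '%s\n' 'abc' '\(a\)\1' 'x\\1' > "$pat"
# Naive check: a backslash followed by a digit 1-9 anywhere in the line.
possible=$(grep -c '\\[1-9]' "$pat")
echo "$possible of 3 lines might use a back-reference"
rm -f "$pat"
```

The naive check also flags the third line, which has no back-reference,
so a textual scan of this kind can only say that a back-reference is
possible.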
|
|
|
|
| |
* NEWS: Mention Bug#34951.
|
|
|
|
|
|
|
|
|
|
| |
DFAMUST() must be called after parsing and before the token reordering
introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98, but both are
executed in the compilation phase.
* lib/dfa.c (dfaparse): Make it a global function.
(dfacomp): If first argument is NULL, skip parse.
* lib/dfa.h (dfaparse): Add a prototype.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
grep uses its KWset matcher for multiple word matching, but that is
very slow when most of the text matching a pattern is not a whole word.
So, if the first match of a pattern is not a word, use the grep matcher
to match its line.
Note that when START_PTR is set, the grep matcher uses the regex matcher,
which is very slow at matching words. Therefore, we use the grep matcher
only when START_PTR is NULL.
* src/kwsearch.c (Fexecute): If an initial match is incomplete because
not on a word boundary, use the grep matcher to find a matching line.
|
|
|
|
|
| |
* tests/Makefile.am (TESTS): Alphabetize the new addition,
mb-non-UTF8-perf-Fw, to placate syntax-check's sc_sorted_tests.
|
|
|
|
| |
* po/POTFILES.in: Remove lib/xstrtol-error.c.
|
|
|
|
|
|
|
|
| |
Update Gnulib to latest. Also:
* src/dfasearch.c (EGexecute): Use ptrdiff_t, not size_t,
to match new Gnulib API.
* tests/Makefile.am (TESTS): Add dfa-invalid-utf8.
* tests/dfa-invalid-utf8: New file.
|
|
|
|
|
|
| |
* tests/mb-non-UTF8-perf-Fw: New file. Detect v3.3-22-g090a4db's
performance regression.
* tests/Makefile.am (TESTS): Add it.
|
|
|
|
|
| |
* tests/mb-non-UTF8-word-boundary: Also correct "introduced-in"
version number in a comment here.
|
|
|
|
|
| |
* NEWS (Bug fixes): Correction: the -Fw bug was introduced
in 2.28, not in 3.0. Reported by Paul Eggert.
|
|
|
|
|
|
|
| |
* src/searchutils.c (mb_goback): New parameter. All callers changed.
* src/search.h (mb_goback): Update prototype.
* src/kwsearch.c (Fexecute): Use mb_goback's MBCLEN to detect a
word-boundary even more efficiently.
|
|
|
|
|
| |
* src/kwsearch.c (Fexecute): Avoid unnecessary back-up in non-UTF8
multibyte locales.
|
|
|
|
| |
* src/kwsearch.c (Fexecute): Change misleading name: s/bol/nl/
|
| |
|
|
|
|
| |
* src/kwsearch.c (Fexecute): Logic was reversed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example, this command would erroneously print its input line:
echo ab | LC_CTYPE=ja_JP.eucjp grep -Fw b
This arose when the "memrchr" search for a preceding newline failed:
in that case, MB_START was not adjusted and was initially the same
as BEG, so wordchar_prev mistakenly returned 0.
* src/kwsearch.c (Fexecute): Set MB_START also when there is no
preceding newline.
* NEWS (Bug fixes): Mention it.
* tests/mb-non-UTF8-word-boundary: New file. Test for the bug.
* tests/Makefile.am (TESTS): Add it.
Reported by NIDE, Naoyuki in https://bugs.gnu.org/38223.
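
For reference, a sketch of the intended -Fw behavior, run in the C locale
so that it does not depend on ja_JP.eucjp being installed. It shows the
expected result only; it does not reproduce the bug, which needed a
non-UTF8 multibyte locale.

```shell
# 'b' is not a whole word inside "ab", so -w must reject the line.
if echo ab | LC_ALL=C grep -Fwq b; then joined=match; else joined=nomatch; fi
# Here 'b' stands alone, so the line is selected.
if echo 'a b' | LC_ALL=C grep -Fwq b; then separate=match; else separate=nomatch; fi
echo "$joined $separate"
```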
|
|
|
|
| |
* po/POTFILES.in: Add lib/argmatch.h.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Suggested by Karl Berry and mostly implemented by Arnold Robbins
(Bug#37907).
* NEWS:
* doc/grep.in.1:
* doc/grep.texi (Matching Control):
* src/grep.c (usage):
Document the new option.
* src/grep.c (NO_IGNORE_CASE_OPTION): New constant.
(long_options, main): Support new option.
|
|
|
|
|
| |
* src/grep.c (main): Use an int for a local var, rather than
an enum, which is overkill here.
|