| Commit message (Collapse) | Author | Age | Files | Lines |
| |
PCRE2 has a bug when PCRE2_MATCH_INVALID_UTF is used: it
sometimes fails to match patterns that use negated classes
like \W and \D.
* NEWS (Bug fixes): Mention it.
* src/pcresearch.c: Restrict impact of the bug.
Do not use the problematic flag with broken versions of PCRE2.
Also, generate locale tables only for single-byte locales,
as the PCRE2 documentation recommends this.
* tests/Makefile.am (TESTS): Add the file name.
* tests/pcre-utf8-bug224: New file, to test for this.
| |
* src/pcresearch.c (Pcompile): Ignore failure returns
from pcre2_jit_compile.
| |
* src/grep.c: No need to include pcre2.h.
(main) [HAVE_LIBPCRE]: Call Pprint_version instead of
doing it ourselves.
* src/pcresearch.c (Pprint_version): New function.
It also checks belatedly for buffer overflow, and
says "grep -P uses PCRE2" instead of "Built with PCRE".
* tests/version-pcre: Adjust test to match.
| |
PCRE is integral to the functioning of grep's -P option, so it is in our
interest to make it easy to see which version of PCRE grep uses.
* src/grep.c [HAVE_LIBPCRE]: Include <pcre2.h>.
[HAVE_LIBPCRE] (main): Print pcre version info.
* tests/version-pcre: New test for this.
* tests/Makefile.am (TESTS): Add the file name.
* NEWS (Changes in behavior): Mention it.
| |
Our prepass-based fixes for the -P \d bug have caused repeated
further bugs. Avoid the need for a prepass, by using PCRE2_UCP
only if PCRE2_EXTRA_ASCII_BSD is also supported. Since the -P \w
bug was present from grep 2.5 through 3.8 it’s OK if we wait a
little longer to fix it.
* NEWS: Mention this.
* src/pcresearch.c (pcre_pattern_expand_backslash_d): Remove.
Remove its use.
(Pcompile): Use PCRE2_UCP only if PCRE2_EXTRA_ASCII_BSD.
* tests/pcre-ascii-digits, tests/pcre-utf8-w:
Skip tests on older PCRE2 implementations.
| |
* NEWS: Mention \D, too.
* doc/grep.texi: Likewise.
* src/pcresearch.c (pcre_pattern_expand_backslash_d): Handle \D.
Also, ifdef-out this new function and its call site when not needed.
* tests/pcre-ascii-digits: Test \D, too.
Tighten one test by using returns_ 1.
Add comments and tests that work only with 10.43 and newer.
Paul Eggert raised the issue of \D in https://bugs.gnu.org/62267#8
| |
* doc/grep.texi: Document this.
* src/grep.c: Move recent changes into pcresearch.c.
(P_MATCHER_INDEX): Remove.
(pcre_pattern_expand_backslash_d): Move from here ...
* src/pcresearch.c: ... to here.
(PCRE2_EXTRA_ASCII_BSD): Default to 0.
(Pcompile): Use PCRE2_EXTRA_ASCII_BSD if available,
and expand \d to [0-9] otherwise.
| |
Prior to grep-3.9, the PCRE matcher had always treated \d just
like [0-9]. grep-3.9's fix for \w and \b mistakenly relaxed \d
to also match multibyte digits.
* src/grep.c (P_MATCHER_INDEX): Define enum.
(pcre_pattern_expand_backslash_d): New function.
(main): Call it for -P.
* NEWS (Bug fixes): Mention it.
* doc/grep.texi: Document it: with -P, \d matches only ASCII digits.
Provide a PCRE documentation URL and an example of how
to use (?s) with -z.
* tests/pcre-ascii-digits: New test.
* tests/Makefile.am (TESTS): Add that file name.
Reported as https://bugs.gnu.org/62267
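The body of pcre_pattern_expand_backslash_d is not shown in this log.
As a rough illustration of the idea only, the following simplified
sketch rewrites \d to [0-9] in a pattern string.  The name
expand_backslash_d and its bracket handling are assumptions made for
this example, not grep's actual code; in particular it treats every
']' inside a bracket expression as the closing bracket and does not
handle \D.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not grep's code): return a newly allocated
   copy of PATTERN in which each unescaped \d becomes [0-9] outside
   a bracket expression, or 0-9 inside one.  Return NULL if memory
   is exhausted.  The caller frees the result.  */
static char *
expand_backslash_d (char const *pattern)
{
  assert (pattern != NULL);
  size_t len = strlen (pattern);

  /* Worst case: every 2-byte "\d" grows to the 5-byte "[0-9]".  */
  char *out = malloc (len / 2 * 5 + len % 2 + 1);
  if (!out)
    return NULL;

  char *p = out;
  int in_bracket = 0;
  for (size_t i = 0; i < len; i++)
    {
      char c = pattern[i];
      if (c == '\\' && i + 1 < len)
        {
          /* Consume the escaped character together with the
             backslash, so that "\\d" is not mistaken for "\d".  */
          char next = pattern[++i];
          if (next == 'd')
            {
              char const *digits = in_bracket ? "0-9" : "[0-9]";
              size_t n = strlen (digits);
              memcpy (p, digits, n);
              p += n;
            }
          else
            {
              *p++ = c;
              *p++ = next;
            }
          continue;
        }
      if (c == '[' && !in_bracket)
        in_bracket = 1;
      else if (c == ']' && in_bracket)
        in_bracket = 0;
      *p++ = c;
    }
  *p = '\0';
  return out;
}
```

Under these assumptions, expand_backslash_d ("a\\d+") yields "a[0-9]+",
while an escaped backslash followed by d is copied through unchanged.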
| |
The idea is to defend against some adversary-in-the-middle attacks.
Also prefer git.savannah.gnu.org over its shorter alias, git.sv.gnu.org,
to avoid a warning, e.g., from git clone.
Also, drop any final ".git" suffix on the resulting URIs.
Inspired by Paul Eggert's nearly identical changes to coreutils.
Induced by running these commands:
git grep -l 'git clone git:'|xargs perl -pi -e \
's{(git clone) git://(\S+)/([^/]+)\b}{$1 https://$2/git/$3}'
git grep -l git.sv.gn \
|xargs perl -pi -e 's{git\.sv\.gnu}{git\.savannah\.gnu}'
perl -pi -e \
's{(url =) git://(\S+)/([^/.]+)(\.git)?\b}{$1 https://$2/git/$3}'\
.gitmodules
* .gitmodules: As above.
* HACKING: Likewise.
* README-hacking: Likewise.
* src/grep.c (main): Likewise.
| |
It’s obsolete in bleeding-edge Gnulib.
* src/grep.c, tests/get-mb-cur-max.c: Don’t include getprogname.h.
Instead, rely on stdlib.h to declare getprogname.
| |
* src/grep.c: Fix comments.
| |
* src/pcresearch.c (Pcompile): Issue a diagnostic and exit instead
of misbehaving if libpcre2 does not support the requested locale.
| |
Before this change, if grep was linked with a PCRE2 library built
without Unicode support, any invocation of grep -P in a UTF-8 locale
failed with:
grep: this version of PCRE2 does not have Unicode support
* src/pcresearch.c: Check whether Unicode was compiled in.
* tests/pcre-utf8-w: Add check to skip test.
* tests/pcre-utf8: Update check.
| |
This fixes a serious bug affecting word-boundary and word-constituent regular
expressions when the desired match involves non-ASCII UTF-8 characters.
* src/pcresearch.c: Set PCRE2_UCP together with PCRE2_UTF.
* tests/pcre-utf8-w: New file.
* tests/Makefile.am (TESTS): Add it.
* NEWS (Bug fixes): Mention this.
* THANKS.in: Add Gro-Tsen and Karl Pettersson.
Reported by Gro-Tsen https://twitter.com/gro_tsen/status/1610972356972875777
via Karl Pettersson in https://github.com/PCRE2Project/pcre2/issues/185
This bug was present from grep-2.5, when --perl-regexp (-P) support was added.
| |
* src/dfasearch.c (GEAcompile): Don't call "re_set_syntax (syntax_bits)"
just before regex_compile; that function does the same thing already.
| |
* NEWS: Mention this.
* src/dfasearch.c (GEAcompile): Trim trailing newline from
the last pattern, even if it has back-references and follows
a pattern that lacks back-references.
* tests/backref: Add test for this bug.
| |
Prefer the standard C23 ckd_* macros to Gnulib’s *_WRAPV macros.
* bootstrap.conf (gnulib_modules): Add stdckdint.
* src/grep.c, src/kwset.c, src/pcresearch.c:
Include stdckdint.h, and prefer ckd_* to *_WRAPV.
Include intprops.h only if needed.
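For readers unfamiliar with the C23 interface, here is a minimal
sketch of a ckd_mul call.  The helper double_size is hypothetical and
does not appear in grep.  The fallback definition of ckd_mul is an
assumption for compilers that lack <stdckdint.h> but provide the
GCC/Clang __builtin_mul_overflow; grep itself relies on the Gnulib
stdckdint module instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined __has_include
# if __has_include (<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_mul
/* Assumed fallback for pre-C23 GCC or Clang; not part of C23.  */
# define ckd_mul(r, a, b) __builtin_mul_overflow (a, b, r)
#endif

/* Hypothetical helper: double *SIZE.  Return true on success.
   On overflow, return false and leave *SIZE unchanged.  */
static bool
double_size (ptrdiff_t *size)
{
  assert (size != NULL);
  ptrdiff_t doubled;
  /* ckd_mul stores the wrapped product in its first argument and
     returns true if the mathematical result did not fit.  */
  if (ckd_mul (&doubled, *size, 2))
    return false;
  *size = doubled;
  return true;
}
```

The ckd_* macros take the result pointer first, whereas Gnulib's
INT_*_WRAPV macros take it last, so a conversion like this one has to
reorder arguments at each call site.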
| |
* src/pcresearch.c: Include intprops.h.
| |
* bootstrap.conf (gnulib_modules): Add assert-h,
for static_assert.
* src/dfasearch.c (regex_compile): Prefer static_assert to verify.
| |
Gnulib’s stdbool module now provides C23-like semantics,
so there’s no longer any need to include stdbool.h.
* src/die.h, src/grep.h, src/kwset.h: Don’t include stdbool.h.
| |
* src/dfasearch.c (regex_compile): Parenthesize to avoid
this warning:
dfasearch.c:154:43: error: operator '?:' has lower precedence
than '|'; '|' will be evaluated first
[-Werror,-Wbitwise-conditional-parentheses]
| |
Problem reported by Jim Meyering in:
https://lists.gnu.org/r/grep-devel/2022-06/msg00012.html
* src/dfasearch.c (regex_compile): Fix memory leaks when SYNTAX_ONLY.
| |
* src/grep.c (main): Skip past leading backslash of a pattern that
begins with "\-". Inspired by a remark by Bruno Haible in:
https://lists.gnu.org/r/bug-gnulib/2022-06/msg00022.html
| |
This patch closes a longstanding security issue with GREP_COLOR that I
just noticed, where if the attacker has control over GREP_COLOR's
settings the attacker can trash the victim's terminal or have 'grep'
generate misleading output. For example, without the patch
the shell command:
GREP_COLOR="$(printf '31m\33[2J\33[31')" grep --color=always PATTERN
mucks with the screen, leaving behind only the trailing part of
the last matching line. With the patch, this GREP_COLOR is ignored.
* src/grep.c (main): Sanity-check GREP_COLOR contents the same way
GREP_COLORS values are checked, to not trash the user's terminal.
This follows up the recent fix to Bug#55641.
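The exact check is not spelled out in this message.  One plausible
form, sketched here under the assumption that a safe value contains
only ASCII digits and semicolons (the parameter part of an SGR escape
sequence), is the following.  The name sgr_params_ok is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not grep's code): return true if S is
   nonempty and consists only of ASCII digits and semicolons, so
   that it cannot terminate the SGR sequence early or smuggle in a
   second escape sequence.  */
static bool
sgr_params_ok (char const *s)
{
  assert (s != NULL);
  if (*s == '\0')
    return false;
  for (; *s != '\0'; s++)
    if (! (*s == ';' || ('0' <= *s && *s <= '9')))
      return false;
  return true;
}
```

With this rule, a conventional value such as "01;31" is accepted,
while the value from the example above is rejected because it
contains 'm', ESC, '[' and 'J'.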
| |
This is to avoid confusion such as that reported by Cholden in:
https://bugs.gnu.org/55641
* src/grep.c (main): Warn if GREP_COLOR has an effect.
| |
These expressions are not portable and don’t always work as
expected, so warn about them. For example, “grep -E '(+)'”
doesn’t act like “grep '\(\+\)'”.
* src/dfasearch.c (GEAcompile): Warn about a repetition op at the
start of a regular expression or subexpression, except for ‘*’ in
BREs which is portable.
| |
This papers over a problem reported by Benno Schulenberg and
Tomasz Dziendzielski <https://bugs.gnu.org/39678> involving
regular expressions like \a that have unspecified behavior.
* src/dfasearch.c (dfawarn): Just output a warning.
Don’t exit, as DFA_CONFUSING_BRACKETS_ERROR now
does that for us, and we need the ability to warn
without exiting to diagnose \a etc.
(GEAcompile): Use new dfa options DFA_CONFUSING_BRACKETS_ERROR and
DFA_STRAY_BACKSLASH_WARN.
| |
* src/dfasearch.c (dfawarn): Always call dfaerror now,
regardless of POSIXLY_CORRECT.
* tests/warn-char-classes: Omit test of POSIX.1-2008 behavior,
since POSIX.1-2017 allows the GNU behavior.
| |
* src/pcresearch.c (Pcompile): Remove recent workaround for PCRE2
bugs; apparently it’s not needed. This reverts back to where
things were before today. Suggested by Carlo Arenas in:
https://lists.gnu.org/r/grep-devel/2022-03/msg00006.html
| |
Potential problem reported by René Scharfe in:
https://lore.kernel.org/git/99b0adb6-26ba-293c-3a8f-679f59e7cb4d@web.de/T
* src/pcresearch.c (Pcompile): Mimic git grep’s workarounds
for PCRE2 bugs more closely; this is more conservative.
| |
Problem reported by Carlo Arenas in:
https://lists.gnu.org/r/grep-devel/2022-03/msg00004.html
* src/pcresearch.c (Pcompile) [PCRE2_MATCH_INVALID_UTF]:
In PCRE2 10.35 and earlier, disable start optimization if doing a
caseless UTF-8 search.
| |
When calling xpalloc (NULL, &n, incr_min, alloc_max, 1) with
nontrivial ALLOC_MAX, this must hold: N + INCR_MIN <= ALLOC_MAX.
With a very long line, it did not, and grep would mistakenly fail
with a report of "memory exhausted".
* src/grep.c (fillbuf): When using nontrivial ALLOC_MAX, ensure it
is at least N+INCR_MIN.
* tests/fillbuf-long-line: New file, to test for this.
* tests/Makefile.am (TESTS): Add its name.
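The precondition N + INCR_MIN <= ALLOC_MAX can be illustrated with a
small sketch.  The helper safe_alloc_max is hypothetical and is not
the code in fillbuf; it also assumes that a negative ALLOC_MAX means
"no limit", which is how the trivial case is taken to work here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return a value of ALLOC_MAX that is safe to
   pass along with N and INCR_MIN, i.e. one that is either negative
   (assumed to mean no limit) or at least N + INCR_MIN.  If the sum
   would overflow, saturate at PTRDIFF_MAX.  */
static ptrdiff_t
safe_alloc_max (ptrdiff_t n, ptrdiff_t incr_min, ptrdiff_t alloc_max)
{
  assert (0 <= n && 0 <= incr_min);
  if (alloc_max < 0)
    return alloc_max;
  ptrdiff_t needed = (incr_min <= PTRDIFF_MAX - n
                      ? n + incr_min
                      : PTRDIFF_MAX);
  return alloc_max < needed ? needed : alloc_max;
}
```

For example, with N = 100 and INCR_MIN = 10, a requested limit of 50
is raised to 110, while a limit of 500 is returned unchanged.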
| |
The comment was introduced in 500f07fee50ab16a70fe2946b85318020c7f4017 and
relates to absent cleanup code at the end of main(), not the code following
it. It relates to fallible flushing of stdout and related error handling,
but even then it doesn't explain much.
Copyright-paperwork-exempt: yes
| |
* src/grep.c (grep): Implement this.
* tests/binary-file-matches: Add regression test.
| |
* src/pcresearch.c (PCRE2_SIZE_MAX): Default to SIZE_MAX.
| |
* src/pcresearch.c (Pcompile): Free ccontext when no longer needed.
| |
* src/pcresearch.c (Pcompile): Use ximalloc, not xcalloc,
and explicitly initialize the two slots that should be null.
This is more likely to catch future errors if we use valgrind.
| |
* src/pcresearch.c (struct pcre_comp): New member gcontext.
(private_malloc, private_free): New functions.
(jit_exec): It is OK to call pcre2_jit_stack_free (NULL), so simplify.
Use gcontext for allocation. Check for pcre2_jit_stack_create
failure, since sljit bypasses private_malloc. Redo to avoid two
‘continue’s.
(Pcompile): Create and use gcontext.
| |
* src/pcresearch.c (Pcompile): Simplify since ‘die’ cannot return.
| |
* src/pcresearch.c (Pcompile): If available, use
PCRE2_EXTRA_MATCH_LINE instead of doing it by hand.
Simplify construction of substitute regular expression.
| |
* src/pcresearch.c (struct pcre_comp, jit_exec, Pexecute):
Prefer signed to unsigned types when either will do.
(jit_exec): Use INT_MULTIPLY_WRAPV instead of doing it by hand.
(Pexecute): Omit line length limit test that is no longer
needed with PCRE2.
| |
* src/pcresearch.c (bad_utf8_from_pcre2): New function. Fix bug
where PCRE2_ERROR_UTF8_ERR1 was not treated as an encoding error.
Improve performance when PCRE2_MATCH_INVALID_UTF is defined.
(Pexecute): Use it.
| |
* src/pcresearch.c (Pcompile): Improve comments re
pcre2_get_error_message buffer.
| |
* src/pcresearch.c (jit_exec): Remove arbitrary INT_MAX limit on JIT
stack size.
| |
Mostly a bug-for-bug translation of the original code to the PCRE2 API.
Code still could do with some optimizations but should be good as a
starting point.
The API changes the sign of some types, so some ugly casts
were needed; some of the changes are just to make sure all variables
fit into the newer types better.
Includes backward compatibility and could be made to build all the way
to 10.00, but assumes a recent enough version and has been tested with
10.23 (from CentOS 7, the oldest).
Performance seems equivalent, and it also seems functionally complete.
* m4/pcre.m4 (gl_FUNC_PCRE): Check for PCRE2, not the original PCRE.
* src/pcresearch.c (struct pcre_comp, jit_exec)
(Pcompile, Pexecute):
Use PCRE2, not the original PCRE.
* tests/filename-lineno.pl: Adjust to match PCRE2 diagnostics.
| |
Problem reported by Carlo Marcelo Arenas Belón (Bug#51710).
* src/pcresearch.c (jit_exec): Don’t attempt to grow the JIT stack
over INT_MAX - 8 * 1024.
| |
* src/system.h: Update decls to match current Gnulib.
|