delta/grep.git - git.savannah.gnu.org: git/grep.git

	Commit message	Author	Age	Files	Lines
*	grep: prefer signed integers	Paul Eggert	2021-11-14	1	-13/+11
	* src/pcresearch.c (struct pcre_comp, jit_exec, Pexecute): Prefer signed to unsigned types when either will do.
	(jit_exec): Use INT_MULTIPLY_WRAPV instead of doing it by hand.
	(Pexecute): Omit line length limit test that is no longer needed with PCRE2.
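For readers unfamiliar with the idiom, here is a minimal sketch of an overflow-checked multiplication of the kind this entry describes. It is not the code in src/pcresearch.c: `grow_size` is a hypothetical helper, and it uses the GCC/Clang builtin `__builtin_mul_overflow` as a stand-in for Gnulib's `INT_MULTIPLY_WRAPV`, which likewise stores the product and reports whether it overflowed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from grep): double *SIZE unless doing so
   would overflow ptrdiff_t or exceed MAX.  Return true on success,
   false if the size cannot grow; *SIZE is unchanged on failure.  */
static bool
grow_size (ptrdiff_t *size, ptrdiff_t max)
{
  ptrdiff_t doubled;
  /* __builtin_mul_overflow (GCC and Clang) returns true on overflow.  */
  if (__builtin_mul_overflow (*size, 2, &doubled) || doubled > max)
    return false;
  *size = doubled;
  return true;
}
```

Letting the compiler or library detect the overflow avoids the hand-written "divide the limit first" comparisons that are easy to get subtly wrong.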
*	grep: speed up, fix bad-UTF8 check with -P	Paul Eggert	2021-11-14	1	-2/+14
	* src/pcresearch.c (bad_utf8_from_pcre2): New function. Fix bug where PCRE2_ERROR_UTF8_ERR1 was not treated as an encoding error. Improve performance when PCRE2_MATCH_INVALID_UTF is defined.
	(Pexecute): Use it.
*	grep: improve pcre2_get_error_message comments	Paul Eggert	2021-11-14	1	-2/+3
	* src/pcresearch.c (Pcompile): Improve comments re pcre2_get_error_message buffer.
*	grep: Don’t limit jitstack_max to INT_MAX	Paul Eggert	2021-11-14	1	-1/+7
	* src/pcresearch.c (jit_exec): Remove arbitrary INT_MAX limit on JIT stack size.
*	maint: minor rewording and reindenting	Paul Eggert	2021-11-14	5	-29/+33
*	grep: migrate to pcre2	Carlo Marcelo Arenas Belón	2021-11-14	5	-146/+138
	Mostly a bug by bug translation of the original code to the PCRE2 API. Code still could do with some optimizations but should be good as a starting point.
	The API changes the sign of some types and therefore some ugly casts were needed, some of the changes are just to make sure all variables fit into the newer types better.
	Includes backward compatibility and could be made to build all the way to 10.00, but assumes a recent enough version and has been tested with 10.23 (from CentOS 7, the oldest).
	Performance seems equivalent, and it also seems functionally complete.
	* m4/pcre.m4 (gl_FUNC_PCRE): Check for PCRE2, not the original PCRE.
	* src/pcresearch.c (struct pcre_comp, jit_exec) (Pcompile, Pexecute): Use PCRE2, not the original PCRE.
	* tests/filename-lineno.pl: Adjust to match PCRE2 diagnostics.
*	maint: update README-prereq for Gperf, Rsync, Wget	Paul Eggert	2021-11-11	1	-1/+3
*	tests: fix pcre test typo	Paul Eggert	2021-11-10	1	-2/+4
	* tests/pcre-context: Initialize ‘fail’ earlier.
*	tests: fix test logic for pcre-context	Carlo Marcelo Arenas Belón	2021-11-10	1	-18/+22
	Included in the original bug #20957, but corrupted somehow in transit as the required NUL characters are missing.
	Add a simpler version of the test case that uses plain characters and match the -z data and output to show the equivalence.
	Note the output is still not correct as it is missing the expected LF characters, but a full fix will have to wait until PCRE2.
	Fixes Bug#51735.
*	grep: work around PCRE bug	Paul Eggert	2021-11-09	1	-1/+4
	Problem reported by Carlo Marcelo Arenas Belón (Bug#51710).
	* src/pcresearch.c (jit_exec): Don’t attempt to grow the JIT stack over INT_MAX - 8 * 1024.
*	build: update gnulib submodule to latest	Paul Eggert	2021-11-09	1	-0/+0
*	maint: modernize README-{hacking,prereq}	Paul Eggert	2021-10-30	2	-71/+82
*	build: update gnulib submodule to latest	Paul Eggert	2021-10-28	1	-0/+0
*	doc: document interval expression limitations	Paul Eggert	2021-08-27	1	-1/+14
	* doc/grep.texi (Basic vs Extended, Performance): Document limitations of interval expressions (Bug#44538).
*	build: update gnulib submodule to latest	Paul Eggert	2021-08-27	2	-2/+2
	* src/system.h: Update decls to match current Gnulib.
*	grep: prefer signed to unsigned integers	Paul Eggert	2021-08-25	10	-277/+294
	This improves runtime checking for integer overflow when compiling with gcc -fsanitize=undefined and the like. It also avoids the need for some integer casts, which can be error-prone.
	* bootstrap.conf (gnulib_modules): Add idx.
	* src/dfasearch.c (struct dfa_comp, kwsmusts): (possible_backrefs_in_pattern, regex_compile, GEAcompile) (EGexecute):
	* src/grep.c (struct patloc, patlocs_allocated, patlocs_used) (n_patterns, update_patterns, pattern_file_name, poison_len) (asan_poison, fwrite_errno, compile_fp_t, execute_fp_t) (buf_has_encoding_errors, buf_has_nulls, file_must_have_nulls) (bufalloc, pagesize, all_zeros, fillbuf, nlscan) (print_line_head, print_line_middle, print_line_tail, grepbuf) (grep, contains_encoding_error, fgrep_icase_available) (fgrep_icase_charlen, fgrep_to_grep_pattern, try_fgrep_pattern) (main):
	* src/kwsearch.c (struct kwsearch, Fcompile, Fexecute):
	* src/kwset.c (struct trie, struct kwset, kwsalloc, kwsincr) (kwswords, treefails, memchr_kwset, acexec_trans, kwsexec) (treedelta, kwsprep, bm_delta2_search, bmexec_trans, bmexec) (acexec):
	* src/kwset.h (struct kwsmatch):
	* src/pcresearch.c (Pcompile, Pexecute):
	* src/search.h (mb_clen):
	* src/searchutils.c (kwsinit, mb_goback, wordchars_count) (wordchars_size, wordchar_next, wordchar_prev): Prefer idx_t to size_t or ptrdiff_t for nonnegative sizes, and prefer ptrdiff_t to size_t for sizes plus error values.
	* src/grep.c (uword_size): New constant, used for signed size calculations.
	(totalnl, add_count, totalcc, print_offset, print_line_head, grep): Prefer intmax_t to uintmax_t for wide integer calculations.
	(fgrep_icase_charlen): Prefer ptrdiff_t to int for size offsets.
	* src/grep.h: Include idx.h.
	* src/search.h (imbrlen): New function, like mbrlen except with idx_t and ptrdiff_t.
*	grep: scan back thru UTF-8 a bit faster	Paul Eggert	2021-08-24	1	-6/+13
	* src/searchutils.c (mb_goback): When scanning backward through UTF-8, check the length implied by the putative byte 1 before bothering to invoke mb_clen. This length check also lets us use mbrlen directly rather than calling mb_clen, which would eventually defer to mbrlen anyway.
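The "length implied by the putative byte 1" check can be illustrated with a simplified sketch. This is not grep's mb_goback: the names `utf8_implied_len` and `utf8_char_start` are hypothetical, the code handles UTF-8 only, and it does not fully validate the sequence the way a call to mbrlen would.

```c
#include <stddef.h>

/* Length implied by a putative UTF-8 lead byte B, or 0 if B cannot
   start a character (continuation byte or invalid lead).  */
static int
utf8_implied_len (unsigned char b)
{
  if (b < 0x80)
    return 1;
  if (0xC2 <= b && b <= 0xDF)
    return 2;
  if (0xE0 <= b && b <= 0xEF)
    return 3;
  if (0xF0 <= b && b <= 0xF4)
    return 4;
  return 0;
}

/* Return the start of the character containing *CUR, scanning back
   no further than BUF.  If no plausible lead byte covers CUR,
   return CUR itself, treating the byte as a stray.  */
static const unsigned char *
utf8_char_start (const unsigned char *buf, const unsigned char *cur)
{
  const unsigned char *p = cur;
  /* Step back over at most 3 continuation bytes (10xxxxxx).  */
  while (buf < p && cur - p < 3 && (*p & 0xC0) == 0x80)
    p--;
  /* Accept P only if the length its byte implies reaches CUR.  */
  if (cur - p < utf8_implied_len (*p))
    return p;
  return cur;
}
```

The cheap length test rejects most impossible candidates before any locale-aware decoding function has to be called.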
*	grep: tweak mb_goback performance	Paul Eggert	2021-08-24	1	-5/+11
	* src/searchutils.c (mb_goback): Set *MBCLEN only in non-UTF-8 encodings, since that’s the only time it’s needed, and this lets us see more clearly that the UTF-8 clen value is not useful to the caller.
*	grep: tweak wordchar_prev performance	Paul Eggert	2021-08-24	1	-2/+1
	* src/searchutils.c (wordchar_prev): Tweak performance by using a value already in a local variable rather than consulting a table.
*	grep: tweak mb_goback and comment it better	Paul Eggert	2021-08-24	1	-13/+30
	* src/searchutils.c (mb_goback): Improve the comment to better describe this confusing function. And remove an unnecessary test of cur vs end.
*	grep: omit unused maxd member	Paul Eggert	2021-08-24	1	-4/+0
	* src/kwset.c (struct kwset.maxd): Remove. All uses removed.
*	grep: avoid some size_t casts	Paul Eggert	2021-08-24	3	-10/+10
	This helps move the code away from unsigned types.
	* src/grep.c (buf_has_encoding_errors, contains_encoding_error):
	* src/searchutils.c (mb_goback): Compare to MB_LEN_MAX, not to (size_t) -2. This is a bit safer anyway, as grep relies on MB_LEN_MAX limits elsewhere.
	* src/search.h (mb_clen): Compare to -2 before converting to size_t.
*	tests: mb-non-UTF8-perf-Fw: use head rather than sed	Jim Meyering	2021-08-22	2	-2/+2
	* tests/mb-non-UTF8-perf-Fw: Use head -n 10000000 rather than the work-alike sed command. This provides a 4x speedup and saves 0.5s.
	* tests/null-byte: Likewise.
*	grep: avoid sticky problem with ‘-f - -f -’	Paul Eggert	2021-08-21	1	-6/+11
	Inspired by bug#50129 even though this is a different bug.
	* src/grep.c (main): For ‘-f -’, use clearerr (stdin) after reading, so that ‘grep -f - -f -’ reads stdin twice even when stdin is a tty. Also, for ‘-f FILE’, report any I/O error when closing FILE.
*	tests: port mb-non-UTF8-perf-Fw to strict POSIX	Paul Eggert	2021-08-18	1	-1/+1
	* tests/mb-non-UTF8-perf-Fw: Prefer ‘sed 10q’ to ‘head -10’, which doesn’t conform to POSIX.
*	grep: djb2 correction	Paul Eggert	2021-08-18	1	-1/+9
	Problem reported by Alex Murray (bug#50093).
	* src/grep.c (hash_pattern): Use a nonzero initial value.
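A minimal sketch of why the initial value matters for a djb2-style hash. This is not grep's hash_pattern: the function `djb2` and its `init` parameter are hypothetical, and 5381 is the conventional djb2 seed rather than a value this log states.

```c
#include <stddef.h>

/* djb2-style hash of the LEN bytes at S, starting from INIT.
   Hypothetical illustration, not grep's hash_pattern.  */
static size_t
djb2 (const char *s, size_t len, size_t init)
{
  size_t h = init;
  for (size_t i = 0; i < len; i++)
    h = h * 33 + (unsigned char) s[i];
  return h;
}
```

With `init` equal to 0, leading NUL bytes leave the hash at 0, so `djb2 ("\0\0a", 3, 0)` equals `djb2 ("a", 1, 0)`. With a nonzero seed such as 5381, each leading byte changes the running value and the two inputs hash differently.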
*	doc: modernize portability advice	Paul Eggert	2021-08-16	1	-20/+5
	* doc/grep.texi (General Output Control, Basic vs Extended): No need to complicate the portability advice by talking about 7th edition grep, since it’s no longer a practical porting target. Instead, mention only Solaris 10 grep, the last practical holdout of somewhat-traditional grep.
*	egrep, fgrep: now obsolete	Paul Eggert	2021-08-16	8	-39/+38
	* NEWS: Mention this (see bug#49996).
	* doc/Makefile.am (egrep.1 fgrep.1): Remove. All uses removed.
	* doc/grep.in.1, doc/grep.texi (grep Programs): Remove documentation for egrep, fgrep.
	* doc/grep.texi (Usage): Add FAQ for egrep and fgrep.
	* src/Makefile.am (shell_does_substrings): Substitute for ${0##*/}, not for ${0%/*} (which was not being used anyway).
	* src/egrep.sh: Issue an obsolescence warning.
	* tests/fedora: Use "grep -F" instead of "fgrep" in diagnostics, as this tests "grep -F" not "fgrep".
*	doc: update cites and authors	Paul Eggert	2021-08-14	4	-30/+38
*	maint: post-release administrivia	Jim Meyering	2021-08-14	3	-2/+5
	* NEWS: Add header line for next release.
	* .prev-version: Record previous version.
	* cfg.mk (old_NEWS_hash): Auto-update.
*	version 3.7 (tag: v3.7)	Jim Meyering	2021-08-14	1	-1/+1
	* NEWS: Record release date.
*	tests: provide an awk-based seq replacement	Jim Meyering	2021-08-09	3	-6/+17
	...so we can continue to use seq, but the wrapper when needed.
	* tests/init.cfg (seq): Some systems lack seq. Provide a replacement.
	* tests/hash-collision-perf: Use seq once again.
	* tests/long-pattern-perf: Likewise. And remove a comment about seq.
*	grep: simplify EGexecute	Paul Eggert	2021-08-09	1	-2/+1
	* src/dfasearch.c (EGexecute): Remove a label and goto. This also makes the machine code a bit shorter, on x86-64 gcc.
*	grep: simplify data movement slightly	Paul Eggert	2021-08-09	1	-11/+5
	* src/grep.c (fillbuf): Simplify movement of saved data.
*	grep: pointer-integer cast nit	Paul Eggert	2021-08-09	1	-2/+2
	* src/grep.c (ALIGN_TO): When converting pointers to unsigned integers, convert to uintptr_t not size_t, as size_t in theory might be too narrow.
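A minimal sketch of pointer alignment that converts through uintptr_t, as this entry recommends. It is an illustration only: `align_up` is a hypothetical function, not grep's ALIGN_TO macro, and it assumes ALIGNMENT is nonzero and that the caller's buffer has room for the adjusted pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Return P advanced to the next address that is a multiple of
   ALIGNMENT, or P itself if it is already aligned.  The pointer is
   converted to uintptr_t, which is defined to hold any object
   pointer value, rather than to size_t, which need only hold
   object sizes.  */
static char *
align_up (char *p, size_t alignment)
{
  size_t rem = (uintptr_t) p % alignment;
  return rem == 0 ? p : p + (alignment - rem);
}
```

Doing the arithmetic on the `char *` itself, and using the integer only to compute the remainder, keeps the result a pointer derived from the original object.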
*	tests: use awk, not seq	Paul Eggert	2021-08-09	2	-5/+6
	Portability problem reported by Dagobert Michelsen in: https://lists.gnu.org/r/grep-devel/2021-08/msg00004.html
	* tests/hash-collision-perf, tests/long-pattern-perf: Don’t assume seq is installed; use awk instead.
*	build: update gnulib to latest	Jim Meyering	2021-08-08	1	-0/+0
*	build: update gnulib to latest	Jim Meyering	2021-08-08	2	-5/+20
*	doc: usage: --group-separator/--no-group-separator	Kevin Locke	2021-08-06	1	-0/+2
	* src/grep.c (usage): Document --group-separator and --no-group-separator.
*	doc: man: add --group-separator/--no-group-separator	Kevin Locke	2021-08-06	1	-0/+20
	* doc/grep.in.1: Add copy of docs for --group-separator from doc/grep.texi. Add copy of docs for --no-group-separator from doc/grep.texi.
*	build: update gnulib to latest	Jim Meyering	2021-08-06	1	-0/+0
*	doc: note that -H is a GNU extension in man page, too	Mateusz Okulus	2021-06-19	1	-0/+1
	* doc/grep.in.1 (-H): Mention that this is a GNU extension.
*	build: update gnulib submodule to latest	Paul Eggert	2021-06-13	1	-0/+0
*	build: update gnulib submodule to latest	Paul Eggert	2021-06-11	1	-0/+0
*	doc: improve examples and wording	Paul Eggert	2021-06-10	1	-9/+8
	* doc/grep.texi (The Backslash Character and Special Expressions) (Usage): Improve doc (Bug#48948).
*	doc: man: fix -L description and improve -l's	Jim Meyering	2021-01-31	2	-4/+2
	* doc/grep.texi (-L): Remove erroneous sentence about stopping early. With -L, grep cannot stop scanning early.
	(-l): Tweak existing wording.
	* doc/grep.in.1: Remove the -L sentence here, too.
	(-l): Copy the sentence from grep.texi, to clarify: it's only per-file scanning that stops upon match.
	Reported by Robert Bruntz in http://debbugs.gnu.org/46179
*	build: avoid long-string warnings in gnulib tests	Jim Meyering	2021-01-05	1	-0/+7
	* configure.ac (GNULIB_TEST_WARN_CFLAGS): Add -Woverlength-strings to avoid clang warnings.
*	doc: further clarify regexp structure	Paul Eggert	2021-01-01	1	-19/+45
	* doc/grep.texi (Fundamental Structure) (Back-references and Subexpressions, Basic vs Extended): Further clarifications.
*	maint: copy bootstrap, tests/init.sh from Gnulib	Paul Eggert	2021-01-01	2	-4/+18
*	doc: update grep.texi cite to 2021	Paul Eggert	2021-01-01	1	-1/+1