author | Norihiro Tanaka <noritnk@kcn.ne.jp> | 2019-12-22 16:39:09 -0800 |
---|---|---|
committer | Paul Eggert <eggert@cs.ucla.edu> | 2019-12-22 16:40:08 -0800 |
commit | abb7f4f2325f26f930ff59b702fe42568a8e81e7 (patch) | |
tree | 734e40a3a595c4f491e102b5883fb8829cc2984a /bootstrap | |
parent | cf09252295c554dd3eba4cdb8eb53911fb495f40 (diff) | |
download | grep-abb7f4f2325f26f930ff59b702fe42568a8e81e7.tar.gz |
grep: grouping of a pattern with multiple lines
When grep uses regex, it splits a multi-line pattern at each newline
character into fragments, then compiles and executes every fragment
separately. That causes a slowdown. With this change, fragments are
divided into groups according to whether they include back references.
Each fragment with back references constitutes a group of its own, and
all fragments that lack back references together constitute one group.
This change greatly speeds up the following case:
$ seq -f '%040g' 0 9999 | sed '1s/$/\\(0\\)\\1/' >pat
$ yes 00000000000000000000000000000000000000000x | head -10000 >in
$ time -p env LC_ALL=C src/grep -f pat in
* src/dfasearch.c (find_backref_in_pattern, regex_compile):
New functions.
(GEAcompile): Use the new functions to group fragments
as mentioned above.
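
The grouping described in the message can be sketched roughly as follows. This is a minimal illustration, not the code in src/dfasearch.c. The names fragment_has_backref, group_fragments and compilations_needed are hypothetical, and the sketch assumes that a fragment with a back reference is compiled on its own while all remaining fragments share one compilation. The back-reference test is deliberately conservative: it looks only for a backslash followed by a digit 1-9 and does not parse bracket expressions or multibyte characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not grep's find_backref_in_pattern): report
   whether FRAG[0..LEN-1] contains a backslash followed by a digit 1-9.
   The character after any backslash is skipped, so an escaped
   backslash followed by a digit is not misread as a back reference.  */
static bool
fragment_has_backref (const char *frag, size_t len)
{
  for (size_t i = 0; i + 1 < len; i++)
    if (frag[i] == '\\')
      {
        if ('1' <= frag[i + 1] && frag[i + 1] <= '9')
          return true;
        i++;  /* Skip the escaped character.  */
      }
  return false;
}

/* Counts produced by the grouping step.  */
struct grouping
{
  size_t backref_groups;   /* One group per fragment with a back reference.  */
  size_t plain_fragments;  /* Fragments merged into one shared group.  */
};

/* Split PATTERN at newlines and classify each fragment.  */
static struct grouping
group_fragments (const char *pattern)
{
  struct grouping g = { 0, 0 };
  assert (pattern != NULL);

  for (const char *p = pattern; ; )
    {
      const char *nl = strchr (p, '\n');
      size_t len = nl ? (size_t) (nl - p) : strlen (p);

      if (fragment_has_backref (p, len))
        g.backref_groups++;
      else
        g.plain_fragments++;

      if (!nl)
        break;
      p = nl + 1;
    }
  return g;
}

/* Number of regex compilations needed after grouping: one per
   back-reference fragment, plus one for all plain fragments.  */
static size_t
compilations_needed (struct grouping g)
{
  return g.backref_groups + (g.plain_fragments > 0 ? 1 : 0);
}
```

Under these assumptions, the 10000-line pat file from the example above has one fragment with a back reference and 9999 without, so the sketch counts two compilations where the ungrouped approach needs 10000.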
Diffstat (limited to 'bootstrap')
0 files changed, 0 insertions, 0 deletions