author Norihiro Tanaka <noritnk@kcn.ne.jp> 2019-12-22 16:39:09 -0800
committer Paul Eggert <eggert@cs.ucla.edu> 2019-12-22 16:40:08 -0800
commit abb7f4f2325f26f930ff59b702fe42568a8e81e7
tree 734e40a3a595c4f491e102b5883fb8829cc2984a /bootstrap
parent cf09252295c554dd3eba4cdb8eb53911fb495f40
grep: grouping of a pattern with multiple lines
When grep uses regex, it splits a multi-line pattern at each newline character into fragments, then compiles and executes every fragment separately. That causes a slowdown.

With this change, fragments are divided into groups according to whether they include back-references. A fragment with back-references constitutes a group, and all fragments that lack back-references together constitute another group.

This change greatly speeds up the following case.

  $ seq -f '%040g' 0 9999 | sed '1s/$/\\(0\\)\\1/' >pat
  $ yes 00000000000000000000000000000000000000000x | head -10000 >in
  $ time -p env LC_ALL=C src/grep -f pat in

* src/dfasearch.c (find_backref_in_pattern, regex_compile): New functions.
(GEAcompile): Use the new functions to group fragments as mentioned above.
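The grouping idea can be illustrated with a minimal Python sketch. It is not the C code in src/dfasearch.c: the names `has_backref` and `group_fragments` are hypothetical, the back-reference test (a backslash followed by a digit 1-9) is a simplifying assumption, and the sketch assumes that each fragment with a back-reference becomes its own group while all remaining fragments are joined into one.

```python
def has_backref(fragment):
    """Return True if the fragment may contain a back-reference.

    Assumption: a back-reference is an unescaped backslash followed by
    a digit 1-9. The real check in grep may differ.
    """
    i = 0
    while i < len(fragment) - 1:
        if fragment[i] == "\\":
            if fragment[i + 1] in "123456789":
                return True
            i += 2  # skip the escaped character, e.g. "\\(" or "\\\\"
        else:
            i += 1
    return False


def group_fragments(pattern):
    """Split a multi-line pattern and return the groups to compile.

    All fragments without back-references are joined (by newline) into
    a single group; each fragment with a back-reference stays separate.
    """
    plain = []
    with_backref = []
    for fragment in pattern.split("\n"):
        if has_backref(fragment):
            with_backref.append(fragment)
        else:
            plain.append(fragment)
    groups = []
    if plain:
        groups.append("\n".join(plain))
    groups.extend(with_backref)
    return groups


# Pattern list shaped like the benchmark's "pat" file: 10000 fragments,
# only the first of which carries a back-reference.
fragments = ["%040d" % n for n in range(10000)]
fragments[0] += "\\(0\\)\\1"
groups = group_fragments("\n".join(fragments))
print(len(groups))
```

Under these assumptions the 10000 fragments of the benchmark collapse into two groups, so the number of separate compilations no longer grows with the number of back-reference-free lines.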
Diffstat (limited to 'bootstrap')
0 files changed, 0 insertions, 0 deletions