diff options
author | Jim Meyering <meyering@fb.com> | 2020-11-25 16:49:51 -0800 |
---|---|---|
committer | Jim Meyering <meyering@fb.com> | 2020-11-26 10:32:00 -0800 |
commit | 192e59903c7d313bb47de3d5c15b3dc634e98c5f (patch) | |
tree | 94f500ed106d001dcbe89d6aed7da086bde2708b /NEWS | |
parent | 9095818f2669ceed97c41d72c5269f60cbfa9260 (diff) | |
download | grep-192e59903c7d313bb47de3d5c15b3dc634e98c5f.tar.gz |
grep: avoid performance regression with many patterns
* src/grep.c (hash_pattern): Switch from PJW to DJB2, to avoid an
O(N) to O(N^2) performance regression due to hash collisions with
patterns from e.g., seq 500000|tr 0-9 A-J
Reported by Frank Heckenbach in https://bugs.gnu.org/44754
* NEWS (Bug fixes): Mention it.
* tests/hash-collision-perf: New file.
* tests/Makefile.am (TESTS): Add it.
Diffstat (limited to 'NEWS')
-rw-r--r-- | NEWS | 7 |
1 files changed, 7 insertions, 0 deletions
@@ -2,6 +2,13 @@ GNU grep NEWS -*- outline -*- * Noteworthy changes in release ?.? (????-??-??) [?] +** Bug fixes + + Preprocessing N patterns would take at least O(N^2) time when too many + patterns hashed to too few buckets. This now takes seconds, not days: + : | grep -Ff <(seq 6400000 | tr 0-9 A-J) + [Bug#44754 introduced in grep 3.5] + * Noteworthy changes in release 3.6 (2020-11-08) [stable] |