author Jim Meyering <meyering@fb.com> 2018-12-20 20:04:53 -0800
committer Jim Meyering <meyering@fb.com> 2018-12-20 20:26:16 -0800
commit 3d6a2841442c049ef999b8cd755a7e29e7ae4498
tree 9339b522c3aa71648fda91d3ccdc445889652765
parent 3c0a36e514237132db711bfef57a74c64592c4e2
grep: fix \b DFA-bug in C locale
Under some conditions, \b would mistakenly fail to match, e.g.,
  echo 123-x|LC_ALL=C grep '.\bx'
* NEWS (Bug fixes): Mention it
* gnulib: Update to latest, for DFA regression fix.
* tests/word-delim-multibyte: Add a test for the dfa.c regression.
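The reproducer in the commit message can be wrapped in a small probe to check whether an installed grep is affected. This is only a sketch: `probe` is a hypothetical helper name, not part of grep or its test suite, and the en_US.UTF-8 locale is assumed to be installed (if it is not, grep typically falls back to the C locale silently, so both rows would exercise the same code path).

```shell
#!/bin/sh
# probe LOCALE PATTERN (hypothetical helper, for illustration only)
# Prints "ok" if PATTERN matches the line "123-x" under LOCALE, else "FAIL".
probe() {
  if printf '123-x\n' | LC_ALL="$1" grep "$2" >/dev/null 2>&1; then
    echo ok
  else
    echo FAIL
  fi
}

# '.\bx' should match: '-' is a non-word character and 'x' is a word
# character, so there is a word boundary between them.
for locale in C en_US.UTF-8; do
  echo "$locale: $(probe "$locale" '.\bx')"
done
```

On an affected grep, the C row is expected to report FAIL; the commit message attributes the failure to the DFA matcher in the C locale.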
-rw-r--r--  NEWS                        10
m---------  gnulib                       0
-rwxr-xr-x  tests/word-delim-multibyte  10
3 files changed, 20 insertions, 0 deletions
diff --git a/NEWS b/NEWS
index 8b332aaf..7d0a3d56 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,16 @@ GNU grep NEWS -*- outline -*-
* Noteworthy changes in release ?.? (????-??-??) [?]
+** Bug fixes
+
+  Some uses of \b in the C locale and with the DFA matcher would fail, e.g.,
+  the following would print nothing (it should print the input line):
+    echo 123-x|LC_ALL=C grep '.\bx'
+  Using a multibyte locale, using certain regexp constructs (some ranges,
+  backreferences), or forcing use of the PCRE matcher via --perl-regexp (-P)
+  would avoid the bug.
+  [bug introduced in grep 3.2]
+
* Noteworthy changes in release 3.2 (2018-12-20) [stable]
diff --git a/gnulib b/gnulib
-Subproject b823b5dc51510ba59dd0d00d677070b9a2e1180
+Subproject 5d6a3cdd5c312e77a6d0f0848e3cb79a52e0865
diff --git a/tests/word-delim-multibyte b/tests/word-delim-multibyte
index 0bc0e333..7d2c433c 100755
--- a/tests/word-delim-multibyte
+++ b/tests/word-delim-multibyte
@@ -24,4 +24,14 @@ grep -w "$e_acute" in > out 2>err || fail=1
compare out in || fail=1
compare /dev/null err || fail=1
+# Also ensure that this works in both the C locale and that multibyte one.
+# In the C locale, it failed due to a dfa.c regression in grep-3.2.
+echo 123-x > in || framework_failure_
+
+for locale in C en_US.UTF-8; do
+  LC_ALL=$locale grep '.\bx' in > out 2>err || fail=1
+  compare out in || fail=1
+  compare /dev/null err || fail=1
+done
+
Exit $fail