author: Jim Meyering <meyering@fb.com> 2023-03-18 08:28:36 -0700
committer: Jim Meyering <meyering@meta.com> 2023-03-18 17:08:09 -0700
commit: c83ffc197ec483c6f44f907346f34127ec044ef0
tree: d3b01f6a00fe5a9573f596e45c4f5ad8b8a856b5 /NEWS
parent: 7979ea7ddbf83f3203d53b6351c3717ce0af91c4
grep: -P (--perl-regexp) \d: match only ASCII digits

Prior to grep-3.9, the PCRE matcher had always treated \d just
like [0-9]. grep-3.9's fix for \w and \b mistakenly relaxed \d
to also match multibyte digits.
* src/grep.c (P_MATCHER_INDEX): Define enum.
(pcre_pattern_expand_backslash_d): New function.
(main): Call it for -P.
* NEWS (Bug fixes): Mention it.
* doc/grep.texi: Document it: with -P, \d matches only ASCII digits.
Provide a PCRE documentation URL and an example of how to use
(?s) with -z.
* tests/pcre-ascii-digits: New test.
* tests/Makefile.am (TESTS): Add that file name.
Reported as https://bugs.gnu.org/62267

Diffstat (limited to 'NEWS')
-rw-r--r-- NEWS 10
1 files changed, 10 insertions, 0 deletions
diff --git a/NEWS b/NEWS
index 803e14b3..a24cebd8 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,16 @@ GNU grep NEWS -*- outline -*-
* Noteworthy changes in release ?.? (????-??-??) [?]
+** Bug fixes
+
+ With -P, \d now matches only ASCII digits, regardless of PCRE
+ options/modes. The changes in grep-3.9 to make \b and \w work
+ properly had the undesirable side effect of making \d also match
+ e.g., the Arabic digits: ٠١٢٣٤٥٦٧٨٩. With grep-3.9, -P '\d+'
+ would match that ten-digit (20-byte) string. Now, to match such
+ a digit, you would use \p{Nd}.
+ [bug introduced in grep 3.9]
+
* Noteworthy changes in release 3.9 (2023-03-05) [stable]
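
The distinction the NEWS entry draws, between \d as [0-9] and \d as "any Unicode decimal digit" (\p{Nd}), can be illustrated outside grep. The sketch below uses Python's standard re module purely as an analogy: it is not grep's or PCRE's implementation, and the helper names are hypothetical. It assumes only that Python's \d matches Unicode decimal digits by default and ASCII digits alone under re.ASCII. It also checks the "ten-digit (20-byte)" claim for the UTF-8 encoding of the Arabic-Indic digits U+0660 through U+0669.

```python
import re
import unicodedata

# The ten Arabic-Indic digits, U+0660..U+0669, built from code points
# so the example does not depend on how this file is encoded.
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))


def is_ascii_digit_run(text):
    """Analogue of the fixed behavior: \\d restricted to [0-9]."""
    return re.fullmatch(r"\d+", text, flags=re.ASCII) is not None


def is_unicode_digit_run(text):
    """Analogue of the grep-3.9 behavior: \\d matches any decimal digit."""
    return re.fullmatch(r"\d+", text) is not None


def is_nd_run(text):
    """Rough stand-in for \\p{Nd}+: every character has category Nd."""
    return bool(text) and all(unicodedata.category(ch) == "Nd" for ch in text)


if __name__ == "__main__":
    print(len(ARABIC_INDIC_DIGITS))                   # number of characters
    print(len(ARABIC_INDIC_DIGITS.encode("utf-8")))   # number of UTF-8 bytes
    print(is_ascii_digit_run(ARABIC_INDIC_DIGITS))
    print(is_unicode_digit_run(ARABIC_INDIC_DIGITS))
    print(is_nd_run(ARABIC_INDIC_DIGITS))
```

Restricting \d to ASCII keeps patterns such as '\d+' predictable for callers that go on to parse the matched text as a number, while \p{Nd} remains available when matching digits from other scripts is intended.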