author | Jim Meyering <meyering@fb.com> | 2023-03-18 08:28:36 -0700 |
---|---|---|
committer | Jim Meyering <meyering@meta.com> | 2023-03-18 17:08:09 -0700 |
commit | c83ffc197ec483c6f44f907346f34127ec044ef0 (patch) | |
tree | d3b01f6a00fe5a9573f596e45c4f5ad8b8a856b5 /NEWS | |
parent | 7979ea7ddbf83f3203d53b6351c3717ce0af91c4 (diff) | |
grep: -P (--perl-regexp) \d: match only ASCII digits

Prior to grep-3.9, the PCRE matcher had always treated \d just
like [0-9]. grep-3.9's fix for \w and \b mistakenly relaxed \d
to also match multibyte digits.

* src/grep.c (P_MATCHER_INDEX): Define enum.
(pcre_pattern_expand_backslash_d): New function.
(main): Call it for -P.
* NEWS (Bug fixes): Mention it.
* doc/grep.texi: Document it: with -P, \d matches only ASCII digits.
Provide a PCRE documentation URL and an example of how
to use (?s) with -z.
* tests/pcre-ascii-digits: New test.
* tests/Makefile.am (TESTS): Add that file name.
Reported as https://bugs.gnu.org/62267
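
The name pcre_pattern_expand_backslash_d suggests that the fix rewrites \d in the user's pattern before it reaches PCRE. The following is a minimal sketch of that idea in Python, not the C code in src/grep.c: the function name `expand_backslash_d` and its simplified bracket handling are assumptions made for illustration. It ignores corner cases such as a literal `]` at the start of a bracket expression, POSIX classes like `[[:alpha:]]`, and `\Q...\E` quoting.

```python
def expand_backslash_d(pattern: str) -> str:
    """Rewrite \\d as [0-9] (or as 0-9 inside a bracket expression).

    Hypothetical illustration only; the real grep function may differ.
    """
    out = []
    in_bracket = False
    i = 0
    n = len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            if nxt == "d":
                # Inside [...], adding another pair of brackets would be wrong.
                out.append("0-9" if in_bracket else "[0-9]")
            else:
                # Copy any other escape (including \\) through unchanged.
                out.append(c + nxt)
            i += 2
            continue
        if c == "[" and not in_bracket:
            in_bracket = True
        elif c == "]" and in_bracket:
            in_bracket = False
        out.append(c)
        i += 1
    return "".join(out)


expanded_plain = expand_backslash_d(r"\d+")
expanded_bracket = expand_backslash_d(r"[\d.]+")
expanded_escaped = expand_backslash_d(r"\\d")
```

Under these assumptions, `\d+` becomes `[0-9]+`, `[\d.]+` becomes `[0-9.]+`, and an escaped backslash followed by `d` is left alone.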
Diffstat (limited to 'NEWS')
-rw-r--r-- | NEWS | 10 |
1 files changed, 10 insertions, 0 deletions
@@ -2,6 +2,16 @@ GNU grep NEWS -*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  With -P, \d now matches only ASCII digits, regardless of PCRE
+  options/modes. The changes in grep-3.9 to make \b and \w work
+  properly had the undesirable side effect of making \d also match
+  e.g., the Arabic digits: ٠١٢٣٤٥٦٧٨٩. With grep-3.9, -P '\d+'
+  would match that ten-digit (20-byte) string. Now, to match such
+  a digit, you would use \p{Nd}.
+  [bug introduced in grep 3.9]
+
 
 * Noteworthy changes in release 3.9 (2023-03-05) [stable]
 
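
The "ten-digit (20-byte)" figure and the ASCII-versus-Unicode distinction in the NEWS entry can be checked with a short Python sketch. This uses Python's `re` module rather than PCRE, so it is only an analogy: by default Python's `\d` matches any Unicode decimal digit (the behavior grep-3.9 exhibited with -P), while the `re.ASCII` flag restricts it to [0-9] (the behavior this commit restores). The variable names are illustrative.

```python
import re

# The Arabic-Indic digits U+0660..U+0669, built from code points.
arabic_indic_digits = "".join(chr(0x0660 + i) for i in range(10))

char_count = len(arabic_indic_digits)                   # number of digits
byte_count = len(arabic_indic_digits.encode("utf-8"))   # size in UTF-8

# Unicode-aware \d: matches decimal digits of any script.
unicode_match = re.fullmatch(r"\d+", arabic_indic_digits)

# ASCII-only \d: equivalent to [0-9].
ascii_match = re.fullmatch(r"\d+", arabic_indic_digits, flags=re.ASCII)
ascii_match_on_ascii = re.fullmatch(r"\d+", "0123456789", flags=re.ASCII)
```

Each of these digits occupies two bytes in UTF-8, which accounts for the 20-byte length quoted above.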