author Jim Meyering <meyering@fb.com> 2023-03-18 23:25:03 -0700
committer Jim Meyering <meyering@meta.com> 2023-03-19 13:36:23 -0700
commit 98ee05b4ddfee5c1db2248bdb060a2cd64bf75fa
tree 3f698eb54e1cd35347d16ad14270261e3c0a95d3
parent 99330c2b1dc8b619dff8a5a6a35f524d382508c8
grep: -P (--perl-regexp) \D once again works like [^0-9]
* NEWS: Mention \D, too.
* doc/grep.texi: Likewise
* src/pcresearch.c (pcre_pattern_expand_backslash_d): Handle \D.
Also, ifdef-out this new function and its call site when not needed.
* tests/pcre-ascii-digits: Test \D, too.
Tighten one test by using returns_ 1.
Add comments and tests that work only with 10.43 and newer.
Paul Eggert raised the issue of \D in https://bugs.gnu.org/62267#8
Diffstat (limited to 'doc')
-rw-r--r-- doc/grep.texi 20
1 files changed, 7 insertions, 13 deletions
diff --git a/doc/grep.texi b/doc/grep.texi
index 8a0aef51..7a00adda 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -1144,21 +1144,15 @@ combined with the @option{-z} (@option{--null-data}) option, and note that
For documentation, refer to @url{https://www.pcre.org/}, with these caveats:
@itemize
@item
-@samp{\d} matches only the ten ASCII digits, regardless of locale.
+@samp{\d} matches only the ten ASCII digits
+(and @samp{\D} matches the complement), regardless of locale.
Use @samp{\p@{Nd@}} to also match non-ASCII digits.
-When @command{grep} is built with PCRE2 10.42 and earlier, @samp{\d}
-ignores in-regexp directives like @samp{(?aD)} and matches only ASCII
-digits regardless of these directives. However, later versions of
-PCRE2 likely will fix this, and the plan is for @command{grep} to
-respect those directives if possible.
-@c Using PCRE2 git commit pcre2-10.40-112-g6277357, this demonstrates
-@c the equivalent of how grep could use PCRE2_EXTRA_ASCII_BSD to make \d's
-@c ASCII-only behavior the default:
-@c $ LC_ALL=en_US.UTF-8 ./pcre2grep -u '(?aD)^\d+' <<< '٠١٢٣٤٥٦٧٨٩'
-@c [Exit 1]
-@c $ LC_ALL=en_US.UTF-8 ./pcre2grep -u '^\d+' <<< '٠١٢٣٤٥٦٧٨٩'
-@c ٠١٢٣٤٥٦٧٨٩
+When @command{grep} is built with PCRE2 10.42 and earlier,
+@samp{\d} and @samp{\D} ignore in-regexp directives like @samp{(?aD)}
+and work like @samp{[0-9]} and @samp{[^0-9]} respectively.
+However, later versions of PCRE2 likely will fix this,
+and the plan is for @command{grep} to respect those directives if possible.
@item
Although PCRE tracks the syntax and semantics of Perl's regular