author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2013-03-17 17:13:14 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2013-03-17 17:13:14 +0000
commit d87a4298d0019e7a31506fa67e1537c7f4c442a1
tree 86e394e1b94010157cd7aa60c92f466580796bcb
parent 398a48b6d0fe822f10d56f71a30328a1f8f422b1
Document new multiple backtracking verb behaviour.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1293 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- doc/pcrecompat.3 16
-rw-r--r-- doc/pcrepattern.3 16
2 files changed, 19 insertions, 13 deletions
diff --git a/doc/pcrecompat.3 b/doc/pcrecompat.3
index 4b191d6..8b05b65 100644
--- a/doc/pcrecompat.3
+++ b/doc/pcrecompat.3
@@ -100,11 +100,17 @@ encountered in a successful positive assertion \fIis\fP passed back when a
match succeeds (compare capturing parentheses in assertions). Note that such
subpatterns are processed as anchored at the point where they are tested.
.P
-11. There are some differences that are concerned with the settings of captured
+11. If a pattern contains more than one backtracking control verb, the first
+one that is backtracked onto acts. For example, in the pattern
+A(*COMMIT)B(*PRUNE)C a failure in B triggers (*COMMIT), but a failure in C
+triggers (*PRUNE). Perl's behaviour is more complex; in many cases it is the
+same as PCRE, but there are examples where it differs.
+.P
+12. There are some differences that are concerned with the settings of captured
strings when part of a pattern is repeated. For example, matching "aba" against
the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".
.P
-12. PCRE's handling of duplicate subpattern numbers and duplicate subpattern
+13. PCRE's handling of duplicate subpattern numbers and duplicate subpattern
names is not as general as Perl's. This is a consequence of the fact the PCRE
works internally just with numbers, using an external table to translate
between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b)B),
@@ -114,18 +120,18 @@ would not be possible to distinguish which parentheses matched, because both
names map to capturing subpattern number 1. To avoid this confusing situation,
an error is given at compile time.
.P
-13. Perl recognizes comments in some places that PCRE does not, for example,
+14. Perl recognizes comments in some places that PCRE does not, for example,
between the ( and ? at the start of a subpattern. If the /x modifier is set,
Perl allows white space between ( and ? but PCRE never does, even if the
PCRE_EXTENDED option is set.
.P
-14. In PCRE, the upper/lower case character properties Lu and Ll are not
+15. In PCRE, the upper/lower case character properties Lu and Ll are not
affected when case-independent matching is specified. For example, \ep{Lu}
always matches an upper case letter. I think Perl has changed in this respect;
in the release at the time of writing (5.16), \ep{Lu} and \ep{Ll} match all
letters, regardless of case, when case independence is specified.
.P
-15. PCRE provides some extensions to the Perl regular expression facilities.
+16. PCRE provides some extensions to the Perl regular expression facilities.
Perl 5.10 includes new features that are not in earlier versions of Perl, some
of which (such as named parentheses) have been in PCRE for some time. This list
is with respect to Perl 5.10:
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index 93a0d79..865325c 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -2973,16 +2973,16 @@ unanchored pattern). (*SKIP) is similar, except that the advance may be more
than one character. (*COMMIT) is the strongest, causing the entire match to
fail.
.P
-If more than one such verb is present in a pattern, the "strongest" one wins.
-For example, consider this pattern, where A, B, etc. are complex pattern
-fragments:
+If more than one such verb is present in a pattern, the one that is backtracked
+onto first acts. For example, consider this pattern, where A, B, etc. are
+complex pattern fragments:
.sp
- (A(*COMMIT)B(*THEN)C|D)
+ (A(*COMMIT)B(*THEN)C|ABD)
.sp
-Once A has matched, PCRE is committed to this match, at the current starting
-position. If subsequently B matches, but C does not, the normal (*THEN) action
-of trying the next alternative (that is, D) does not happen because (*COMMIT)
-overrides.
+If A matches but B fails, the backtrack to (*COMMIT) causes the entire match to
+fail. However, if A and B match, but C fails, the backtrack to (*THEN) causes
+the next alternative (ABD) to be tried. This behaviour is consistent, but is
+not always the same as Perl's.
.
.
.SH "SEE ALSO"
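
The rule documented in this commit ("the verb that is backtracked onto first acts") can be illustrated with a toy model. The sketch below is not PCRE's implementation and does not call PCRE. It assumes every pattern fragment (A, B, C, D) is a fixed literal string, so there are no backtracking points inside a fragment and the first verb backtracked onto is simply the nearest verb to the left of the failure. The names COMMIT, PRUNE, THEN, match_at and search are hypothetical, chosen for this illustration only. As a further simplification, a (*THEN) with no alternative left to try is treated as a failure at the current starting position.

```python
# Toy model of "the first backtracking verb that is backtracked onto acts".
# Assumption: fragments are plain literals, so the only thing backtracking
# can hit on its way left is a verb.

COMMIT = "(*COMMIT)"
PRUNE = "(*PRUNE)"
THEN = "(*THEN)"
VERBS = {COMMIT, PRUNE, THEN}


def match_at(alternatives, subject, start):
    """Try each alternative at one starting position.

    Returns ("match", end), ("next_start", None) or ("abort", None).
    """
    for alt in alternatives:
        pos = start
        last_verb = None  # nearest verb to the left of the current item
        failed = False
        for item in alt:
            if item in VERBS:
                last_verb = item
            elif subject.startswith(item, pos):
                pos += len(item)
            else:
                failed = True
                break
        if not failed:
            return ("match", pos)
        if last_verb == COMMIT:
            return ("abort", None)       # the entire match fails
        if last_verb == PRUNE:
            return ("next_start", None)  # give up at this starting position
        # (*THEN) or no verb at all: fall through to the next alternative
    return ("next_start", None)


def search(alternatives, subject):
    """Unanchored search; returns (start, end) or None."""
    for start in range(len(subject) + 1):
        outcome, end = match_at(alternatives, subject, start)
        if outcome == "match":
            return (start, end)
        if outcome == "abort":
            return None
    return None


if __name__ == "__main__":
    # (A(*COMMIT)B(*THEN)C|ABD) with A="a", B="b", C="c", D="d"
    then_pattern = [["a", COMMIT, "b", THEN, "c"], ["a", "b", "d"]]
    print(search(then_pattern, "abd"))    # C fails, (*THEN) acts: (0, 3)
    print(search(then_pattern, "axabd"))  # B fails, (*COMMIT) acts: None

    # A(*COMMIT)B(*PRUNE)C with the same literals
    prune_pattern = [["a", COMMIT, "b", PRUNE, "c"]]
    print(search(prune_pattern, "abxabc"))  # C fails, (*PRUNE) acts: (3, 6)
```

In this model the subject "axabd" fails outright because the failure in B backtracks onto (*COMMIT), whereas "abd" matches through the second alternative because the failure in C reaches (*THEN) first. Real patterns with repetition or nested groups have more backtracking points than this sketch accounts for.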