Diffstat (limited to 'pod/perlre.pod')
-rw-r--r-- | pod/perlre.pod | 65 |
1 files changed, 48 insertions, 17 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 6e68bcd1db..b9216c156c 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -594,20 +594,15 @@ whitespace formatting, a simple C<#> will suffice. Note that Perl closes
 the comment as soon as it sees a C<)>, so there is no way to put a literal
 C<)> in the comment.
 
-=item C<(?pimsx-imsx)>
+=item C<(?dlupimsx-imsx)>
 
-=item C<(?^pimsx)>
+=item C<(?^lupimsx)>
 X<(?)> X<(?^)>
 
 One or more embedded pattern-match modifiers, to be turned on (or
 turned off, if preceded by C<->) for the remainder of the pattern or
 the remainder of the enclosing pattern group (if any).
 
-Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately
-after the C<"?"> is a shorthand equivalent to C<-imsx> and compiling the
-regex under C<no locale>. Flags may follow the caret to override it.
-But a minus sign is not legal with it.
-
 This is particularly useful for dynamic patterns, such as those read in from
 a configuration file, taken from an argument, or specified in a table
 somewhere. Consider the case where some patterns want to be case
@@ -634,17 +629,53 @@ These modifiers do not carry over into named subpatterns called in the
 enclosing group. In other words, a pattern such as C<((?i)(&NAME))> does not
 change the case-sensitivity of the "NAME" pattern.
 
-Note that the C<p> modifier is special in that it can only be enabled,
-not disabled, and that its presence anywhere in a pattern has a global
-effect. Thus C<(?-p)> and C<(?-p:...)> are meaningless and will warn
-when executed under C<use warnings>.
+Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately
+after the C<"?"> is a shorthand equivalent to C<d-imsx>. Flags (except
+C<"d">) may follow the caret to override it.
+But a minus sign is not legal with it.
+
+Also, starting in Perl 5.14, are modifiers C<"d">, C<"l">, and C<"u">,
+which for 5.14 may not be used as suffix modifiers.
+
+C<"l"> means to use a locale (see L<perllocale>) when pattern matching.
+The locale used will be the one in effect at the time of execution of
+the pattern match. This may not be the same as the compilation-time
+locale, and can differ from one match to another if there is an
+intervening call of the
+L<setlocale() function|perllocale/The setlocale function>.
+This modifier is automatically set if the regular expression is compiled
+within the scope of a C<"use locale"> pragma.
+
+C<"u"> has no effect currently. It is automatically set if the regular
+expression is compiled within the scope of a
+L<C<"use feature 'unicode_strings">|feature> pragma.
+
+C<"d"> means to use the traditional Perl pattern matching behavior.
+This is dualistic (hence the name C<"d">, which also could stand for
+"default"). When this is in effect, Perl matches utf8-encoded strings
+using Unicode rules, and matches non-utf8-encoded strings using the
+platform's native character set rules.
+See L<perlunicode/The "Unicode Bug">. It is automatically selected by
+default if the regular expression is compiled neither within the scope
+of a C<"use locale"> pragma nor a <C<"use feature 'unicode_strings">
+pragma.
+
+Note that the C<d>, C<l>, C<p>, and C<u> modifiers are special in that
+they can only be enabled, not disabled, and the C<d>, C<l>, and C<u>
+modifiers are mutually exclusive; a maximum of one may appear in the
+construct. Specifying one de-specifies the others. Thus, for example,
+C<(?-p)> and C<(?-d:...)> are meaningless and will warn when compiled
+under C<use warnings>.
+
+Note also that the C<p> modifier is special in that its presence
+anywhere in a pattern has a global effect.
 
 =item C<(?:pattern)>
 X<(?:)>
 
-=item C<(?imsx-imsx:pattern)>
+=item C<(?dluimsx-imsx:pattern)>
 
-=item C<(?^imsx:pattern)>
+=item C<(?^luimsx:pattern)>
 X<(?^:)>
 
 This is for clustering, not capturing; it groups subexpressions like
@@ -660,7 +691,7 @@ but doesn't spit out extra fields. It's also cheaper not to capture
 characters if you don't need to.
 
 Any letters between C<?> and C<:> act as flags modifiers as with
-C<(?imsx-imsx)>. For example,
+C<(?dluimsx-imsx)>. For example,
 
     /(?s-i:more.*than).*million/i
 
@@ -669,8 +700,8 @@ is equivalent to the more verbose
     /(?:(?s-i)more.*than).*million/i
 
 Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately
-after the C<"?"> is a shorthand equivalent to C<-imsx> and compiling the
-regex under C<no locale>. Any positive flags may follow the caret, so
+after the C<"?"> is a shorthand equivalent to C<d-imsx>. Any positive
+flags (except C<"d">) may follow the caret, so
 
     (?^x:foo)
 
@@ -679,7 +710,7 @@ is equivalent to
     (?x-ims:foo)
 
 The caret tells Perl that this cluster doesn't inherit the flags of any
-surrounding pattern, but to go back to the system defaults (C<-imsx>),
+surrounding pattern, but to go back to the system defaults (C<d-imsx>),
 modified by any flags specified.
 
 The caret allows for simpler stringification of compiled regular