author | Jarkko Hietaniemi <jhi@iki.fi> | 2001-10-18 00:10:44 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2001-10-18 00:10:44 +0000 |
commit | 8305e449a259649641f455b333f66bc0de7f3b62 (patch) | |
tree | 9df8d155049543c47d062096b9b045e573efe0ab /pod/perlfaq6.pod | |
parent | 91487cfc840e1faf4dbb6a4f7eb906993cbed22f (diff) | |
FAQ sync.
p4raw-id: //depot/perl@12486
Diffstat (limited to 'pod/perlfaq6.pod')
-rw-r--r-- | pod/perlfaq6.pod | 30 |
1 files changed, 20 insertions, 10 deletions
diff --git a/pod/perlfaq6.pod b/pod/perlfaq6.pod
index 1cad81559a..4a52259800 100644
--- a/pod/perlfaq6.pod
+++ b/pod/perlfaq6.pod
@@ -1,6 +1,6 @@
 =head1 NAME
 
-perlfaq6 - Regexes ($Revision: 1.1 $, $Date: 2001/09/20 03:03:00 $)
+perlfaq6 - Regexes ($Revision: 1.3 $, $Date: 2001/10/16 13:27:22 $)
 
 =head1 DESCRIPTION
 
@@ -194,7 +194,7 @@ properties of bitwise xor on ASCII strings.
 
     print;
 
-And here it is as a subroutine, modelled after the above:
+And here it is as a subroutine, modeled after the above:
 
     sub preserve_case($$) {
         my ($old, $new) = @_;
@@ -383,20 +383,30 @@ A slight modification also removes C++ comments:
 
 =head2 Can I use Perl regular expressions to match balanced text?
 
-Although Perl regular expressions are more powerful than "mathematical"
-regular expressions because they feature conveniences like backreferences
-(C<\1> and its ilk), they still aren't powerful enough--with
-the possible exception of bizarre and experimental features in the
-development-track releases of Perl. You still need to use non-regex
-techniques to parse balanced text, such as the text enclosed between
-matching parentheses or braces, for example.
+Historically, Perl regular expressions were not capable of matching
+balanced text. As of more recent versions of perl including 5.6.1
+experimental features have been added that make it possible to do this.
+Look at the documentation for the (??{ }) construct in recent perlre manual
+pages to see an example of matching balanced parentheses. Be sure to take
+special notice of the warnings present in the manual before making use
+of this feature.
+
+CPAN contains many modules that can be useful for matching text
+depending on the context. Damian Conway provides some useful
+patterns in Regexp::Common. The module Text::Balanced provides a
+general solution to this problem.
+
+One of the common applications of balanced text matching is working
+with XML and HTML. There are many modules available that support
+these needs. Two examples are HTML::Parser and XML::Parser. There
+are many others.
 
 An elaborate subroutine (for 7-bit ASCII only) to pull out balanced
 and possibly nested single chars, like C<`> and C<'>, C<{> and C<}>,
 or C<(> and C<)> can be found in
 http://www.perl.com/CPAN/authors/id/TOMC/scripts/pull_quotes.gz .
 
-The C::Scan module from CPAN contains such subs for internal use,
+The C::Scan module from CPAN also contains such subs for internal use,
 but they are undocumented.
 
 =head2 What does it mean that regexes are greedy? How can I get around it?