author | Karl Williamson <khw@cpan.org> | 2015-01-15 20:03:09 -0700 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2015-01-16 13:17:23 -0700 |
commit | 7475a24fb1bcc3d031d46c3a83616671479634f5 (patch) | |
tree | a5db71195de4764549e45f1337bc0a67c1329f66 /regcomp.c | |
parent | 1cc8089a1939c02111513be3f1f49631ccb84757 (diff) | |
download | perl-7475a24fb1bcc3d031d46c3a83616671479634f5.tar.gz |
regcomp.c: Fix bug in /[A-Z]/i
This also fixes /[a-z]/i.
When not under /i, these two ranges alone in a bracketed character class
can be optimized into qr/[[:upper:]]/a and qr/[[:lower:]]/a respectively.
This optimization saves space in the pattern (as no bitmap is needed),
and I think it executes faster. But this optimization has to be
forgone under /i (unless /a is also present) because otherwise
certain non-ASCII characters such as \N{KELVIN SIGN} don't match,
and they should.
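As a rough illustration of why the shortcut is unsafe under /i, the sketch below models the situation in standalone C. It is not code from regcomp.c: `toy_fold`, `matches_AZ_under_fold`, `matches_ascii_letter`, `may_use_posixa`, and `demo_checks` are hypothetical names, the fold table is cut down to two non-ASCII code points, and an ASCII platform is assumed. Only the shape of `may_use_posixa` follows the condition added by this commit (`AT_LEAST_ASCII_RESTRICTED || ! FOLD`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced simple case fold (ASCII platform assumed):
 * ASCII capitals plus two non-ASCII code points that fold into a-z. */
uint32_t toy_fold(uint32_t cp)
{
    if (cp >= 'A' && cp <= 'Z') return cp + ('a' - 'A');
    if (cp == 0x212A) return 'k';   /* KELVIN SIGN */
    if (cp == 0x017F) return 's';   /* LATIN SMALL LETTER LONG S */
    return cp;
}

/* Models /[A-Z]/i: a code point matches if its fold lands in a-z. */
bool matches_AZ_under_fold(uint32_t cp)
{
    uint32_t folded = toy_fold(cp);
    return folded >= 'a' && folded <= 'z';
}

/* Models the most an ASCII-restricted class could match: ASCII letters. */
bool matches_ascii_letter(uint32_t cp)
{
    return (cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z');
}

/* Same shape as the new guard: the shortcut is allowed when the pattern
 * is ASCII-restricted (/a) or when it is not case-insensitive. */
bool may_use_posixa(bool ascii_restricted, bool fold)
{
    return ascii_restricted || !fold;
}

/* Usage example: the two models disagree on KELVIN SIGN, which is why
 * the shortcut must be skipped for /i without /a. */
void demo_checks(void)
{
    assert(matches_AZ_under_fold(0x212A));   /* should match under /i */
    assert(!matches_ascii_letter(0x212A));   /* lost by the shortcut  */
    assert(!may_use_posixa(false, true));    /* /i alone: no shortcut */
    assert(may_use_posixa(true, true));      /* /ai: shortcut is fine */
    assert(may_use_posixa(false, false));    /* no /i: shortcut fine  */
}
```

The two matchers agree on every ASCII letter and differ only on code points such as U+212A, which is the case the commit message describes.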
Diffstat (limited to 'regcomp.c')
-rw-r--r-- | regcomp.c | 7 |
1 files changed, 6 insertions, 1 deletions
@@ -14895,7 +14895,11 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth,
                         op = POSIXA;
                     }
                 }
-                else if (prevvalue == 'A') {
+                else if (AT_LEAST_ASCII_RESTRICTED || ! FOLD) {
+                    /* We can optimize A-Z or a-z, but not if they could match
+                     * something like the KELVIN SIGN under /i (/a means they
+                     * can't) */
+                    if (prevvalue == 'A') {
                     if (value == 'Z'
 #ifdef EBCDIC
                         && literal_endpoint == 2
@@ -14915,6 +14919,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth,
                         op = POSIXA;
                     }
                 }
+                    }
             }
 
             /* Here, we have changed <op> away from its initial value iff we found