author | Paolo Bonzini <bonzini@gnu.org> | 2009-11-17 16:23:24 -0800 |
---|---|---|
committer | Ulrich Drepper <drepper@redhat.com> | 2009-11-17 16:23:24 -0800 |
commit | 815d8147a3418334ffa91e2384c6e159f0809d65 (patch) | |
tree | 164ba2a49c0a9287af8894c6c12c12e3d5b33dc8 /posix/regcomp.c | |
parent | 7443244740724babd575943ee33c45da326afbe7 (diff) | |
Fix ranges with multibyte characters as endpoints.
This is another bug in computing the fastmap. It was reported by a user
of sed because it usually does not happen with _LIBC. However, it is
there in that case too.
Whenever a range appears at the beginning of the regex, the regex must
be tested on every possible multibyte character, but the fastmap
computation did not account for ranges. _LIBC masks the bug because in
general there is a collation symbol for each possible multibyte-character
lead byte, so all the lead bytes are usually already part of the fastmap.
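The effect of the changed condition can be sketched with a toy model. This is not glibc's code: `struct toy_cset`, `toy_is_lead_byte` and `toy_fill_fastmap` are hypothetical names, the encoding is assumed to be UTF-8, and the guarded branch is assumed to mark every possible lead byte, as the description above implies. The `check_ranges` flag switches between the old and the patched condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the bracket-set counters that the
   patched condition inspects.  Not glibc's real re_charset_t.  */
struct toy_cset {
    int nchar_classes;  /* character classes such as [[:alpha:]] */
    int non_match;      /* negated set such as [^...] */
    int nranges;        /* ranges, possibly with multibyte endpoints */
};

/* Assumption: UTF-8 only, where multibyte lead bytes are 0xC2..0xF4.  */
static bool toy_is_lead_byte(unsigned char c)
{
    return c >= 0xC2 && c <= 0xF4;
}

/* Fill a 256-entry fastmap for a bracket set at the start of a regex.
   check_ranges == false models the old condition,
   check_ranges == true models the patched one.  */
void toy_fill_fastmap(const struct toy_cset *cset, bool check_ranges,
                      unsigned char fastmap[256])
{
    assert(cset != NULL && fastmap != NULL);
    memset(fastmap, 0, 256);

    bool any_multibyte = cset->nchar_classes || cset->non_match
                         || (check_ranges && cset->nranges);
    if (!any_multibyte)
        return;

    /* Any multibyte character may match, so admit every lead byte.  */
    for (int c = 0; c < 256; c++)
        if (toy_is_lead_byte((unsigned char) c))
            fastmap[c] = 1;
}
```

For a set that contains only a range, such as one with Cyrillic endpoints, the old condition leaves the lead bytes 0xD0 and 0xD1 out of the toy fastmap, so a search would skip every position where a match could start. The patched condition admits them.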
The tests use Cyrillic characters as an example. With _LIBC, they pass
without the patch too, but you can make them fail by removing the
collation symbol handling.
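A check in the spirit of those tests might look like the sketch below. It is not the test added to glibc: `range_matches` is a hypothetical helper, and the locale names it tries are assumptions about what the system has installed. It uses only the POSIX `regcomp`/`regexec` interface and returns -1 when no UTF-8 locale can be selected.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if PATTERN matches somewhere in TEXT,
   0 if it does not, and -1 if no UTF-8 locale could be set or the
   pattern failed to compile.  */
int range_matches(const char *pattern, const char *text)
{
    /* Assumed locale names; adjust to what is installed.  */
    static const char *const candidates[] = {
        "C.UTF-8", "en_US.UTF-8", "ru_RU.UTF-8"
    };
    size_t n = sizeof candidates / sizeof candidates[0];
    size_t i;

    for (i = 0; i < n; i++)
        if (setlocale(LC_ALL, candidates[i]) != NULL)
            break;
    if (i == n)
        return -1;

    regex_t re;
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    /* The match does not start at offset 0, so the matcher has to scan
       forward through TEXT, which is where a fastmap comes into play.  */
    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Calling `range_matches("[\xd0\xb0-\xd1\x8f]", "x\xd1\x8f")` searches for a character in the range U+0430 to U+044F inside a string whose only Cyrillic character follows an ASCII `x`. On an implementation with the bug unmasked, a search like this is the kind that would be expected to fail.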
Diffstat (limited to 'posix/regcomp.c')
-rw-r--r-- | posix/regcomp.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/posix/regcomp.c b/posix/regcomp.c
index 446fed5445..6966b5da3c 100644
--- a/posix/regcomp.c
+++ b/posix/regcomp.c
@@ -377,7 +377,7 @@ re_compile_fastmap_iter (regex_t *bufp, const re_dfastate_t *init_state,
 		 applies to multibyte character sets; for single byte character
 		 sets, the SIMPLE_BRACKET again suffices. */
 	      if (dfa->mb_cur_max > 1
-		  && (cset->nchar_classes || cset->non_match
+		  && (cset->nchar_classes || cset->non_match || cset->nranges
 # ifdef _LIBC
 		      || cset->nequiv_classes
 # endif /* _LIBC */