author: Father Chrysostomos <sprout@cpan.org> 2011-10-25 15:40:40 -0700
committer: Father Chrysostomos <sprout@cpan.org> 2011-10-26 18:22:18 -0700
commit: 1bb8785ab1af03172a3a220f8948d33bdc3dd374
tree: a14c14464b4aa65c51fe30d11a6b79bac2bb3313 /pp_sys.c
parent: edfed4c3099abba2a8b83e8dde6bcff0952c07f5
Rewrite csh_glob in C; fix two quoting bugs
This commit rewrites File::Glob::csh_glob (which implements perl’s default globbing behaviour) in C.

This fixes a problem introduced by 0b0e6d70f. If there is an unmatched quotation mark, all attempts to parse the pattern are discarded and it is treated as a single token. Prior to 0b0e6d70f, whitespace was stripped from both ends in that case. As of 0b0e6d70f, it was only stripped from the beginning. This commit restores the pre-0b0e6d70f behaviour with unmatched quotes. It doesn’t take 'a"b\ ' into account (where the space is escaped), but that wasn’t handled properly before 0b0e6d70f, either.

This also finishes making csh_glob consistent with regard to quotation marks. Commit 0b0e6d70f attempted to do that, but did not strip out medial quotation marks, as in a"b"c. Text::ParseWords does not provide an interface for stripping out quotation marks but leaving backslashes, which I tried to work around, not fully understanding the implications. Anyway, this new C implementation doesn’t use Text::ParseWords.

The latter fix caused a test failure, but that test was there to make sure the behaviour didn’t change depending on whether File::Glob was loaded before the first mention of glob(). (In 5.6, loading File::Glob first would make perl revert to external csh glob, ironically enough.) This commit modifies the test to test for sameness, rather than exact output. In fact, this change causes perl and miniperl to be consistent, and probably also causes glob to be more consistent across platforms (think of VMS).

Another effect of the translation to C is that the Unicode Bug is fixed with regard to splitting patterns. The C code effectively does /\s/a now (which I believe is the only sane behaviour in this case), instead of treating the string differently depending on the UTF8 flag. The Unicode Bug is still present with regard to actual globbing.

This commit introduces one regression. This code:

    undef %File::Glob::;
    glob("nometachars");

will no longer return anything, because csh_glob no longer holds a reference count on the $File::Glob::DEFAULT_FLAGS glob. Any code that does that is beyond crazy.

The big advantage to this patch is speed. Something like ‘@files = <*>’ is 18% faster in a folder of 300 files. For smaller folders there should be an even more notable difference.
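The splitting rules described above can be illustrated with a small standalone sketch. This is not the code from this commit: the names split_glob_pattern, push_char, is_ascii_space and MAX_TOKEN_LEN are hypothetical, the exact whitespace set is an assumption, and backslash handling is simplified to "always keep the backslash and the byte after it". It shows only the behaviour the message describes: tokens are split on ASCII whitespace, quotation marks are stripped (including medial ones, as in a"b"c), and an unmatched quote makes the whole pattern a single token with whitespace trimmed from both ends.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN_LEN 256   /* hypothetical limit chosen for this sketch */

/* ASCII-only whitespace test, in the spirit of /\s/a (exact set assumed). */
static int is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\f' || c == '\v';
}

/* Append one byte to a token buffer, silently truncating when full. */
static void push_char(char *tok, size_t *len, char c)
{
    if (*len + 1 < MAX_TOKEN_LEN) {
        tok[(*len)++] = c;
        tok[*len] = '\0';
    }
}

/* Split pat into at most max_tokens tokens; returns the token count. */
size_t split_glob_pattern(const char *pat, char out[][MAX_TOKEN_LEN],
                          size_t max_tokens)
{
    size_t ntok = 0;     /* number of completed tokens */
    size_t len = 0;      /* length of the token being built */
    int in_token = 0;
    char quote = 0;      /* active quote character, or 0 */
    const char *p;

    assert(pat != NULL && max_tokens > 0);
    out[0][0] = '\0';

    for (p = pat; *p && ntok < max_tokens; p++) {
        unsigned char c = (unsigned char)*p;

        if (!quote && is_ascii_space(c)) {
            if (in_token) {          /* unquoted whitespace ends a token */
                ntok++;
                in_token = 0;
                len = 0;
                if (ntok < max_tokens)
                    out[ntok][0] = '\0';
            }
            continue;
        }
        in_token = 1;
        if (c == '\\' && p[1]) {     /* keep backslash and escaped byte */
            push_char(out[ntok], &len, *p);
            push_char(out[ntok], &len, *++p);
        } else if (quote) {
            if (c == (unsigned char)quote)
                quote = 0;           /* closing quote is dropped */
            else
                push_char(out[ntok], &len, (char)c);
        } else if (c == '"' || c == '\'') {
            quote = (char)c;         /* opening quote is dropped */
        } else {
            push_char(out[ntok], &len, (char)c);
        }
    }

    if (quote) {
        /* Unmatched quote: discard the parse and return the whole
         * pattern as one token, trimmed at both ends. */
        const char *start = pat;
        const char *end = pat + strlen(pat);
        while (*start && is_ascii_space((unsigned char)*start))
            start++;
        while (end > start && is_ascii_space((unsigned char)end[-1]))
            end--;
        len = 0;
        out[0][0] = '\0';
        for (p = start; p < end; p++)
            push_char(out[0], &len, *p);
        return 1;
    }
    if (in_token)
        ntok++;
    return ntok;
}
```

Like the behaviour described in the message, the fallback trims trailing whitespace without checking whether it was escaped, so a pattern such as 'a"b\ ' loses its final space in this sketch too.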