author | Father Chrysostomos <sprout@cpan.org> | 2012-11-02 05:48:34 -0700 |
---|---|---|
committer | Father Chrysostomos <sprout@cpan.org> | 2012-11-02 05:57:30 -0700 |
commit | 583a5589b0727e1fccd2f9f7f0c8cb50c04884d5 (patch) | |
tree | 562d9fcb84c7a01e929ac2b3e3499bfda155fba6 /pp_hot.c | |
parent | 644ac3a877e810ca7c69833b568049c1c2665ce9 (diff) | |
Fix $byte_overload .= $utf8 regression
This is a regression from 5.12.
This was probably broken by commit c5aa287237.
#!perl -lCS
{ package o; use overload '""' => sub { $_[0][0] } }
$x = bless[chr 256],o::;
"$x";
$x->[0] = "\xff";
$x.= chr 257;
$x.= chr 257;
use Devel::Peek;
Dump $x;
print $x;
__END__
Output under 5.12.4:
SV = PVIV(0x820604) at 0x825820
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
IV = 0
PV = 0x2139d0 "\303\277\304\201\304\201"\0 [UTF8 "\x{ff}\x{101}\x{101}"]
CUR = 6
LEN = 16
ÿāā
Output under 5.14.0:
SV = PVIV(0x820604) at 0x826490
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
IV = 0
PV = 0x316230 "\303\277\303\204\302\201\304\201"\0 [UTF8 "\x{ff}\x{c4}\x{81}\x{101}"]
CUR = 8
LEN = 16
ÿÄā
The UTF8 flag is only meaningful right after stringification.
If the $byte_overload scalar happens to have the flag on from last
time, but string overloading will turn the flag off, then pp_concat
gets confused as to whether it is dealing with bytes or utf8. It
sees both sides as having the same utf8ness, so it concatenates,
which stringifies the lhs and turns off the flag. The utf8 sequences
appended end up with no utf8 flag associated with them, the observable
effect being that the rhs is stored as raw utf8 bytes in a byte string
(chr 257 shows up as \x{c4}\x{81} in the 5.14.0 output above).
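The byte pattern in the 5.14.0 dump can be reproduced outside Perl. The following is a minimal sketch, not Perl's own upgrade routine: latin1_to_utf8 is a hypothetical helper that treats every input byte as a Latin-1 character and re-encodes it as UTF-8, which is what effectively happens to utf8 bytes that were appended without the flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not a Perl API):
 * treat each input byte as one Latin-1 character and write its UTF-8
 * encoding to out. Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII is encoded as a single byte */
            assert(n + 1 <= out_cap);
            out[n++] = c;
        } else {
            /* 0x80..0xFF become a two-byte sequence */
            assert(n + 2 <= out_cap);
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return n;
}
```

Feeding it the three bytes FF C4 81 (the byte "\xff" followed by the flagless utf8 encoding of chr 257) yields C3 BF C3 84 C2 81, which matches the first six bytes of the PV in the 5.14.0 dump.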
If it weren’t for encoding.pm, we could use sv_catpvn_nomg_maybeutf8
and avoid determining the utf8ness of the lhs beforehand. But seeing
that encoding.pm still exists, we have to prevent double overload
stringification the other way, by force-stringification of the target.
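The ordering issue can be illustrated with a toy model. This is only a sketch under stated assumptions: struct toy_scalar and the three functions below are hypothetical stand-ins, not Perl's SV or its macros. They model a flag that goes stale between stringifications and show why the flag should be read only after stringification has been forced.

```c
#include <stdbool.h>

/* Hypothetical toy model of an overloaded scalar (not Perl's SV). */
struct toy_scalar {
    bool utf8_flag;        /* flag left over from the previous stringification */
    bool overload_is_utf8; /* whether the overload would return utf8 right now */
};

/* Stand-in for forcing stringification: the flag now reflects the
 * string the overload currently produces. */
void toy_force_string(struct toy_scalar *sv)
{
    sv->utf8_flag = sv->overload_is_utf8;
}

/* Old order: read the flag as it is, which may be stale. */
bool lbyte_stale(const struct toy_scalar *sv)
{
    return !sv->utf8_flag;
}

/* New order: force stringification first, then read the flag. */
bool lbyte_forced(struct toy_scalar *sv)
{
    toy_force_string(sv);
    return !sv->utf8_flag;
}
```

With utf8_flag set from an earlier stringification and overload_is_utf8 false, lbyte_stale reports a utf8 lhs while lbyte_forced reports a byte lhs, which corresponds to the case in the test script.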
Diffstat (limited to 'pp_hot.c')
-rw-r--r-- | pp_hot.c | 4 |
1 files changed, 2 insertions, 2 deletions
@@ -273,8 +273,8 @@ PP(pp_concat)
         report_uninit(right);
         sv_setpvs(left, "");
     }
-    lbyte = (SvROK(left) && SvTYPE(SvRV(left)) == SVt_REGEXP)
-        ? !DO_UTF8(SvRV(left)) : !DO_UTF8(left);
+    SvPV_force_nomg_nolen(left);
+    lbyte = !DO_UTF8(left);
     if (IN_BYTES)
         SvUTF8_off(TARG);
   }