| Commit message | Author | Age | Files | Lines |
|
(includes regen/opcode.pl)
|
(includes regen/opcode.pl)
|
There is a bug in Solaris with locales that have a multi-byte decimal
radix character. Make these tests TODO, as we do for cygwin, which has
had a similar problem.
|
We now have found a system that fails this test. Tests that are listed
as problematic are automatically marked as TODO when they fail on the
specified platforms. The next commit will specify the platform that
this fails on.
|
That ticket points out that subs.pm mentions using declared sub names
"without parentheses". That is indeed misleading; using the term "list
operators" would be better.
While changing that, I've also tweaked the wording about lexical scopes and
the inability to rescind these declarations, and ensured that vars.pm
has the same revised text.
|
Make sure that a non-binary property doesn't get mistakenly matched in
\p{}, which is only for binary ones. There are some ambiguities that
this test keeps us from falling victim to.
|
It's not needed.
RT #125619
|
This property parallels the Age property (but is cumulative). Each
table in it should have the same property value possibilities as the
corresponding Age table.
|
The following changes were still required after doing
$ ./perl -Ilib Porting/bump-perl-version -i 5.27.10 5.27.11
Module::CoreList had to be updated by hand.
Op_private.pm had to be updated by running regen/opcode.pl lib/B/Op_private.pm
|
(\my @a)->$#*
was being deparsed as
$#{\my @a}
which incorrectly reduced the scope of the lexical @a.
Make Deparse.pm recognise this case and use the new postfix notation
instead.
This fixes
./TEST -deparse op/array.t
which was using ->$#*.
|
\our @a
was being deparsed as
\our(@a)
which incorrectly converts the \ from a scalar op to a list op
|
A sub declaration like
sub f { ...}
will be deparsed under -l as
#line 1 "foo"
sub f {
#line 1 "foo"
....
}
which means that the closing '}' of the sub is incorrectly seen as being
on the next line.
This matters when a glob is created based on a sub being compiled: the
glob's gp_line field is set to the line containing the '}'.
This was causing some tests in ext/B/t/xref.t to fail under
./TEST -deparse
This commit causes an extra #line directive to be emitted:
#line 1 "foo"
sub f {
#line 1 "foo"
....
#line 1 "foo" <=== NEW
}
Whether xref.t failed depended on another factor. The optimisation
which created the GV for a just-compiled sub as an RV to a CV rather than
a GV to a CV causes the later-vivified GV (upgraded from an RV) to instead
have a gp_line corresponding to the first cop in the sub's body.
Arguably this difference (gp_line being set to the line number of first
line of the sub rather than the last line) is a bug, but it's not obvious
how to fix it, and I don't address it here.
However, the optimisation originally only applied to GVs in the main
stash; it was later extended to all stashes but then reverted again,
in the sequence of commits
v5.27.4-66-g6881372
v5.27.4-127-g6eed25e
v5.27.5-321-gd964025
v5.27.8-149-g1e2cfe1
which caused xref.t to pass, then fail again.
|
Deparsing a tr/....//c (complement and an empty replacement list)
was failing due to the 'delete RHS if LHS == RHS' action not triggering,
because the LHS had already been complemented.
At the same time, expand the set of Deparse tr/// tests.
|
I accidentally left a debugging print statement in after my recent tr///
work
|
I benchmarked this line alone in a one-liner and found the sprintf
variant to be roughly 10% faster than the concat version. It’s also
more readable (and maintainable).
(Not significant enough to warrant a perldelta entry.)
|
The lexical my $file inside the loop masked the for loop's $file,
wasting the work done to canonicalize the path names.
The grep on length is required since splitdir() can return empty
strings.
|
These were made using the old version number, and the error was not
caught, because of the lateness in getting 5.27.9 tagged
|
The _at_level functions, which have to bypass Carp, were not
reporting non-line-based filehandles correctly. When $/ is not "\n",
the perl core says:
..., <fh> chunk 7.
warnings.pm should do the same; it was using 'line'.
|
RT #132793, RT #132801
In something like $x .= "$overloaded", the $overloaded stringify method
wasn't being called.
However, it turns out that the existing (pre-multiconcat) behaviour is also
buggy and inconsistent. That behaviour has been restored as-is.
At some future time, these bugs might be addressed.
Here are some comments from the new tests added to overload.t:
Since 5.000, any OP_STRINGIFY immediately following an OP_CONCAT
is optimised away, on the assumption that since concat will always
return a valid string anyway, it doesn't need stringifying.
So in "$x", the stringify is needed, but on "$x$y" it isn't.
This assumption is flawed once overloading has been introduced, since
concat might return an overloaded object which still needs stringifying.
However, this flawed behaviour is apparently needed by at least one
module, and is tested for in opbasic/concat.t: see RT #124160.
There is also a wart with the OPpTARGET_MY optimisation: specifically,
in $lex = "...", if $lex is a lexical var, then a chain of 2 or more
concats *doesn't* optimise away OP_STRINGIFY:
$lex = "$x"; # stringifies
$lex = "$x$y"; # doesn't stringify
$lex = "$x$y$z..."; # stringifies
|
The way pp_multiconcat handles things like tieing and overloading
doesn't work very well at the moment. There's a lot of code to handle
edge cases, and there are still open bugs.
The basic algorithm in pp_multiconcat is to first stringify (i.e. call
SvPV() on) *all* args, then use the obtained values to calculate the total
length and utf8ness required, then do a single SvGROW and copy all the
bytes from all the args.
This ordering is wrong when variables with visible side effects, such as
tie/overload, are encountered. The current approach is to stringify args
up until such an arg is encountered, concat all args up until that one
together via the normal fast route, then jump to a special block of code
which concats any remaining args one by one the "hard" way, handling
overload etc.
This is problematic because we sometimes need to go back in time. For
example in ($undef . $overloaded), we're supposed to call
$overloaded->concat($undef, reverse=1)
so to speak, but by the time of the method call, we've already tried to
stringify $undef and emitted a spurious 'uninit var' warning.
The new approach taken in this commit is to:
1) Bail out of the stringify loop under a greater range of problematical
variable classes - namely we stop when encountering *anything* which
might cause external effects, so in addition to tied and overloaded vars,
we now stop for any sort of get magic, or any undefined value where
warnings are in scope.
2) If we bail out, we throw away any stringification results so far,
and concatenate *all* args the slow way, even ones we've already
stringified. This solves the "going back in time" problem mentioned above.
It's safe because the only vars that get processed twice are ones for which
the first stringification could have no side effects.
The slow concat loop now uses S_do_concat(), which is a new static inline
function which implements the main body of pp_concat() - so they share
identical code.
An intentional side-effect of this commit is to fix three tickets:
RT #132783
RT #132827
RT #132595
so tests for them are included in this commit.
One effect of this commit is that string concatenation of magic or
undefined vars will now be slower than before, e.g.
"pid=$$"
"value=$undef"
but they will probably still be faster than before pp_multiconcat was
introduced.
|
This will be used in future commits.
|
since warnings.(pm|pl) was updated in 25ebbc2270
|
otherwise the new _at_level functions end up including
‘at <handle> line 0’.
[Commit message by the committer.]
|
Some tests I recently added had A-Z in the replacement charlist, which
under EBCDIC gets deparsed as A-IJ-RS-Z, so original and deparsed don't
match.
Ideally the deparsing could be smart enough to coalesce those ranges,
but for now I've just changed the range to A-I which deparses ok on both
ASCII and EBCDIC.
The point of the test is for when there are more replacement chars than
search chars, and in this case A-I works just as well as A-Z.
Spotted by Karl.
|
RT #132141
Attributes such as :lvalue have to come *before* the signature to ensure
that they're applied to any code block within the signature; e.g.
sub f :lvalue ($a = do { $x = "abc"; return substr($x,0,1)}) {
....
}
So this commit moves sub attributes to come before the signature. This is
how they were originally, but they were swapped with v5.21.7-394-gabcf453.
This commit is essentially a revert of that commit (and its followups
v5.21.7-395-g71917f6, v5.21.7-421-g63ccd0d), plus some extra work for
Deparse, and an extra test.
See:
RT #123069 for why they were originally swapped
RT #132141 for why that broke :lvalue
http://nntp.perl.org/group/perl.perl5.porters/247999
for a general discussion about RT #132141
|
The run-time code to handle a non-utf8 tr/// against a utf8 string
is complex, with many variants of similar code repeated depending on the
presence of the /s and /c flags.
Simplify them all into a single code block by changing how the translation
table is stored. Formerly, the tr struct contained possibly two tables:
the basic 0-255 slot one, plus in the presence of /c, a second one
to map the implicit search range (\x{100}...) against any residual
replacement chars not consumed by the first table.
This commit merges the two tables into a single unified whole. For example
tr/\x00-\xfe/abcd/c
is equivalent to
tr/\xff-\x{7fffffff}/abcd/
which generates a 259-entry translation table consisting of:
0x00 => -1
0x01 => -1
...
0xfe => -1
0xff => a
0x100 => b
0x101 => c
0x102 => d
In addition we store:
1) the size of the translation table (0x103 in the example above);
2) an extra 'wildcard' entry stored 1 slot beyond the main table,
which specifies the action for any codepoints outside the range of
the table (i.e. chars 0x103..0x7fffffff). This can be either:
a) a character, when the last replacement char is repeated;
b) -1 when /c isn't in effect;
c) -2 when /d is in effect;
d) -3 identity: when the replacement list is empty but not /d.
In the example above, this would be
0x103 => d
The addition of -3 as a valid slot value is new.
This makes the main runtime code for the utf8 string with non-utf8 tr//
case look like, at its core:
size = tbl->size;
mapped_ch = tbl->map[ch >= size ? size : ch];
which then processes mapped_ch based on whether it's >= 0, or -1/-2/-3.
This is a lot simpler than the old scheme, and should generally be faster
too.
|
RT #132608
In the non-utf8 case, the /c (complement) flag to tr adds an implied
\x{100}-\x{7fffffff} range to the search charlist. If the replacement list
contains more chars than are paired with the 0-255 part of the search
list, then the excess chars are stored in an extended part of the table.
The excess char count was being stored as a short, which caused problems
if the replacement list contained more than 32767 excess chars: either
substituting the wrong char, or substituting for a char located up to
0xffff bytes in memory before the real translation table.
So change it to SSize_t.
Note that this is only a problem when the search and replacement charlists
are non-utf8, the replacement list contains around 0x8000+ entries, and
where the string being translated is utf8 with at least one codepoint >=
U+8000.
|
Recent commits slightly changed the layout of the extended map table: it
now always stores a repeat count, and there are now two structs defined,
rather than treating certain slots, like tbl[0x101], specially.
Update B and Deparse to reflect this.
|
The pumpking has determined that the CPAN breakage caused by changing
smartmatch [perl #132594] is too great for the smartmatch changes to
stay in for 5.28.
This reverts most of the merge in commit
da4e040f42421764ef069371d77c008e6b801f45. All core behaviour and
documentation is reverted. The removal of use of smartmatch from a couple
of tests (that aren't testing smartmatch) remains. Customisation of
a couple of CPAN modules to make them portable across smartmatch types
remains. A small bugfix in scope.c also remains.
|
Get rid of the file-global filehandles and the unused filename return
value; instead, return the filehandle and assign it to a lexical
variable. Also don't bother checking the return value; it croaks on
failure anyway.
In passing, eliminate erroneous assignment of {} to %CASESPEC for
Unicode < 2.1.8.
|
Commit 50a85cfe6c852deb0c2f738cb82006623052dc8e clarified that this
module uses Perl's extended UTF-8, but missed the mention fixed in this
commit.
|
As discussed in http://nntp.perl.org/group/perl.perl5.porters/244444,
this sets the optional scalar ref parameter to the length of the valid
initial portion of the first parameter passed to num(). This is useful
in teasing apart why the input is invalid.
|
This will be used in later commits.
|
This allows charprop() to be called on a Perl-internal-only property
|
Some early Unicode releases used a hyphen instead of an underscore in
script names. This changes them all to underscores.
|
Spotted by Christian Hansen
|
The value for this variable is already known; use that instead of
rederiving it.