| Commit message | Author | Age | Files | Lines |
|
This cleans up, simplifies, and extends how the trie
logic interacts with the new node types. This change ultimately
makes EXACTFU, EXACTFU_SS, and EXACTFU_NO_TRIE (renamed to
EXACTFU_TRICKYFOLD) work properly with the trie engine regardless
of whether the string is utf8 or latin1.
This patch depends on the following:
EXACT => utf8 or "binary" text
EXACTFU => either pre-folded utf8, or latin1 that has to be folded as though it was utf8
EXACTFU_SS => special case of EXACTFU to handle \xDF/ss (affects latin1 treatment)
EXACTFU_TRICKYFOLD => special case of EXACTFU to handle tricky non-latin1 fold rules
EXACTF => "old style fold logic" untriable nodetype
EXACTFA => (currently) untriable nodetype
EXACTFL => (currently) untriable nodetype
See the comments in regcomp.sym for these fold types.
This patch involves a number of distinct, but related parts. Starting
from compilation:
* Simplify how we detect a triable sequence given the new nodetypes.
This also probably fixed some "bugs" in how we detected certain
sequences, like /||foo|bar/.
* Simplify how we read EXACTFU nodes under utf8 by removing the now
redundant folding logic (EXACTFU nodes under utf8 are prefolded).
Also extend this logic to handle latin1 patterns properly (in
conjunction with other changes)
* Part of the problem with EXACTFU_SS and EXACTFU_TRICKYFOLD
has to do with how the trie logic interacts with the minlen logic.
This change handles both by pessimising the minlen when encountering
these nodetypes. One observation is that the minlen logic is basically
broken, and works only because it conflates bytes and codepoints in
such a way that we more or less always get a value small enough that
things work out anyway. Fixing that properly is the job of another patch.
* Part of the problem of doing folding under unicode rules is that
there are a lot of foldings possible, some with strange rules. This
means that the bitmap logic does not work correctly in all cases,
as we currently do not have any way to populate it properly.
So this patch disables the bitmap entirely when folding is involved
until that is fixed.
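The minlen pessimisation in the list above can be illustrated with a small sketch. It is not taken from regcomp.c: `pessimised_minlen` is a hypothetical helper, and it models only the "ss" / U+00DF multi-character fold, assuming the literal has already been folded. The point is that a folded "ss" in the pattern may be satisfied by a single character in the target, so the character count of the pattern can overstate the shortest possible match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from regcomp.c): a lower bound, in characters,
 * on the length of a target string that could match a pre-folded literal.
 * Only the "ss" <-> U+00DF fold is modelled here. */
size_t pessimised_minlen(const char *folded, size_t len)
{
    size_t min = 0;
    size_t i = 0;

    assert(folded != NULL || len == 0);
    while (i < len) {
        if (i + 1 < len && folded[i] == 's' && folded[i + 1] == 's')
            i += 2;   /* "ss" may be matched by the single character U+00DF */
        else
            i += 1;   /* any other character needs one target character */
        min++;
    }
    return min;
}
```

Under these assumptions the folded literal "strasse" gets a lower bound of 6 characters rather than 7.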
The end result of this is: we can TRIE/AHOCORASICK any sequence of
EXACT, or EXACTFU (ish) nodes, regardless of utf8 or not, but we disable
the bitmap when folding.
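As a rough illustration of what disabling the bitmap means, the sketch below builds a first-byte bitmap for a set of alternatives and simply refuses to use it when folding is in effect. The type and function names (`start_bitmap`, `build_start_bitmap`, `may_start_here`) are hypothetical and do not come from the regex engine. Treating an empty alternative as making the bitmap unusable is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char bits[32]; /* 256 bits, one per possible first byte */
    bool usable;            /* false means "do not consult the bitmap" */
} start_bitmap;

/* Record the first byte of every alternative. When folding is requested
 * the bitmap is left unusable, because a start byte in the target may
 * differ from the one in the pattern. */
void build_start_bitmap(start_bitmap *bm, const char *const *words,
                        size_t count, bool folding)
{
    size_t i;

    assert(bm != NULL);
    assert(words != NULL || count == 0);
    memset(bm->bits, 0, sizeof bm->bits);
    bm->usable = !folding;
    for (i = 0; bm->usable && i < count; i++) {
        unsigned char c = (unsigned char)words[i][0];
        if (c == '\0') {
            /* An empty alternative can match anywhere (assumption). */
            bm->usable = false;
        } else {
            bm->bits[c >> 3] |= (unsigned char)(1u << (c & 7));
        }
    }
}

/* Quick rejection test: can a match possibly start with this byte? */
bool may_start_here(const start_bitmap *bm, unsigned char first_byte)
{
    assert(bm != NULL);
    if (!bm->usable)
        return true; /* no bitmap: every position must be tried */
    return ((bm->bits[first_byte >> 3] >> (first_byte & 7)) & 1u) != 0;
}
```

With the bitmap disabled the matcher loses a cheap rejection test but cannot wrongly skip a position where a folded match would start.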
A note for follow up relating to this patch: with the way EXACTFU_XXX
nodes are currently dealt with, we won't build the "maximal" trie because
of their presence, instead creating a "jumptrie" consisting of either a
leading EXACTFU node followed by an EXACTFU_XXX node, or vice versa. We
should eventually address that.
|
The old output would show only the line number as diagnostics
but not the test number, nor the test name, which often contains
very useful information. This patch makes sure this is visible in
the diagnostics output of test failures.
|
This provides enough rope for those who want to hang themselves, and
also for those who know how to use the rope without hanging
themselves. :-)
Since this is not generally a reliable thing to be doing, a warning is
emitted whenever :lvalue is turned on or off on a defined subroutine.
But attributes.pm will flip the flag anyway. :lvalue in a sub
declaration still refuses to modify a defined Perl sub, as before.
|
Adapted from suggestion by David Golden++. For RT #37033.
|
We need to unixify the current working directory since we're going
to be comparing to the pod root that has been unixified internally
in Pod::Html.
Also clean up all versions of the generated files.
|
This is mostly borrowed from CPANPLUS with additional tweaks to
handle corner cases presented by the Pod::Html tests. It seems
to work on VMS, Windows, and Mac OS X.
Also tweak _save_page to make the call to abs2rel more robust in
the case where the base is a special string indicating the current
working directory ('./', '[]', or '.\') rather than a literal path.
|
Windows has FC (file compare), VMS has DIFFERENCES, and Linux is
certainly not the only OS that can do unified diff.
|
This commit looks for the passed-in charset, and overrides it only if it
is /d and the pattern requires /u. Previously the passed-in value was
ignored.
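The rule described here can be written as a small decision function. This is a sketch only: the enum and the function name are hypothetical stand-ins and are not the identifiers used in the Perl sources. The four values correspond to the /d, /l, /u and /a regex modifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the regex character-set modifiers. */
typedef enum {
    CS_DEPENDS, /* /d */
    CS_LOCALE,  /* /l */
    CS_UNICODE, /* /u */
    CS_ASCII    /* /a */
} charset_sketch;

/* Honour the passed-in charset, overriding it only when it is /d and
 * the pattern itself requires Unicode rules. */
charset_sketch effective_charset(charset_sketch passed_in,
                                 bool pattern_requires_unicode)
{
    assert(passed_in <= CS_ASCII);
    if (passed_in == CS_DEPENDS && pattern_requires_unicode)
        return CS_UNICODE;
    return passed_in;
}
```

The earlier behaviour, in these terms, would have returned a value without consulting `passed_in` at all.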
|
This was an off-by-one error caused by my failing to realize that things
had to be done differently at the 255/256 boundary depending on whether
U+00FF matched or did not match the property.
Two properties were affected, [:upper:] and [:punct:]. The bug was that
all code points above the first one > 255 that legitimately matched the
property would match whether or not they should. In the case of
[:upper:], this meant that effectively anything from 256..infinity
matched. For [:punct:], it was anything above U+037D.
|
Make the code slightly smaller by changing
    if (A)
        return X;
    if (B)
        return X;
into
    if (A || B)
        return X;
|
Previously it would leave the file handle open if it was (equal to) stdin,
on the assumption that this must have been because no script name was
supplied on the interpreter command line, so the interpreter was defaulting
to reading the script from standard input.
However, if the program has closed STDIN, then the next file handle opened
(for any reason) will have file descriptor 0. So in this situation, the
handle that require opened to read the module would be mistaken for the above
situation and left open. Effectively, this leaked a file handle.
This is now fixed, by explicitly tracking from parser creation time whether
it should keep the file handle open, and only setting this flag when
defaulting to reading the main program from standard input. This resolves
RT #37033.
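The descriptor reuse behind this leak is easy to reproduce on a POSIX system, where open() returns the lowest-numbered unused descriptor. The sketch below shows that, and then contrasts the old "is it stdin?" heuristic with an explicit flag. `parser_sketch`, `keep_open` and the function names are hypothetical and are not the names used in the interpreter.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

typedef struct {
    int fd;
    bool keep_open; /* set only when defaulting to the script on stdin */
} parser_sketch;

/* Close descriptor 0, then open a file. POSIX hands back the lowest
 * unused descriptor, so the new handle reuses descriptor 0. */
int open_after_closing_stdin(const char *path)
{
    assert(path != NULL);
    close(STDIN_FILENO);
    return open(path, O_RDONLY);
}

/* Old heuristic: anything on descriptor 0 is assumed to be the script
 * arriving on standard input, so it is never closed. */
bool old_should_close(const parser_sketch *p)
{
    return p->fd != STDIN_FILENO;
}

/* New rule: close unless the parser was explicitly told otherwise. */
bool new_should_close(const parser_sketch *p)
{
    return !p->keep_open;
}
```

A handle opened for a module after STDIN was closed lands on descriptor 0. The old predicate leaves it open, while the flag-based one closes it.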
|
Now that the logic for stdin is implemented as an early return of NULL from
S_open_script(), in all cases that reach the end of S_open_script(), rsfp
is non-NULL and is a file handle that we wish to set to close on exec.
|
Move the logic to assign PerlIO_stdin() to rsfp from S_open_script() to its
only caller, S_parse_body().
|
Previously it was being passed &rsfp as a parameter, because it was
returning another value, fdscript. However, the return value has been
ignored since commit cc69b689ee7c2745 removed suidperl in January 2009.
|
lex_flags holds 4 flag bits, with multiple flag bits manipulated together
at times, so they can't be split out into individual bitfields. This change
permits the C compiler to generate simpler code, reducing toke.o by about
400 bytes on this platform, but doesn't change the size of the structure.
lex_flags was added in commit 802a15e9c01d1a0b in August 2011, so is not in
any stable release.
|
Test the error message generated when -x can't find a "#!perl" line.
Test that this error message still appears when -x is used with -e.
|
Verify that -p actually runs the code in the program body.
Verify that -n doesn't implicitly print out the contents of $_.
For both, verify that an END block runs after the implicit loop.
|
Don't assign to two lexical variables, $Is_VMS and $Is_Win32, only to use
them immediately for the same purpose - to skip the entire test.
In turn, there's no need to conditionally set $quote to a value suitable for
VMS or Win32, when neither OS ever runs the test.
The code has been this way since the file was added by commit
742218b34f58f961 in Nov 2006. Hence I don't think that the vestigial $quote
logic corresponds to a pre-commit version that did run on these platforms.
Instead I infer that it has come from t/op/exec.t, used as a template for
running sub-scripts in a portable fashion.
|
It seems that many people have trouble understanding how to add custom
attributes to their subroutines. Here's a doc patch that will
hopefully make things clearer:
|
It was set to undef, which meant it hadn’t been discussed. In actuality,
it is actively maintained on CPAN. See, for instance:
https://rt.cpan.org/Ticket/Display.html?id=75077#txn-1038945
So this brings Maintainers.pl closer in line with reality.
|
‘Normalise’ in this case means to set $^H to indicate that features
are in %^H (FEATURE_BUNDLE_CUSTOM) and to make %^H contain the current
feature set.
Since ‘no feature’ sets the default feature bundle in $^H, this is
unnecessary in that case.
|
It is not features not in the current version, but those not in the
requested version, that are disabled.
|
Reading $$ in a tainted expression was tainting the internal sv_setiv()
on $$. Since the value being set came directly from getpid(), it's
always safe, so override the tainting there. Fixes [perl #109688].
|
The format of the individual function HTML files has changed. The index
generator needs to be updated to successfully extract the NAME sections.
Fixes [perl #107870].
|
Previously, a table was being allocated for OP_TRANS(|R), in a
PVOP arrangement, as soon as the op was built. However, it wasn't
used immediately, and for UTF8-flagged ops it would be thrown away,
replaced by an SV-based translation table in a SVOP or PADOP arrangement.
This mutation of the op structure occurred in pmtrans(), some time after
original op building. If an error occurred before pmtrans(), requiring
the op to be freed, op_clear() would be misled by the UTF8 flags into
treating the PV as an SV or pad index, causing crashes in the latter
case [perl #102858]. op_clear() was implicitly assuming that pmtrans()
had been performed, due to lacking any explicit indication of the op's
state of construction.
Now, the PV table is allocated by pmtrans(), when it's actually going to
populate it. The PV doesn't get allocated at all for UTF8-flagged ops.
Prior to pmtrans(), the op_pv/op_sv/op_padix field is all bits zero,
so there's no problem with freeing the op.
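The shape of the fix, deferring allocation so that a half-built op is always safe to free, can be sketched with a toy structure. None of this is the real op layout: `trans_op_sketch`, the toy pad and all function names are hypothetical. The sketch also assumes, as mainstream platforms do, that an all-bits-zero pointer is a null pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define PAD_SIZE 8
static int pad_released[PAD_SIZE]; /* toy pad: release count per index */

typedef struct {
    bool utf8_flagged;
    union {
        char *pv;     /* byte translation table */
        size_t padix; /* index of an SV-based table; 0 means "none" */
    } slot;
} trans_op_sketch;

/* Building the op leaves the slot all bits zero; nothing is allocated. */
void init_trans_op(trans_op_sketch *op, bool utf8_flagged)
{
    assert(op != NULL);
    memset(op, 0, sizeof *op);
    op->utf8_flagged = utf8_flagged;
}

/* Stand-in for the later step that actually fills in the table. */
bool populate_trans_op(trans_op_sketch *op, size_t padix)
{
    assert(op != NULL);
    if (op->utf8_flagged) {
        assert(padix > 0 && padix < PAD_SIZE);
        op->slot.padix = padix;
        return true;
    }
    op->slot.pv = calloc(256, 1);
    return op->slot.pv != NULL;
}

/* Safe at any stage: a zero slot means there is nothing to release,
 * whichever way the flag says to interpret it. */
void clear_trans_op(trans_op_sketch *op)
{
    assert(op != NULL);
    if (op->utf8_flagged) {
        if (op->slot.padix != 0) {
            assert(op->slot.padix < PAD_SIZE);
            pad_released[op->slot.padix]++;
            op->slot.padix = 0;
        }
    } else {
        free(op->slot.pv); /* free(NULL) is a no-op */
        op->slot.pv = NULL;
    }
}
```

In the old arrangement the slot would already hold a heap pointer at build time, so clearing a UTF8-flagged op early would misread that pointer as a pad index.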
|
Unicode 6.1 erroneously omitted Takri as a script that uses two
characters, and have voted to publish the correction that this patch
makes. There isn't an official Corrigendum yet.