delta/perl.git - github.com: perl/perl5.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	perlapi: Remove extraneous ">"	Karl Williamson	2015-05-12	1	-2/+2
\|
*	perlapi: Use UVCHR_SKIP not UNI_SKIP	Karl Williamson	2015-05-11	1	-2/+2
\| \| \| \|	This new name is more consistent with other uses in the API.
*	perlapi: Add 2 links to other parts of the pod	Karl Williamson	2015-05-08	1	-0/+2
\|
*	Revert "Don’t call save_re_context"	David Mitchell	2015-03-30	1	-0/+5
\| \| \| \| \| \|	This reverts commit d28a9254e445aee7212523d9a7ff62ae0a743fec. Turns out we need save_re_context() after all.
*	Replace common Emacs file-local variables with dir-locals	Dagfinn Ilmari Mannsåker	2015-03-22	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An empty cpan/.dir-locals.el stops Emacs using the core defaults for code imported from CPAN. Committer's work: To keep t/porting/cmp_version.t and t/porting/utils.t happy, $VERSION needed to be incremented in many files, including throughout dist/PathTools. perldelta entry for module updates. Add two Emacs control files to MANIFEST; re-sort MANIFEST. For: RT #124119.
*	[perl #123814] replace grok_atou with grok_atoUV	Hugo van der Sanden	2015-03-09	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Some questions and loose ends: XXX gv.c:S_gv_magicalize - why are we using SSize_t for paren? XXX mg.c:Perl_magic_set - need appropriate error handling for $) XXX regcomp.c:S_reg - need to check if we do the right thing if parno was not grokked Perl_get_debug_opts should probably return something unsigned; not sure if that's something we can change.
*	Consistently use NOT_REACHED; /* NOTREACHED */	Jarkko Hietaniemi	2015-03-04	1	-1/+1
\| \| \| \| \| \|	Both needed: the macro is for compilers, the comment for static checkers. (This doesn't address whether each spot is correct and necessary.)
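A minimal sketch of the pairing described above, assuming a compiler hint is available. MY_NOT_REACHED, the color enum and color_name() are hypothetical names for illustration; this is not Perl's own definition of NOT_REACHED.

```c
#include <stdlib.h>

/* Hypothetical macro (not Perl's definition): tells the compiler that
 * control flow never arrives at this point. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_NOT_REACHED __builtin_unreachable()
#elif defined(_MSC_VER)
#  define MY_NOT_REACHED __assume(0)
#else
#  define MY_NOT_REACHED abort()
#endif

enum color { RED, GREEN, BLUE };

static const char *
color_name(enum color c)
{
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    /* The macro informs the compiler; the comment informs static checkers. */
    MY_NOT_REACHED; /* NOTREACHED */
}
```

The macro and the comment address different tools, which is why the commit keeps both on the same line.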
*	Add qr/\b{gcb}/	Karl Williamson	2015-02-19	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	A function implements seeing if the space between any two characters is a grapheme cluster break. After I wrote this, I realized that an array lookup might be a better implementation, but the deadline for v5.22 was too close to change it. I did see that my gcc optimized it down to an array lookup. This makes the implementation of \X go from being complicated to trivial.
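The array-lookup idea mentioned above could look roughly like the following. This is only a sketch: the four break classes and the table contents are a heavily reduced, hypothetical subset of the Unicode grapheme cluster rules (CR followed by LF stays together, nothing breaks before an Extend character, controls break on both sides), and none of the names come from the Perl source.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced set of break classes; the real Unicode
 * grapheme cluster rules use many more. */
enum gcb { GCB_OTHER, GCB_CR, GCB_LF, GCB_EXTEND, GCB_COUNT };

/* break_between[before][after] is true when a cluster boundary lies
 * between a character of class `before` and one of class `after`. */
static const bool break_between[GCB_COUNT][GCB_COUNT] = {
    /*              OTHER  CR     LF     EXTEND */
    /* OTHER  */  { true,  true,  true,  false },
    /* CR     */  { true,  true,  false, true  },
    /* LF     */  { true,  true,  true,  true  },
    /* EXTEND */  { true,  true,  true,  false },
};

/* One table read replaces a chain of per-rule comparisons. */
static bool
is_gcb_break(enum gcb before, enum gcb after)
{
    return break_between[before][after];
}
```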
*	utf8.c: Slight refactor of UTF-16 code	Karl Williamson	2015-02-18	1	-8/+15
\| \| \| \| \| \|	This eliminates a branch in the usual case, at the expense of an extra one in the rarer case, which allows us to collapse some error condition code. It sprinkles some UNLIKELYs.
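As an illustration of the trade-off described above (not the actual utf8.c code), a UTF-16 decoder can test for "any surrogate" once in the common path and push every malformation check into the rare path. MY_UNLIKELY and decode_utf16() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical branch hint; falls back to a plain expression elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define MY_UNLIKELY(x) (x)
#endif

/* Decode one code point from UTF-16 code units.  Returns the number of
 * units consumed (1 or 2), or 0 if the input is empty or malformed. */
static size_t
decode_utf16(const uint16_t *u, size_t len, uint32_t *cp)
{
    if (len == 0)
        return 0;

    /* Usual case pays for a single range test covering all surrogates. */
    if (MY_UNLIKELY(u[0] >= 0xD800 && u[0] <= 0xDFFF)) {
        /* Rarer case: one extra branch, but all error conditions
         * (lone low surrogate, truncated pair, bad second unit)
         * collapse into a single check. */
        if (MY_UNLIKELY(u[0] > 0xDBFF || len < 2
                        || u[1] < 0xDC00 || u[1] > 0xDFFF))
            return 0;
        *cp = 0x10000 + (((uint32_t)(u[0] - 0xD800) << 10)
                         | (uint32_t)(u[1] - 0xDC00));
        return 2;
    }

    *cp = u[0];
    return 1;
}
```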
*	move functions marked as mathomed in embed.fnc to mathoms.c	Daniel Dragan	2015-01-27	1	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ever since commit 075eb5c9b6 mathom functions must be in mathoms.c for their symbols to be skipped in makedef.pl on Win32 Perl. If a function is marked 'b' in embed.fnc, regen.pl does NOT add its prototype to proto.h (it is commented out). Without the proto.h entry, EXTERN_C will be missing and a -DNO_MATHOMS + C++ Win32 Perl build will not link, since the C function will have a mangled name and the symbol will not be found for creating the perl linking library. Also add EXTERN_C to Win32CORE; the init_Win32CORE symbol is special cased for exporting in makedef.pl. Perl_is_utf8_char_buf was marked as 'b' in commit 3cedd9d930. Perl_sv_copypv was marked as 'b' in commit 4bac9ae47b.
*	avoid C labels in column 0	David Mitchell	2015-01-21	1	-4/+4
\| \| \| \| \| \| \| \| \|	Generally the guideline is to outdent C labels (e.g. 'foo:') 2 columns from the surrounding code. If the label starts at column zero, then it means that diffs, such as those generated by git, display the label rather than the function name at the head of a diff block: which makes diffs harder to peruse.
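A small illustration of the guideline, using a hypothetical function: the labels sit two columns to the left of the surrounding code rather than in column zero, so the function name remains the nearest line that starts in column zero.

```c
/* Returns the index of the first negative element, or -1 if none. */
static int
find_first_negative(const int *v, int n)
{
    int i;
    int result = -1;

    for (i = 0; i < n; i++) {
        if (v[i] < 0)
            goto found;
    }
    goto done;

  found:                /* outdented 2 columns from the body, not column 0 */
    result = i;
  done:
    return result;
}
```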
*	Raise warning on multi-byte char in single-byte locale	Karl Williamson	2014-12-29	1	-1/+2
\| \| \| \| \| \| \| \| \|	See http://nntp.perl.org/group/perl.perl5.porters/211909 Something is quite likely wrong with the logic if, say, in a Greek locale, Unicode characters (especially Greek ones) are encountered. The same character will be represented by two different code points. This warning alerts the user to this undesirable state of affairs.
*	foldEQ_utf8(): Add some internal flags	Karl Williamson	2014-12-29	1	-1/+12
\| \| \| \|	The comments explain their purpose
*	Simplify foldEQ_utf8	Karl Williamson	2014-12-29	1	-80/+45
\| \| \| \| \| \| \| \| \| \| \| \|	This moves the uncommon case of handling inputs under non-UTF-8 locales out of this function to the functions it calls, which already have the logic to handle it. This simplifies this function, cutting a couple branches each time through the loop from the common usage. The locale handling is slowed down somewhat, but even if that were a concern, another simpler function is normally used for locale handling. This gets called only when one or both of the comparison strings is UTF-8, which should be comparatively rare for non-UTF8 locales.
*	utf8.c: Use OP_DESC instead of passing string.	Karl Williamson	2014-12-29	1	-6/+6
\| \| \| \|	OP_DESC is simpler and more general.
*	utf8.c: Fix potential fold bug	Karl Williamson	2014-12-29	1	-6/+4
\| \| \| \| \| \| \| \| \|	The function _to_uni_fold_flags() supposedly had the ability to do folding based on the current locale, if the correct flag is passed. However, it didn't actually do that, returning a non-locale fold instead. Fortunately, this is an undocumented capability (actually, the whole function is undocumented), and no current calls to it used the flag. This commit causes it to work.
*	utf8.c: Add some function parameter assertions	Karl Williamson	2014-12-29	1	-1/+5
\| \| \| \| \|	Currently these are not violated, but this guards against future mistakes.
*	Don't raise 'poorly supported' locale warning unnecessarily	Karl Williamson	2014-12-29	1	-11/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 8c6180a91de91a1194f427fc639694f43a903a78 added a warning message for when Perl determines that the underlying locale the program has just switched into is poorly supported. At the time it was thought that this would be an extremely rare occurrence. However, a bug in HP-UX - B.11.00/64 causes this message to be raised for the "C" locale. A workaround was done that silenced those. However, before it got fixed, this message would occur gobs of times executing the test suite. It was raised even if the script was not locale-aware, so that the underlying locale was completely irrelevant. There is a good prospect that someone using an older Asian locale as their default would get this message inappropriately, even if they don't use locales, or switch to a supported one before using them. This commit causes the message to be raised only if it actually is relevant. When not in the scope of 'use locale', the message is stored, not raised. Upon the first locale-dependent operation within a bad locale, the saved message is raised, and the storage cleared. I was able to do this without adding extra branching to the main-line non-locale execution code. This was done by adding regnodes which get jumped to by switch statements, and refactoring some existing C tests so they exclude non-locale right off the bat. These changes would have been necessary for another locale warning that I previously agreed to implement, and which is coming a few commits from now. I do not know of any way to add tests in the test suite for this. It is in fact rare for modern locales to have these issues. The way I tested this was to temporarily change the C code so that all locales are viewed as defective, and manually note that the warnings came out where expected, and only where expected. I chose not to try to output this warning on any POSIX functions called. I believe that all that are affected are deprecated or scheduled to be deprecated anyway. And POSIX is closer to the hardware of the machine. For convenience, I also don't output the message for some zero-length pattern matches. If something is going to be matched, the message will likely very soon be raised anyway.
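The store-then-raise behavior described in this message can be sketched as follows. The buffer, the counter and both function names are hypothetical; the real change works through regnodes and existing locale checks rather than a helper like this.

```c
#include <stdio.h>

/* Hypothetical state: an empty string means no warning is pending. */
static char pending_warning[128];
static int  warnings_raised;

/* Called when a poorly supported locale is detected outside the scope
 * of locale-aware code: remember the message instead of printing it. */
static void
note_bad_locale(const char *name)
{
    snprintf(pending_warning, sizeof pending_warning,
             "Locale '%s' may not work well", name);
}

/* Called at the start of any locale-dependent operation: raise the
 * saved message once, then clear the storage. */
static void
locale_dependent_op(void)
{
    if (pending_warning[0] != '\0') {
        fprintf(stderr, "%s\n", pending_warning);
        warnings_raised++;
        pending_warning[0] = '\0';
    }
    /* ... the locale-sensitive work itself would go here ... */
}
```

A program that never performs a locale-dependent operation never calls the second function, so it never sees the message.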
*	Nits in comments	Karl Williamson	2014-12-29	1	-2/+2
\|
*	make more use of NOT_REACHED	Lukas Mai	2014-11-29	1	-2/+2
\| \| \| \|	In particular, remove all instances of 'assert(0);'.
*	Make is_invariant_string()	Karl Williamson	2014-11-26	1	-6/+5
\| \| \| \| \| \|	This is a more accurately named synonym for is_ascii_string(), which is retained. The old name is misleading to someone programming for non-ASCII platforms.
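For orientation, a minimal sketch of what such a check amounts to on an ASCII platform, where the invariant characters are exactly those below 0x80. The helper name is hypothetical, and on EBCDIC platforms the set of invariant bytes differs, which is the reason for the rename.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, ASCII platforms only: true if every byte has the
 * same representation whether or not the string is encoded as UTF-8. */
static bool
all_bytes_invariant(const unsigned char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (s[i] >= 0x80)
            return false;
    }
    return true;
}
```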
*	Improve API pod of is_ascii_string	Karl Williamson	2014-11-26	1	-4/+8
\|
*	utf8.c: Shorten long constant names, and simplify	Karl Williamson	2014-11-24	1	-6/+10
\| \| \| \| \| \| \|	The previous commit fixed a typo caused by it being hard to see the differences in a long ALL_CAP name. This uses #defines to type the long name only once, and compile-time variables so the expression for the length of the strings is specified only once.
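The technique reads roughly like this sketch. The constant, its contents and every name here are hypothetical stand-ins, not the identifiers from utf8.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical long ALL_CAPS constant. */
static const char VERY_LONG_MARKER_NAME_FOR_DEMO[] = "# SPECIALS";

/* Type the long name exactly once; later code uses the short alias, so
 * sizeof() cannot silently be applied to a similar-looking name. */
#define MARKER VERY_LONG_MARKER_NAME_FOR_DEMO

/* Length computed in one place (excludes the trailing NUL). */
static const size_t marker_len = sizeof(MARKER) - 1;

static bool
starts_with_marker(const char *s, size_t len)
{
    return len >= marker_len && memcmp(s, MARKER, marker_len) == 0;
}
```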
*	utf8.c: Was taking sizeof() wrong thing	Karl Williamson	2014-11-24	1	-1/+1
\| \| \| \| \| \|	This was a typo due to the long name. A future commit will make it cleaner. Taking sizeof() of the wrong name evaluates to the right number on ASCII platforms, but not on EBCDIC.
*	Add warning message for locale/Unicode intermixing	Karl Williamson	2014-11-14	1	-5/+21
\| \| \| \|	This is explained in the added perldiag entry.
*	uvoffuni_to_utf8_flags() die if platform can't handle	Karl Williamson	2014-10-21	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On non EBCDIC platforms currently any UV is encodable as UTF-8. (This would change if there were 128-bit words). Thus, much code assumes that nothing can go wrong when converting to UTF-8, and hence does no error checking. However, UTF-EBCDIC is only capable of representing code points below 2**32, so if there are 64-bit words, this function can fail. Prior to this patch, there was no real overflow check, and garbage was returned by this function if called with too large a number. While not ideal, the easiest thing to do is to just die for such a number, like we do for division by 0. This involves changing only code within this function, and not its many callers.
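The approach of checking once inside the encoder, instead of in every caller, might be sketched like this. MAX_ENCODABLE, both function names and the message text are hypothetical, with the 2**32 limit taken from the description above.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical limit for an encoding that cannot represent code points
 * of 2**32 and above. */
#define MAX_ENCODABLE UINT64_C(0xFFFFFFFF)

static int
is_encodable(uint64_t cp)
{
    return cp <= MAX_ENCODABLE;
}

/* Guard placed at the top of the encoder: callers keep assuming that
 * encoding cannot fail, and an impossible value ends the program
 * instead of producing garbage. */
static uint64_t
checked_code_point(uint64_t cp)
{
    if (!is_encodable(cp)) {
        fprintf(stderr, "Code point 0x%llX is not representable here\n",
                (unsigned long long)cp);
        exit(EXIT_FAILURE);
    }
    return cp;
}
```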
*	utf8.c: Improve debug message	Karl Williamson	2014-10-21	1	-2/+2
\| \| \| \| \| \|	This function was called with an empty string "" because that string was not actually needed in the function, except to better identify the source when there is an error. So change to specify the actual source.
*	utf8.c: Move an #ifndef for clarity	Father Chrysostomos	2014-09-12	1	-1/+1
\| \| \| \| \|	The comment really belongs inside it, as it refers to those two lines of code.
*	Remove obsolete comment from utf8.c	Father Chrysostomos	2014-09-12	1	-8/+0
\| \| \| \| \| \| \| \| \| \|	The call to save_re_context was removed by the previous commit. The commit before that stopped save_re_context from doing anything. Commit db2c6cb33 stopped the errsv_save line from triggering get-magic. So this comment, added in dc0c6abb4, no longer applies.
*	Don’t call save_re_context	Father Chrysostomos	2014-09-12	1	-1/+4
\| \| \| \|	It is an empty function.
*	perl #122747: localize PL_curpm to null in _core_swash_init	Yves Orton	2014-09-11	1	-2/+17
\| \| \| \| \| \| \| \| \| \| \| \|	Set PL_curpm to null before we do any swash initialization in _core_swash_init(). This "hides" the current regop from the swash code, with the intent of preventing weird reentrancy bugs when the swashes are initialized. Long term you could argue that we should just not use the regex engine to initialize a swash, and then this would be unnecessary. Thanks to FC for the suggestion!
*	utf8.c: Use slightly more efficient macro	Karl Williamson	2014-07-25	1	-2/+4
\| \| \| \| \| \| \| \|	Lowercasing a Latin-1 range character results in a Latin-1 range character, so we can use the more restrictive macro, which is slightly more efficient than the general ones. (This difference is applicable only on EBCDIC platforms, as the macros all expand to nothing on ASCII ones.)
*	Use grok_atou instead of strtoul (no explicit strtol uses).	Jarkko Hietaniemi	2014-07-22	1	-7/+10
\|
*	Remove or downgrade unnecessary dVAR.	Jarkko Hietaniemi	2014-06-25	1	-35/+0
\| \| \| \| \| \| \| \|	You need to configure with g++ and -Accflags=-DPERL_GLOBAL_STRUCT or -Accflags=-DPERL_GLOBAL_STRUCT_PRIVATE to see any difference. (g++ does not do the "post-annotation" form of "unused".) The version code has some of these issues, reported upstream.
*	PERL_UNUSED_CONTEXT -> remove interp context where possible	Daniel Dragan	2014-06-24	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Removing context params will save machine code in the callers of these functions, and 1 ptr of stack space. Some of these funcs are heavily used as mg_find. The contexts can always be readded in the future the same way they were removed. This patch inspired by commit dc3bf40570. Also remove PERL_UNUSED_CONTEXT when it's not needed. See removal candidate rejection rationale in [perl #122106]. -Perl_hv_backreferences_p uses context in S_hv_auxinit commit 96a5add60f was wrong -Perl_whichsig_sv and Perl_whichsig_pv wrongly used PERL_UNUSED_CONTEXT from inception in commit 84c7b88cca -in the author's opinion cast_ shouldn't be public API, no CPAN grep usage, can't be static and/or inline optimized since it is exported -Perl_my_unexec move to block where it is needed, make Win32 block, context free, for inlining likelihood, private api and only 2 callers in core -Perl_my_dirfd make all blocks context free, then change proto -Perl_bytes_cmp_utf8 wrongly used PERL_UNUSED_CONTEXT from inception in commit fed3ba5d6b
*	Silence -Wunused-parameter my_perl under threads.	Jarkko Hietaniemi	2014-06-19	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	For S_ functions, remove the context. For Perl_ functions, add PERL_UNUSED_CONTEXT. Tricky because sometimes depends on DEBUGGING, and sometimes on whether we have PERL_IMPLICIT_SYS. (Why all the mathoms Perl_is_uni_... and Perl_is_utf8_... functions are not being whined about is a mystery.) vutil.c (included via util.c) has one of these, but it's cpan/, and a known problem: https://rt.cpan.org/Ticket/Display.html?id=96100
*	Revert "/* NOTREACHED */ belongs *before* the unreachable."	Jarkko Hietaniemi	2014-06-19	1	-4/+2
\| \| \| \| \| \|	This reverts commit 148f39b7de6eae9ddd59e0b0aff691d6abea7aca. (Still needs more work, but wanted to see how well this passed with Jenkins.)
*	/* NOTREACHED */ belongs *before* the unreachable.	Jarkko Hietaniemi	2014-06-19	1	-2/+4
\| \| \| \| \| \|	Definitely not after it. It marks the start of the unreachable, not the first unreachable line. And if they are in that order, it looks better to linebreak after the lint hint.
*	Some low-hanging -Wunreachable-code fruits.	Jarkko Hietaniemi	2014-06-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- after return/croak/die/exit, return/break are pointless (break is not a terminator/separator, it's a goto) - after goto, another goto (!) is pointless - in some cases (usually function ends) introduce explicit NOT_REACHED to make the noreturn nature clearer (do not do this everywhere, though, since that would mean adding NOT_REACHED after every croak) - for the added NOT_REACHED also add /* NOTREACHED */ since NOT_REACHED is for gcc (and VC), while the comment is for linters - declaring variables in switch blocks is just too fragile: it kind of works for narrowing the scope (which is nice), but breaks the moment there are initializations for the variables (the initializations will be skipped since the flow will bypass the start of the block); in some easy cases simply hoist the declarations out of the block and move them earlier Note 1: Since after this patch the core is not yet -Wunreachable-code clean, not enabling that via cflags.SH, one needs to -Accflags=... it. Note 2: At least with the older gcc 4.4.7 there are far too many "unreachable code" warnings, which seem to go away with gcc 4.8, maybe better flow control analysis. Therefore, the warning should eventually be enabled only for modernish gccs (what about clang and Intel cc?)
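The point about declarations inside switch blocks can be illustrated with a small hypothetical function. The fragile form appears only in a comment, because reading the skipped variable would be undefined behavior; the compiled code shows the hoisted form the message recommends.

```c
/* Fragile form (comment only): the initializer sits before the first
 * case label, so the jump into the switch bypasses it.
 *
 *     switch (kind) {
 *         int scale = 10;       // declaration is in scope, but the
 *     case 1:                   // initialization never runs
 *         return n * scale;     // reads an uninitialized variable
 *     }
 */

/* Hoisted form: declare and initialize before the switch. */
static int
scaled(int kind, int n)
{
    int scale = 10;

    switch (kind) {
    case 1:
        return n * scale;
    case 2:
        return n * scale * scale;
    default:
        return n;
    }
}
```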
*	rmv duplicate SvUV call in Perl__swash_inversion_hash	Darin McBride	2014-06-14	1	-3/+5
\|
*	Revert "Some low-hanging -Wunreachable-code fruits."	Jarkko Hietaniemi	2014-06-13	1	-1/+1
\| \| \| \| \| \| \|	This reverts commit 8c2b19724d117cecfa186d044abdbf766372c679. I don't understand - smoke-me came back happy with three separate reports... oh well, some other time.
*	Some low-hanging -Wunreachable-code fruits.	Jarkko Hietaniemi	2014-06-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- after croak/die/exit (or return), break (or return!) are pointless (break is not a terminator/separator, it's a promise of a jump) - after goto, another goto (!) is pointless - in some cases (usually function ends) introduce explicit NOT_REACHED to make the noreturn nature clearer (do not do this everywhere, though, since that would mean adding NOT_REACHED after every croak) - for the added NOT_REACHED also add /* NOTREACHED */ since NOT_REACHED is for gcc (and VC), while the comment is for linters - declaring variables in switch blocks is just too fragile: it kind of works for narrowing the scope (which is nice), but breaks the moment there are initializations for the variables (they will be skipped!); in some easy cases simply hoist the declarations out of the block and move them earlier There are still a few places left.
*	perlapi: Include general information	Karl Williamson	2014-06-05	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	Unlike other pod handling routines, autodoc requires the line following an =head1 to be non-empty for its text to be included in the paragraph started by the heading. If you fail to do this, silently the text will be omitted from perlapi. I went through the source code, and where it was apparent that the text was supposed to be in perlapi, deleted the empty line so it would be, with some revisions to make more sense. I added =cuts where I thought it best for the text to not be included.
*	Move some deprecated utf8-handling functions to mathoms	Karl Williamson	2014-05-31	1	-136/+17
\| \| \| \| \|	This entailed creating new internal functions for some of them to call so that the functionality can be retained during the deprecation period.
*	Make is_utf8_char_buf() a macro	Karl Williamson	2014-05-31	1	-1/+1
\| \| \| \| \| \|	This function is now more efficiently implemented as a synonym for isUTF8_CHAR(). We retain the Perl_is_utf8_char_buf() function for code that calls it that way.
*	Create isUTF8_CHAR() macro and use it	Karl Williamson	2014-05-31	1	-68/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This macro will inline the code to determine if a character is well-formed UTF-8 for code points below a certain value, falling back to a slower function for larger ones. On ASCII platforms, it will inline for well beyond all legal Unicode code points. On EBCDIC, it currently does it for code points up to 0x3FFF. This could be increased, but our porting tests do the regen every time to make sure everything is ok, and making it larger slows that down. This is worked around on ASCII by normally commenting out the code that generates this info, but including in utf8.h a version that did get generated. This is static information and won't change. (This could be done for EBCDIC too, but I chose not to at this time as each code page has a different macro generated, and it gets ugly getting all of them in utf8.h.) Using this macro allowed for simplification of several functions in utf8.c.
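The inline-then-fall-back shape described above can be sketched as below. This is an assumption-laden illustration, not the generated isUTF8_CHAR(): the names are hypothetical, the inline part covers only 1- and 2-byte characters, and the slow path accepts only strict Unicode UTF-8 (no surrogates, nothing above U+10FFFF), which is narrower than what Perl itself accepts.

```c
#include <stddef.h>

/* Slow path for 3- and 4-byte sequences.  Returns the sequence length,
 * or 0 if the bytes are not well-formed strict UTF-8. */
static size_t
utf8_char_len_slow(const unsigned char *s, size_t avail)
{
    size_t need, i;
    unsigned char lo = 0x80, hi = 0xBF;   /* allowed range of byte 2 */

    if (avail == 0)
        return 0;
    if (s[0] >= 0xE0 && s[0] <= 0xEF) {
        need = 3;
        if (s[0] == 0xE0) lo = 0xA0;      /* reject overlong forms */
        if (s[0] == 0xED) hi = 0x9F;      /* reject surrogates */
    }
    else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
        need = 4;
        if (s[0] == 0xF0) lo = 0x90;      /* reject overlong forms */
        if (s[0] == 0xF4) hi = 0x8F;      /* reject above U+10FFFF */
    }
    else
        return 0;                         /* not a valid start byte here */

    if (avail < need || s[1] < lo || s[1] > hi)
        return 0;
    for (i = 2; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
    }
    return need;
}

/* Inline fast path for the short, common characters; anything else is
 * handed to the function.  Note that the arguments are evaluated more
 * than once. */
#define UTF8_CHAR_LEN(s, avail)                                   \
    ((avail) >= 1 && (s)[0] < 0x80 ? (size_t)1                    \
     : (avail) >= 2 && (s)[0] >= 0xC2 && (s)[0] <= 0xDF           \
       && ((s)[1] & 0xC0) == 0x80 ? (size_t)2                     \
     : utf8_char_len_slow((s), (avail)))
```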
*	utf8.c: Move a static function to inline.h	Karl Williamson	2014-05-31	1	-35/+3
\| \| \| \| \|	This is in preparation for it being called from outside utf8.c. It is renamed to have a leading underscore to emphasize its private nature.
*	utf8.c: Move documentation next to its function	Karl Williamson	2014-05-30	1	-16/+16
\| \| \| \|	Somehow this pod stuff was orphaned from the function it describes.
*	utf8.c: Silence compiler warning	Karl Williamson	2014-05-29	1	-1/+1
\| \| \| \| \| \| \| \| \|	This was brought to my attention by Jarkko Hietaniemi. The compiler was complaining that a variable could be used uninitialized. In practice this doesn't happen, as it would only happen on bad data, and Perl itself generates the data used. (I suppose if the data got corrupted, it could happen.) This commit initializes the value unconditionally, which allows a conditional setting of it to be removed.
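In miniature, and with hypothetical names, the change has this shape: a default assigned at the declaration replaces the assignment that used to sit in a final conditional branch, so no path leaves the variable unset.

```c
/* Before (comment only): `weight` was assigned in each branch, and the
 * compiler could not prove that one of them always ran.
 *
 *     int weight;
 *     if (code == 1)      weight = 10;
 *     else if (code == 2) weight = 20;
 *     else if (code == 3) weight = 0;
 *     return weight;      // "may be used uninitialized"
 */

/* After: initialize unconditionally and drop the branch that only
 * existed to set the default. */
static int
classify(int code)
{
    int weight = 0;

    if (code == 1)
        weight = 10;
    else if (code == 2)
        weight = 20;

    return weight;
}
```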
*	utf8.c: Move static function to embed.fnc	Karl Williamson	2014-05-29	1	-6/+8
\| \| \| \|	This automatically generates assertions for pointer arguments, etc.