delta/perl.git - github.com: perl/perl5.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Remove no longer necessary constants	Karl Williamson	2013-08-29	1	-6/+0
	These character constants were used only for a special edge case in trie construction that has been removed -- except for one instance in regexec.c which could just as well be some other character.
*	utf8.h, unicode_constants.h: Add some #defines.	Karl Williamson	2013-08-29	1	-0/+3
	These will be used in a future commit
*	unicode_constants.h: Add #defines for CR, LF	Karl Williamson	2013-08-29	1	-0/+2
*	unicode_constants.h: Add #defines for Byte Order Mark	Karl Williamson	2013-08-29	1	-0/+2
	These will be used in future commits
*	unicode_constants.h: Add some #defines	Karl Williamson	2013-05-20	1	-0/+3
	These will be used in future commits
*	pp.c: Eliminate custom macro and use Copy() instead	Karl Williamson	2013-05-20	1	-0/+3
	I think it's clearer to use Copy. When I wrote this custom macro, we didn't have the infrastructure to generate a UTF-8 encoded string at compile time.
*	regen/unicode_constants.pl: Change #define name	Karl Williamson	2013-03-08	1	-1/+1
	This was added in the 5.17 series so there's no code relying on its current name. I think that the abbreviation is clearer.
*	regen/unicode_constants.pl: Make portable to non-ASCII	Karl Williamson	2013-03-08	1	-13/+14
	This now uses the U+ notation to indicate code points, which is unambiguous no matter what the platform's character set is. (charnames accepts the U+ notation.)
*	regen/unicode_constants.pl: Remove unused constant	Karl Williamson	2013-03-08	1	-1/+0
	This was added in the 5.17 series, so can't be yet in the field; and isn't needed.
*	regcomp.c: Refactor join_exact() to handle all multi-char folds	Karl Williamson	2012-10-09	1	-4/+2
	join_exact() prior to this commit returned a delta for 3 problematic sequences showing that the minimum length they match is less than their nominal length. It turns out that this is needed for all multi-character fold sequences; our test suite just did not have the tests in it to show that. Tests that do show this will be added in a future commit, but code elsewhere must be fixed before they pass. regcomp.c
*	regen/unicode_constants.pl: Add name parameter	Karl Williamson	2012-09-13	1	-0/+1
	A future commit will want to use the first surrogate code point's UTF-8 value. Add this to the generated macros, and give it a name, since there is no official one. The program has to be modified to cope with this.
*	regexec.c: Use new macros instead of swashes	Karl Williamson	2012-09-13	1	-3/+0
	A previous commit has caused macros to be generated that will match Unicode code points of interest to the \X algorithm. This patch uses them. This speeds up modern Korean processing by 15%. Together with recent previous commits, the throughput of modern Korean under \X has more than doubled, and is now comparable to other languages (which have increased themselves by 35%).
*	Rename regen'd hdr to reflect expanded capabilities	Karl Williamson	2012-09-13	1	-0/+48
	The recently added utf8_strings.h has been expanded to include more than just strings. I'm renaming it to avoid confusion.
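
Several of the commits above add constants holding the UTF-8 bytes of particular code points (the Byte Order Mark, the first surrogate). As a rough illustration of what such byte sequences look like on an ASCII platform, the sketch below encodes a code point below U+10000 by hand. It is not the method used by regen/unicode_constants.pl, and the function names `encode_utf8_bmp` and `check_known_encodings` are hypothetical, chosen only for this example. Strict UTF-8 forbids encoding surrogates; the sketch simply applies the ordinary three-byte pattern to U+D800.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: encode a code point below 0x10000 as UTF-8.
 * Writes 1 to 3 bytes into out and returns the byte count.
 * Assumes an ASCII platform; EBCDIC platforms use a different encoding. */
static size_t encode_utf8_bmp(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {                 /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Hypothetical self-check against two well-known byte sequences. */
static void check_known_encodings(void)
{
    unsigned char buf[3];

    /* U+FEFF (Byte Order Mark) is EF BB BF. */
    assert(encode_utf8_bmp(0xFEFF, buf) == 3);
    assert(buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF);

    /* U+D800 (first surrogate) under the same pattern is ED A0 80. */
    assert(encode_utf8_bmp(0xD800, buf) == 3);
    assert(buf[0] == 0xED && buf[1] == 0xA0 && buf[2] == 0x80);
}
```

Generating such strings ahead of time, as the commit messages describe, means the byte values do not have to be computed or hard-coded by hand at each point of use.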