| Commit message (Collapse) | Author | Age | Files | Lines |
| |
This is an automated commit created by the Maintenance project
https://github.com/eksperimental/maintenance
Before merging, please read the release notes by visiting
<http://www.unicode.org/versions/Unicode15.0.0/>
and assess if additional changes are necessary in the code base.
| |
Make it possible to update the binary testfile when updating version.
| |
Do not return bad codepoints such as -1.
Improve the guards and check that the code raises errors for bad input
in list strings.
| |
Category can be useful to user programs, for example in terminal
handling.
Loaded code size increases by 21% with these two commits.
Beam file size increases by 23%.
| |
From a non-East-Asian perspective we can limit the number of wide
codepoints to ~120 ranges.
We lose a lot of 'width' information but can keep the file size
small. It should be OK to increase the file size by that amount.
This is useful when editing fixed-width characters to count the number
of columns displayed. Wide characters should take 2 columns in a standard
terminal.
Emoji presentation sequences are not validated. If the presentation
selector is included, the sequence is assumed to be correct and wide.
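
As an illustration of the column counting described above, here is a minimal Python sketch. It is not the Erlang implementation: it uses Python's `unicodedata.east_asian_width` as a stand-in for the reduced range table, and the function name `display_columns` is hypothetical. Like the commit, it does not validate emoji presentation sequences and simply treats a character followed by the presentation selector as wide.

```python
import unicodedata

# Emoji presentation selector (variation selector-16).
VS16 = "\uFE0F"


def display_columns(text: str) -> int:
    """Estimate how many terminal columns `text` occupies.

    Assumptions of this sketch: wide/fullwidth characters take 2 columns,
    combining marks and format characters take 0, everything else takes 1.
    """
    columns = 0
    last_width = 0
    for ch in text:
        if ch == VS16 and last_width == 1:
            # Not validated: the preceding character is assumed to form a
            # correct emoji presentation sequence and is widened to 2.
            columns += 1
            last_width = 2
            continue
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue  # zero-width: contributes no columns
        width = 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        columns += width
        last_width = width
    return columns
```

For example, `display_columns("abc")` counts one column per character, while each CJK ideograph counts as two.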
|
|\
| |
| |
| | |
stdlib: Keep the tail of the strings passed to string:next_grapheme/1
OTP-18009
|
| | |
|
| | |
|
| | |
|
|/ |
| |
Could expand a binary to a list for too many elements.
Fix and add tests.
| |
Fixed a bug in slice that could wrongly return <<>> for non-utf8 binary input.
Also give a better error reason when non-utf8 binaries are given as
input to some functions.
| |
Unroll some of the functions returning codepoints and grapheme clusters.
| |
The unicode_util:cp() function handles deep lists faster by returning
the rest of the input more balanced to the right than before.
| |
Update input files for the code-generator and tests.
Added emoji-data.txt for the new rule on how to handle emoji.
Unicode has simplified the rules for emoji grapheme clusters:
From:
GB10 (E_Base | EBG) Extend* × E_Modifier
GB11 ZWJ × (Glue_After_Zwj | EBG)
To:
GB11 \p{Extended_Pictographic} Extend* ZWJ × \p{Extended_Pictographic}
Update the code generator to handle the new way.
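
The new GB11 rule can be sketched as a look-behind check. The Python below is only an illustration, not the generated Erlang code: the function name `gb11_no_break` is hypothetical, and the two property sets are small hand-picked samples standing in for the full tables derived from emoji-data.txt and the grapheme break property data.

```python
from typing import Sequence

ZWJ = 0x200D

# Illustrative samples only (assumed here, not the full Unicode tables).
SAMPLE_EXT_PICT = {0x1F468, 0x1F469, 0x1F467, 0x2764}
SAMPLE_EXTEND = set(range(0x1F3FB, 0x1F400)) | {0xFE0F}


def gb11_no_break(before: Sequence[int], after: int) -> bool:
    """Return True if GB11 forbids a break between `before` and `after`.

    GB11: \\p{Extended_Pictographic} Extend* ZWJ x \\p{Extended_Pictographic}
    `before` holds the codepoints to the left of the candidate boundary.
    """
    if after not in SAMPLE_EXT_PICT:
        return False
    if not before or before[-1] != ZWJ:
        return False
    # Walk left over any Extend codepoints that precede the ZWJ.
    i = len(before) - 2
    while i >= 0 and before[i] in SAMPLE_EXTEND:
        i -= 1
    # The run must start with an Extended_Pictographic codepoint.
    return i >= 0 and before[i] in SAMPLE_EXT_PICT
```

With these samples, a man emoji followed by ZWJ and a woman emoji stays in one cluster, whereas a plain letter followed by ZWJ does not glue onto the following emoji under this rule.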
|
|\
| |
| |
| |
| |
| |
| | |
* maint:
Avoid failing measurements testcases on slow machines
stdlib: string optimize special case for ASCII
stdlib: Minor unicode_util opts
|
| |
| |
| |
| | |
Exit early for Latin-1
|
|\ \
| | | |
* siri/string-new-api: (28 commits)
hipe (test): Do not use deprecated functions in string(3)
dialyzer (test): Do not use deprecated functions in string(3)
eunit (test): Do not use deprecated functions in string(3)
system (test): Do not use deprecated functions in string(3)
system (test): Do not use deprecated functions in string(3)
mnesia (test): Do not use deprecated functions in string(3)
Deprecate old string functions
observer: Do not use deprecated functions in string(3)
common_test: Do not use deprecated functions in string(3)
eldap: Do not use deprecated functions in string(3)
et: Do not use deprecated functions in string(3)
os_mon: Do not use deprecated functions in string(3)
debugger: Do not use deprecated functions in string(3)
runtime_tools: Do not use deprecated functions in string(3)
asn1: Do not use deprecated functions in string(3)
compiler: Do not use deprecated functions in string(3)
sasl: Do not use deprecated functions in string(3)
reltool: Do not use deprecated functions in string(3)
kernel: Do not use deprecated functions in string(3)
hipe: Do not use deprecated functions in string(3)
...
Conflicts:
lib/eunit/src/eunit_lib.erl
lib/observer/src/crashdump_viewer.erl
lib/reltool/src/reltool_target.erl
|
| |/
| |
| |
| | |
They should not be used.
|
|/ |
| |
Prior to this patch, the normalization functions in the
unicode module would raise a function clause error for
non-utf8 binaries.
This patch changes them to return {error, SoFar, Invalid},
as characters_to_binary and characters_to_list do in
the unicode module.
Note that string:next_codepoint/1 and string:next_grapheme/1 had
to be changed accordingly and also return an error tuple.
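
The shape of the {error, SoFar, Invalid} result can be illustrated with a short Python sketch. This is an analogy only, not the Erlang code: the function name `decode_or_error` is hypothetical, and it performs plain UTF-8 decoding rather than normalization. It shows the idea of returning the part processed so far together with the remaining invalid input instead of raising.

```python
def decode_or_error(data: bytes):
    """Decode UTF-8, or return ("error", so_far, invalid) on bad input.

    `so_far` is the text decoded before the first invalid byte and
    `invalid` is the rest of the input starting at that byte.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as err:
        so_far = data[:err.start].decode("utf-8")
        invalid = data[err.start:]
        return ("error", so_far, invalid)
```

Callers can then match on the result and decide how to handle the invalid tail, rather than catching an exception.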
|
|
A base for Unicode functions, not intended to be a user API.
Whitespace returns a reasonable subset of whitespace characters,
excluding the no-break ones.
Implementation notes:
Generate function clauses instead of using arrays, and store tuples
instead of maps, to save space.
|