path: root/strings
Commit message  Author  Age  Files  Lines
* Merge 10.10 into 10.11  Marko Mäkelä  2023-03-06  9  -64/+64
|\
| * A cleanup for MDEV-30695 Refactor case folding data types in Asian collations  Alexander Barkov  2023-03-03  9  -64/+64
| |   Adding "const" qualifiers to casefold_info_st::page
* | Merge 10.10 into 10.11  Marko Mäkelä  2023-02-28  23  -8793/+8543
|\ \
| |/
| * Merge 10.9 into 10.10  Marko Mäkelä  2023-02-28  1  -1/+1
| |\
| | * Merge 10.8 into 10.9  Marko Mäkelä  2023-02-28  1  -1/+1
| | |\
| | | * MDEV-30716 Wrong casefolding in xxx_unicode_520_ci for U+0700..U+07FF  Alexander Barkov  2023-02-23  1  -1/+1
| | | |   The array my_unicase_pages_unicode520[7] erroneously mapped to plane06 instead of plane07.
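The effect of a wrong page pointer can be illustrated with a minimal sketch of a two-level case-folding lookup. Everything below is hypothetical: the `demo_` names, the page contents, the 256-entry page size and the "0 means no mapping" convention are assumptions made for illustration, not the layout used in the MariaDB sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry; 0 means "no mapping" in this sketch only. */
typedef struct {
  uint32_t toupper;
  uint32_t tolower;
} demo_casefold_char;

#define DEMO_PAGE_SIZE 256

/* Made-up page contents: one caseless character per page, mapped to itself. */
static const demo_casefold_char demo_plane06[DEMO_PAGE_SIZE] = {
  [0x41] = {0x0641, 0x0641}
};
static const demo_casefold_char demo_plane07[DEMO_PAGE_SIZE] = {
  [0x41] = {0x0741, 0x0741}
};

/* Slot 7 points at the page for U+06xx, as the commit message describes. */
static const demo_casefold_char *const demo_pages_buggy[8] = {
  [6] = demo_plane06, [7] = demo_plane06
};

/* Slot 7 points at its own page. */
static const demo_casefold_char *const demo_pages_fixed[8] = {
  [6] = demo_plane06, [7] = demo_plane07
};

/* Two-level lookup: the high byte selects the page, the low byte the entry. */
static uint32_t demo_toupper(const demo_casefold_char *const *pages,
                             size_t npages, uint32_t wc)
{
  assert(pages != NULL);
  size_t page = wc >> 8;
  if (page >= npages || pages[page] == NULL)
    return wc;                       /* no page: character is unchanged */
  uint32_t up = pages[page][wc & 0xFF].toupper;
  return up ? up : wc;
}
```

With the buggy table, folding U+0741 in this sketch yields U+0641, a character from a different block; with the fixed table it stays U+0741.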
| * | | MDEV-30694: Cross building on x86_64 to arch i686 fails  Helmut Grohne  2023-02-22  1  -1/+3
| | | |   Currently cross compilation on x86_64 to arch i686 fails with error:
| | | |
| | | |     > ctype-uca1400data.h /bin/sh: 1: uca-dump: not found
| | | |
| | | |   Commit makes sure that uca-dump is treated correctly when cross compiling
| | | |   MariaDB to another architecture
| * | | MDEV-30695 Refactor case folding data types in Asian collations  Alexander Barkov  2023-02-21  19  -7842/+8168
| | | |   This is a non-functional change and should not change the server behavior.
| | | |
| | | |   Casefolding information is now stored in items of a new data type
| | | |   MY_CASEFOLD_CHARACTER:
| | | |
| | | |     typedef struct casefold_info_char_t
| | | |     {
| | | |       uint32 toupper;
| | | |       uint32 tolower;
| | | |     } MY_CASEFOLD_CHARACTER;
| | | |
| | | |   Before this change, casefolding tables for Asian collations were stored in:
| | | |
| | | |     typedef struct unicase_info_char_st
| | | |     {
| | | |       uint32 toupper;
| | | |       uint32 tolower;
| | | |       uint32 sort;
| | | |     } MY_UNICASE_CHARACTER;
| | | |
| | | |   The "sort" member was not used in the code handling Asian collations; it only
| | | |   wasted space (it is only used by the Unicode _general_ci and
| | | |   _general_mysql500_ci collations).
| | | |
| | | |   Unicode collations (at least UCA and _bin) should also be refactored later,
| | | |   but under terms of a separate task.
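The space saving mentioned above can be estimated with a small sketch. The two structure definitions are copied from the commit message; the `uint32` typedef, the 256-entries-per-page figure and the `demo_` helper are assumptions for illustration, and the sizes hold only on platforms where `uint32_t` is 4 bytes and neither structure is padded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t uint32;   /* assumption: stands in for the server's uint32 */

/* New type, as quoted in the commit message. */
typedef struct casefold_info_char_t
{
  uint32 toupper;
  uint32 tolower;
} MY_CASEFOLD_CHARACTER;

/* Old type, as quoted in the commit message. */
typedef struct unicase_info_char_st
{
  uint32 toupper;
  uint32 tolower;
  uint32 sort;
} MY_UNICASE_CHARACTER;

/* Hypothetical page size used only for this estimate. */
#define DEMO_ENTRIES_PER_PAGE 256

/* Bytes saved per page when the unused "sort" member is dropped. */
static size_t demo_bytes_saved_per_page(void)
{
  assert(sizeof(MY_UNICASE_CHARACTER) >= sizeof(MY_CASEFOLD_CHARACTER));
  return DEMO_ENTRIES_PER_PAGE *
         (sizeof(MY_UNICASE_CHARACTER) - sizeof(MY_CASEFOLD_CHARACTER));
}
```

Under these assumptions each entry shrinks from 12 to 8 bytes, so a 256-entry page is about 1 KiB smaller.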
| * | | MDEV-30692 conf_to_src is not up to date  Alexander Barkov  2023-02-21  2  -253/+266
| | | |   Fixing conf_to_src.c according to changes made by
| | | |   a206658b985fe5e18fb5692fdb3698dad5aca70a
| | | |
| | | |   Re-generating ctype-extra.c at once, to fix the indentation from manually
| | | |   edited to automatic.
| * | | MDEV-30661 UPPER() returns an empty string for U+0251 in uca1400 collations for utf8  Alexander Barkov  2023-02-17  22  -696/+105
| | | |   String length growth during upper/lower conversion in Unicode collations
| | | |   depends only on the underlying MY_UNICASE_INFO used in the collation.
| | | |
| | | |   Maintaining separate members CHARSET_INFO::caseup_multiply and
| | | |   CHARSET_INFO::casedn_multiply duplicated this information and caused bugs
| | | |   like this (when MY_UNICASE_INFO and case??_multiply went out of sync because
| | | |   of incomplete CHARSET_INFO initialization).
| | | |
| | | |   Fix: Changing CHARSET_INFO::caseup_multiply and CHARSET_INFO::casedn_multiply
| | | |   from members to virtual functions. The virtual functions in Unicode
| | | |   collations calculate case conversion growth factors from the MY_UNICASE_INFO.
| | | |   This guarantees that the growth factors are always in sync with the
| | | |   MY_UNICASE_INFO.
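The idea of replacing a stored growth factor with a function that derives it from the case-folding data can be sketched as follows. This is not the actual CHARSET_INFO or MY_UNICASE_INFO layout, which the log does not show; every `demo_` type, field and function is a hypothetical stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the case-folding description of a collation. */
typedef struct {
  unsigned caseup_growth;   /* max byte growth factor for upper-casing */
  unsigned casedn_growth;   /* max byte growth factor for lower-casing */
} demo_unicase_info;

typedef struct demo_charset demo_charset;

/* "Virtual functions": a table of function pointers per collation family. */
typedef struct {
  unsigned (*caseup_multiply)(const demo_charset *cs);
  unsigned (*casedn_multiply)(const demo_charset *cs);
} demo_handler;

struct demo_charset {
  const demo_unicase_info *casefold;  /* single source of truth */
  const demo_handler *handler;
};

/* The factor is computed from the case-folding data on every call,
   so there is no second copy that could go out of sync. */
static unsigned demo_unicode_caseup_multiply(const demo_charset *cs)
{
  assert(cs != NULL && cs->casefold != NULL);
  return cs->casefold->caseup_growth;
}

static unsigned demo_unicode_casedn_multiply(const demo_charset *cs)
{
  assert(cs != NULL && cs->casefold != NULL);
  return cs->casefold->casedn_growth;
}

static const demo_handler demo_unicode_handler = {
  demo_unicode_caseup_multiply,
  demo_unicode_casedn_multiply
};

/* Caller side: size the output buffer for an upper-case conversion. */
static size_t demo_upper_buffer_size(const demo_charset *cs, size_t src_len)
{
  return src_len * cs->handler->caseup_multiply(cs);
}
```

In this sketch, switching a collation to different case-folding data changes the reported factor automatically, which is the property the commit message is after.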
* | | | Merge 10.10 into 10.11  Marko Mäkelä  2023-02-16  1  -16/+21
|\ \ \ \
| |/ / /
| * | | Merge 10.9 into 10.10  Marko Mäkelä  2023-02-16  1  -16/+21
| |\ \ \
| | |/ /
| | * | Merge 10.8 into 10.9  Marko Mäkelä  2023-02-16  1  -16/+21
| | |\ \
| | | |/
| | | * Merge 10.6 into 10.8  Marko Mäkelä  2023-02-10  1  -16/+21
| | | |\
| | | | * Merge 10.5 into 10.6  Marko Mäkelä  2023-02-10  1  -16/+21
| | | | |\
| | | | | * Merge 10.4 into 10.5  Marko Mäkelä  2023-02-10  1  -16/+21
| | | | | |\
| | | | | | * MDEV-30556 UPPER() returns an empty string for U+0251 in Unicode-5.2.0+ collations for utf8  Alexander Barkov  2023-02-03  1  -16/+21
* | | | | | | Merge branch '10.10' into 10.11  Oleksandr Byelkin  2023-01-31  1  -1/+1
|\ \ \ \ \ \ \
| |/ / / / / /
| * | | | | | Merge branch '10.9' into 10.10  Oleksandr Byelkin  2023-01-31  1  -1/+1
| |\ \ \ \ \ \ \
| | |/ / / / /
| | * | | | | Merge branch '10.8' into 10.9  Oleksandr Byelkin  2023-01-31  1  -1/+1
| | |\ \ \ \ \ \
| | | |/ / / /
| | | * | | | Merge branch '10.7' into 10.8  Oleksandr Byelkin  2023-01-31  1  -1/+1
| | | |\ \ \ \
| | | | * \ \ \ Merge branch '10.6' into 10.7  Oleksandr Byelkin  2023-01-31  1  -1/+1
| | | | |\ \ \ \
| | | | | |/ / /
| | | | | * | | Merge branch '10.5' into 10.6  Oleksandr Byelkin  2023-01-31  1  -1/+1
| | | | | |\ \ \
| | | | | | |/ /
| | | | | | * | Merge branch '10.4' into 10.5  Oleksandr Byelkin  2023-01-27  1  -1/+1
| | | | | | |\ \
| | | | | | | |/
| | | | | | | * MDEV-26817 runtime error: index 24320 out of bounds for type 'json_string_char_classes [128]' *and* ASAN: global-buffer-overflow on address ... READ of size 4 on SELECT JSON_VALID  Sergei Golubchik  2023-01-20  1  -1/+1
| | | | | | | |   Protect from out-of-bound array access.
| | | | | | | |
| | | | | | | |   It was already done in all other places; this one was the only one missed.
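The kind of guard described above can be sketched as a range check in front of a 128-entry class table. The table contents, class names and `demo_` identifiers are hypothetical; only the table size of 128 and the out-of-range index 24320 come from the commit title.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical character classes for this sketch. */
enum demo_char_class {
  DEMO_C_ETC = 0,      /* anything without a special meaning */
  DEMO_C_QUOTE,
  DEMO_C_DIGIT,
  DEMO_C_NON_ASCII     /* returned for code points outside the table */
};

#define DEMO_CLASS_TABLE_SIZE 128

/* Class table covering only 7-bit characters, like the array in the title. */
static const unsigned char demo_char_classes[DEMO_CLASS_TABLE_SIZE] = {
  ['"'] = DEMO_C_QUOTE,
  ['0'] = DEMO_C_DIGIT, ['1'] = DEMO_C_DIGIT, ['2'] = DEMO_C_DIGIT,
  ['3'] = DEMO_C_DIGIT, ['4'] = DEMO_C_DIGIT, ['5'] = DEMO_C_DIGIT,
  ['6'] = DEMO_C_DIGIT, ['7'] = DEMO_C_DIGIT, ['8'] = DEMO_C_DIGIT,
  ['9'] = DEMO_C_DIGIT
};

/* Classify a decoded code point without ever indexing past the table. */
static enum demo_char_class demo_classify(uint32_t wc)
{
  if (wc >= DEMO_CLASS_TABLE_SIZE)
    return DEMO_C_NON_ASCII;          /* the guard: no table access at all */
  assert(wc < DEMO_CLASS_TABLE_SIZE); /* invariant for the access below */
  return (enum demo_char_class) demo_char_classes[wc];
}
```

A multi-byte character that decodes to 24320 takes the early return here instead of reading beyond the array.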
* | | | | | | | Merge 10.10 into 10.11  Marko Mäkelä  2023-01-11  1  -2/+2
|\ \ \ \ \ \ \ \
| |/ / / / / / /
| * | | | | | | Merge 10.9 into 10.10  Marko Mäkelä  2023-01-10  1  -2/+2
| |\ \ \ \ \ \ \
| | |/ / / / / /
| | * | | | | | MDEV-29381: JSON paths containing dashes are reported as syntax errors in procedures  Alexander Freiherr von Buddenbrock  2023-01-06  1  -2/+2
| | | | | | | |   MDEV-22224 caused the parsing of keys with hyphens to break by setting the
| | | | | | | |   state transitions for parsing keys to JE_SYN (syntax error) when they
| | | | | | | |   encounter a hyphen. However, JSON key names may contain hyphens and still be
| | | | | | | |   considered valid JSON.
| | | | | | | |
| | | | | | | |   This patch changes the state transition table so that key names with hyphens
| | | | | | | |   remain valid. Note that unquoted key names in paths like $.key-name are also
| | | | | | | |   valid again. This restores the previous behaviour when hyphens were
| | | | | | | |   considered part of the P_ETC character class.
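How a single entry in a state transition table decides whether `$.key-name` parses can be shown with a toy path validator. This is a deliberately tiny model, not the table from the server's JSON library: the states, character classes and all `demo_` names are invented for illustration, and only the general idea (a hyphen inside a key either leads to an error state or stays in the key state) follows the commit message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical character classes. */
enum { DEMO_P_DOLLAR, DEMO_P_DOT, DEMO_P_MINUS, DEMO_P_ETC, DEMO_P_BAD,
       DEMO_N_CLASSES };

/* Hypothetical parser states. */
enum { DEMO_S_START, DEMO_S_ROOT, DEMO_S_KEY_START, DEMO_S_KEY, DEMO_S_ERR,
       DEMO_N_STATES };

static int demo_class_of(char c)
{
  if (c == '$') return DEMO_P_DOLLAR;
  if (c == '.') return DEMO_P_DOT;
  if (c == '-') return DEMO_P_MINUS;
  if (isalnum((unsigned char) c) || c == '_') return DEMO_P_ETC;
  return DEMO_P_BAD;
}

/* Strict table: a hyphen inside a key is a syntax error. */
static const int demo_table_strict[DEMO_N_STATES][DEMO_N_CLASSES] = {
  /* START     */ { DEMO_S_ROOT, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR },
  /* ROOT      */ { DEMO_S_ERR, DEMO_S_KEY_START, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR },
  /* KEY_START */ { DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_KEY, DEMO_S_ERR },
  /* KEY       */ { DEMO_S_ERR, DEMO_S_KEY_START, DEMO_S_ERR, DEMO_S_KEY, DEMO_S_ERR },
  /* ERR       */ { DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR }
};

/* Relaxed table: identical except that KEY + MINUS stays in KEY. */
static const int demo_table_relaxed[DEMO_N_STATES][DEMO_N_CLASSES] = {
  /* START     */ { DEMO_S_ROOT, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR },
  /* ROOT      */ { DEMO_S_ERR, DEMO_S_KEY_START, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR },
  /* KEY_START */ { DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_KEY, DEMO_S_ERR },
  /* KEY       */ { DEMO_S_ERR, DEMO_S_KEY_START, DEMO_S_KEY, DEMO_S_KEY, DEMO_S_ERR },
  /* ERR       */ { DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR, DEMO_S_ERR }
};

/* Run the automaton over the path; valid paths end in ROOT or KEY. */
static int demo_path_valid(const int (*table)[DEMO_N_CLASSES], const char *path)
{
  assert(table != NULL && path != NULL);
  int state = DEMO_S_START;
  for (; *path; path++)
    state = table[state][demo_class_of(*path)];
  return state == DEMO_S_ROOT || state == DEMO_S_KEY;
}
```

The two tables differ in one cell, which is enough to flip `$.key-name` between rejected and accepted in this model.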
* | | | | | | | Merge 10.10 into 10.11  Marko Mäkelä  2022-12-14  1  -1/+4
|\ \ \ \ \ \ \ \
| |/ / / / / / /
| * | | | | | | Merge 10.9 into 10.10  Marko Mäkelä  2022-12-14  1  -1/+4
| |\ \ \ \ \ \ \ \
| | |/ / / / / /
| | * | | | | | Merge 10.8 into 10.9  Marko Mäkelä  2022-12-13  1  -1/+4
| | |\ \ \ \ \ \ \
| | | |/ / / / /
| | | * | | | | Merge 10.7 into 10.8  Marko Mäkelä  2022-12-13  1  -1/+4
| | | |\ \ \ \ \ \
| | | | |/ / / /
| | | | * | | | Merge 10.6 into 10.7  Marko Mäkelä  2022-12-13  1  -1/+4
| | | | |\ \ \ \ \
| | | | | |/ / /
| | | | | * | | Merge 10.5 into 10.6  Marko Mäkelä  2022-12-13  1  -1/+4
| | | | | |\ \ \ \
| | | | | | |/ /
| | | | | | * | Merge 10.4 into 10.5  Marko Mäkelä  2022-12-13  1  -1/+4
| | | | | | |\ \ \
| | | | | | | |/
| | | | | | | * Merge 10.3 into 10.4  Marko Mäkelä  2022-12-13  1  -1/+4
| | | | | | | |\
| | | | | | | | * MDEV-29473 UBSAN: Signed integer overflow: X * Y cannot be represented in type 'int' in strings/dtoa.c  Alexander Barkov  2022-11-17  1  -1/+4
| | | | | | | | |   Fixing a few problems revealed by UBSAN in type_float.test:
| | | | | | | | |
| | | | | | | | |   - multiplication overflow in dtoa.c
| | | | | | | | |   - uninitialized Field::geom_type (and Field::srid as well)
| | | | | | | | |   - Wrong call-back function types used in combination with SHOW_FUNC.
| | | | | | | | |     Changes in the mysql_show_var_func data type definition were not
| | | | | | | | |     properly addressed all around the code by the following commits:
| | | | | | | | |       b4ff64568c88ab3ce559e7bd39853d9cbf86704a
| | | | | | | | |       18feb62feeb833494d003615861b9c78ec008a90
| | | | | | | | |       0ee879ff8ac1b80cd9a963015344f5698a81f309
| | | | | | | | |
| | | | | | | | |     Adding a helper SHOW_FUNC_ENTRY() function and replacing all
| | | | | | | | |     mysql_show_var_func declarations using SHOW_FUNC with SHOW_FUNC_ENTRY,
| | | | | | | | |     to catch mysql_show_var_func type mismatches at compilation time in
| | | | | | | | |     the future.
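The log does not say how the multiplication in dtoa.c was changed, so the following is only a generic sketch of one way to multiply two `int` values without triggering signed overflow: do the arithmetic in a wider type and check the range first. The function name is hypothetical, and the approach assumes `long long` is wider than `int`, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
  Multiply x by y. On success store the product in *result and return 1.
  If the product does not fit in an int, leave *result untouched and return 0.
  Assumption: long long can represent every product of two int values.
*/
static int demo_mul_int_checked(int x, int y, int *result)
{
  assert(result != NULL);
  long long wide = (long long) x * (long long) y;  /* no int overflow here */
  if (wide > INT_MAX || wide < INT_MIN)
    return 0;
  *result = (int) wide;
  return 1;
}
```

A caller can then choose a fallback, such as saturating or reporting an error, instead of relying on undefined behaviour.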
| | | | | | | * | MDEV-27670 Assertion `(cs->state & 0x20000) == 0' failed in my_strnncollsp_nchars_generic_8bit  Alexander Barkov  2022-11-22  1  -2/+14
| | | | | | | | |   Also fixes:
| | | | | | | | |   MDEV-27768 MDEV-25440: Assertion `(cs->state & 0x20000) == 0' failed in
| | | | | | | | |   my_strnncollsp_nchars_generic_8bit
| | | | | | | | |
| | | | | | | | |   The "strnncollsp_nchars" virtual function pointer for tis620_thai_nopad_ci
| | | | | | | | |   was incorrectly initialized to a generic function
| | | | | | | | |   my_strnncollsp_nchars_generic_8bit(), which crashed on assert.
| | | | | | | | |
| | | | | | | | |   Implementing a tis620 specific function version.
* | | | | | | | | Merge 10.10 into 10.11  Marko Mäkelä  2022-12-07  1  -2/+14
|\ \ \ \ \ \ \ \ \
| |/ / / / / / / /
| * | | | | | | | Merge 10.9 into 10.10  Marko Mäkelä  2022-12-07  1  -2/+14
| |\ \ \ \ \ \ \ \ \
| | |/ / / / / / /
| | * | | | | | | Merge 10.8 into 10.9  Marko Mäkelä  2022-12-07  1  -2/+14
| | |\ \ \ \ \ \ \ \
| | | |/ / / / / /
| | | * | | | | | Merge 10.7 into 10.8  Marko Mäkelä  2022-12-07  1  -2/+14
| | | |\ \ \ \ \ \ \
| | | | |/ / / / /
| | | | * | | | | Merge 10.6 into 10.7  Marko Mäkelä  2022-12-07  1  -2/+14
| | | | |\ \ \ \ \ \
| | | | | |/ / / /
| | | | | * | | | Merge 10.5 into 10.6  Marko Mäkelä  2022-12-05  1  -2/+14
| | | | | |\ \ \ \ \
| | | | | | |/ / /
| | | | | | * | | Merge 10.4 into 10.5  Jan Lindström  2022-11-30  1  -2/+14
| | | | | | | | |
* | | | | | | | | Merge 10.10 into 10.11  Jan Lindström  2022-09-06  3  -72/+109
|\ \ \ \ \ \ \ \ \
| |/ / / / / / / /
| * | | | | | | | A follow-up patch MDEV-27266 Improve UCA collation performance for utf8mb3 and utf8mb4  Alexander Barkov  2022-09-02  3  -72/+109
| | | | | | | | |   Moving these members:
| | | | | | | | |
| | | | | | | | |     CHARSET_INFO *cs;
| | | | | | | | |     const MY_UCA_WEIGHT_LEVEL *level;
| | | | | | | | |
| | | | | | | | |   from my_uca_scanner to a new separate structure my_uca_scanner_param.
| | | | | | | | |
| | | | | | | | |   Rationale:
| | | | | | | | |   During a comparison of two strings these members were initialized two times
| | | | | | | | |   (one time for every string). After the change these members are initialized
| | | | | | | | |   only one time inside a shared instance of my_uca_scanner_param, and the
| | | | | | | | |   instance is shared between two scanners (its const address is passed as a
| | | | | | | | |   new parameter to the underlying scanner functions).
| | | | | | | | |
| | | | | | | | |   This change gives a slight performance improvement (~5%).
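The shape of this refactoring, per-comparison constants moved into one shared, read-only parameter block that both scanners receive by const pointer, can be sketched as follows. The types below are simplified hypothetical stand-ins (a byte-indexed weight table instead of real UCA weights), and all `demo_` names are invented; only the split into scanner state and shared parameters follows the commit message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Shared, read-only data: set up once per comparison. */
typedef struct {
  const unsigned char *weights;   /* hypothetical 256-entry weight table */
} demo_scanner_param;

/* Per-string state: the only part that exists twice. */
typedef struct {
  const unsigned char *pos;
  const unsigned char *end;
} demo_scanner;

/* Fill a toy case-insensitive weight table (stands in for collation data). */
static void demo_init_ci_weights(unsigned char table[256])
{
  for (int i = 0; i < 256; i++)
    table[i] = (unsigned char) toupper(i);
}

/* Return the next weight, or -1 at end of input. The shared parameters
   arrive as a const pointer instead of living inside each scanner. */
static int demo_next_weight(demo_scanner *s, const demo_scanner_param *param)
{
  if (s->pos == s->end)
    return -1;
  return param->weights[*s->pos++];
}

/* Compare two strings using two scanners and one shared parameter block. */
static int demo_compare(const demo_scanner_param *param,
                        const char *a, size_t alen,
                        const char *b, size_t blen)
{
  assert(param != NULL && param->weights != NULL);
  demo_scanner sa = { (const unsigned char *) a, (const unsigned char *) a + alen };
  demo_scanner sb = { (const unsigned char *) b, (const unsigned char *) b + blen };
  for (;;)
  {
    int wa = demo_next_weight(&sa, param);
    int wb = demo_next_weight(&sb, param);
    if (wa != wb)
      return wa < wb ? -1 : 1;    /* the end marker -1 sorts first */
    if (wa < 0)
      return 0;                   /* both strings ended together */
  }
}
```

Initializing each scanner now touches only its own two pointers; the shared block is filled once and reused for both strings.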
* | | | | | | | | Merge 10.10 into 10.11  Marko Mäkelä  2022-08-30  2  -58/+33
|\ \ \ \ \ \ \ \ \
| |/ / / / / / / /
| * | | | | | | | Merge 10.9 into 10.10  Marko Mäkelä  2022-08-30  2  -58/+33
| |\ \ \ \ \ \ \ \ \
| | |/ / / / / / /
| | * | | | | | | Merge 10.8 into 10.9  Marko Mäkelä  2022-08-30  2  -58/+33
| | |\ \ \ \ \ \ \ \
| | | |/ / / / / /