<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/php-git.git/ext/mbstring/unicode_data.h, branch php-7.4.0RC4</title>
<subtitle>git.php.net: repository/php-src.git
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/'/>
<entry>
<title>Update data tables for Unicode 11</title>
<updated>2018-06-11T18:25:37+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2018-06-11T18:24:36+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=f2be6e732a0c18d5415b8372aee102829374545a'/>
<id>f2be6e732a0c18d5415b8372aee102829374545a</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Trailing whitespaces on ext/*</title>
<updated>2018-01-04T04:38:32+00:00</updated>
<author>
<name>Gabriel Caruso</name>
<email>carusogabriel34@gmail.com</email>
</author>
<published>2018-01-04T04:38:32+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=2238403892ccf87143a59814538d9f764509d9e7'/>
<id>2238403892ccf87143a59814538d9f764509d9e7</id>
<content type='text'>
Signed-off-by: Gabriel Caruso &lt;carusogabriel34@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Gabriel Caruso &lt;carusogabriel34@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fixed bug #65544 and #71298</title>
<updated>2017-07-28T12:57:08+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-28T12:57:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=f4a1d9c8211fa7878af14d0bd94b2deaab19ae21'/>
<id>f4a1d9c8211fa7878af14d0bd94b2deaab19ae21</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Implement full case mapping</title>
<updated>2017-07-28T10:32:50+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-27T20:48:00+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=582a65b06f3de125887cab02d5c561168fcf94bc'/>
<id>582a65b06f3de125887cab02d5c561168fcf94bc</id>
<content type='text'>
Implement full case mapping according to SpecialCasing.txt and
also full case folding according to CaseFolding.txt (F). There
are a number of caveats:

* Only language-agnostic and unconditional full case mapping
  is implemented. The only language-agnostic conditional case
  mapping rule relates to Greek sigma in final position
  (Final_Sigma). Correctly handling this requires both arbitrary
  lookahead and lookbehind, which would require some larger
  changes to how the case mapping is implemented. This is a
  possible future extension.
* The only language-specific handling that is implemented is
  for Turkish dotted/undotted Is, if the ISO-8859-9 encoding
  is used. This matches the previous behavior and ensures that
  only codepoints supported by the encoding are produced. A
  future extension would be to also handle the
  Turkish mappings specified by SpecialCasing.txt based on
  the mbfl internal language.
* Full case folding is implemented, but case-insensitive mb_*
  operations continue to use simple case folding. The reason is
  that full case folding of the haystack string may change the
  position at which a match occurred. This would have to be
  mapped back into the position in the original string.
* mb_convert_case() exposes both the full and the simple case
  mapping / folding, where full is the default. The constants
  are:

   * MB_CASE_LOWER (used by mb_strtolower)
   * MB_CASE_UPPER (used by mb_strtoupper)
   * MB_CASE_TITLE
   * MB_CASE_FOLD
   * MB_CASE_LOWER_SIMPLE
   * MB_CASE_UPPER_SIMPLE
   * MB_CASE_TITLE_SIMPLE
   * MB_CASE_FOLD_SIMPLE (used by case-insensitive operations)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement full case mapping according to SpecialCasing.txt and
also full case folding according to CaseFolding.txt (F). There
are a number of caveats:

* Only language-agnostic and unconditional full case mapping
  is implemented. The only language-agnostic conditional case
  mapping rule relates to Greek sigma in final position
  (Final_Sigma). Correctly handling this requires both arbitrary
  lookahead and lookbehind, which would require some larger
  changes to how the case mapping is implemented. This is a
  possible future extension.
* The only language-specific handling that is implemented is
  for Turkish dotted/undotted Is, if the ISO-8859-9 encoding
  is used. This matches the previous behavior and ensures that
  only codepoints supported by the encoding are produced. A
  future extension would be to also handle the
  Turkish mappings specified by SpecialCasing.txt based on
  the mbfl internal language.
* Full case folding is implemented, but case-insensitive mb_*
  operations continue to use simple case folding. The reason is
  that full case folding of the haystack string may change the
  position at which a match occurred. This would have to be
  mapped back into the position in the original string.
* mb_convert_case() exposes both the full and the simple case
  mapping / folding, where full is the default. The constants
  are:

   * MB_CASE_LOWER (used by mb_strtolower)
   * MB_CASE_UPPER (used by mb_strtoupper)
   * MB_CASE_TITLE
   * MB_CASE_FOLD
   * MB_CASE_LOWER_SIMPLE
   * MB_CASE_UPPER_SIMPLE
   * MB_CASE_TITLE_SIMPLE
   * MB_CASE_FOLD_SIMPLE (used by case-insensitive operations)
</pre>
</div>
</content>
</entry>
<entry>
<title>Use case-folding for case insensitive comparisons</title>
<updated>2017-07-28T10:32:50+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-27T18:39:14+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=9ac7c1e71d956ddac63b042be6ad8b105e584c10'/>
<id>9ac7c1e71d956ddac63b042be6ad8b105e584c10</id>
<content type='text'>
Instead of using lowercasing.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of using lowercasing.
</pre>
</div>
</content>
</entry>
<entry>
<title>Use MPH for case maps</title>
<updated>2017-07-28T10:32:50+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-25T22:06:17+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=80a0601fe52b9dddbef34a168a2c1136177bda23'/>
<id>80a0601fe52b9dddbef34a168a2c1136177bda23</id>
<content type='text'>
Instead of performing a binary search, use a hashtable to store
the case maps. In particular a minimal perfect hash construction
is used, which does not require collision resolution (but does
use an auxiliary table for the hash perturbation).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of performing a binary search, use a hashtable to store
the case maps. In particular a minimal perfect hash construction
is used, which does not require collision resolution (but does
use an auxiliary table for the hash perturbation).
</pre>
</div>
</content>
</entry>
<entry>
<title>Don't store titlecase if same as uppercase</title>
<updated>2017-07-28T10:32:50+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-25T20:35:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=eacd70f762eee90ebcc2b207cd91c49d23cafc1b'/>
<id>eacd70f762eee90ebcc2b207cd91c49d23cafc1b</id>
<content type='text'>
The totitle code already has a fallback for that case.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The totitle code already has a fallback for that case.
</pre>
</div>
</content>
</entry>
<entry>
<title>Drop implementation-specific character properties</title>
<updated>2017-07-28T10:32:50+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-25T16:59:44+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=cedfc2f426745ff7538e633f11e935db2f2a1115'/>
<id>cedfc2f426745ff7538e633f11e935db2f2a1115</id>
<content type='text'>
No point in keeping around non-standard character properties if
we're not using them and most are not even being populated.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No point in keeping around non-standard character properties if
we're not using them and most are not even being populated.
</pre>
</div>
</content>
</entry>
<entry>
<title>Handle character ranges in ucgendat generically</title>
<updated>2017-07-25T16:48:12+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-25T16:42:43+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=8ace7045e9a1f9852c172ef382f1160a55724b8a'/>
<id>8ace7045e9a1f9852c172ef382f1160a55724b8a</id>
<content type='text'>
In particular, the previous implementation did not account for
Tangut Ideographs and CJK Ideograph extensions C through F.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In particular, the previous implementation did not account for
Tangut Ideographs and CJK Ideograph extensions C through F.
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix handling of some special ranges in ucgendat</title>
<updated>2017-07-25T16:48:12+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>nikita.ppv@gmail.com</email>
</author>
<published>2017-07-25T15:15:24+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/php-git.git/commit/?id=4bd61ec7ad5460a942ab10d0ee2db45c8c21d333'/>
<id>4bd61ec7ad5460a942ab10d0ee2db45c8c21d333</id>
<content type='text'>
* Han Ideographs go up to U+9FEA.
* CJK Compatibility Ideographs are no longer specified as a special
  range in remotely recent versions of Unicode.
* Surrogate properties should be assigned to U+D800-U+DFFF, not to
  U+10000-U+1FFFF.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Han Ideographs go up to U+9FEA.
* CJK Compatibility Ideographs are no longer specified as a special
  range in remotely recent versions of Unicode.
* Surrogate properties should be assigned to U+D800-U+DFFF, not to
  U+10000-U+1FFFF.
</pre>
</div>
</content>
</entry>
</feed>
