[doc] Update about "case insensitive" and issue with Turkish locales

for "I" / "i".

* mpfr.texi: added "with the rules of the C locale" in the mpfr_strtofr
  description.
* README.dev: completed information about Turkish locales.

git-svn-id: https://scm.gforge.inria.fr/anonscm/svn/mpfr/trunk@14505 280ebfd0-de03-0410-8827-d642c229c3f4
author: vlefevre <vlefevre@280ebfd0-de03-0410-8827-d642c229c3f4> 2021-04-26 12:17:08 +0000
committer: vlefevre <vlefevre@280ebfd0-de03-0410-8827-d642c229c3f4> 2021-04-26 12:17:08 +0000
commit: c7171788844166655de22f8bf356408afb659f77 (patch)
tree: 6320341b2b4a6eb9678f9679a2223ad50727dbea /doc
parent: c6395d0fbc71b64f4a940f85e7ea1fd00cdca876 (diff)
download: mpfr-c7171788844166655de22f8bf356408afb659f77.tar.gz
2 files changed, 18 insertions, 5 deletions
diff --git a/doc/README.dev b/doc/README.dev
index d833c477d..caa8255ac 100644
--- a/doc/README.dev
+++ b/doc/README.dev
@@ -872,10 +872,16 @@ Conversely, do not use locale-dependent functions when the result must
 not depend on the locales. In particular, the alphanumeric characters
 used in number strings (as created by mpfr_get_str) must be those of
 the required characters from the basic character set (see ISO C99
-standard Section 5.2.1 "Character sets"). And tolower(letter) does
-not necessarily return the corresponding lowercase letter from these
-required characters. For instance, tolower('I') returns a dotless 'i'
-in Turkish tr_TR.iso88599 locales.
+standard Section 5.2.1 "Character sets").
+
+Note that in Turkish locales on some systems:
+  * the uppercase version of "i" is "İ" (an "I" with a dot above);
+  * the lowercase version of "I" is "ı" (a dotless "i").
+These characters are available in ISO-8859-9, thus as "char" in the
+tr_TR.iso88599 locale. However, in UTF-8, they are not available as
+(8-bit) "char"; thus toupper('i') gives 'i' and tolower('I') gives 'I'.
+So, when writing code and testing, these two encodings need to be
+considered, as they can give different behaviors.
 
 ===========================================================================
 
diff --git a/doc/mpfr.texi b/doc/mpfr.texi
index 18b23f908..31e141b74 100644
--- a/doc/mpfr.texi
+++ b/doc/mpfr.texi
@@ -1540,7 +1540,8 @@ stops at the character @samp{0}, thus 0 is read.
 Special data (for infinities and NaN) can be @samp{@@inf@@} or
 @samp{@@nan@@(n-char-sequence-opt)}, and if @math{@var{base} @le{} 16},
 it can also be @samp{infinity}, @samp{inf}, @samp{nan} or
-@samp{nan(n-char-sequence-opt)}, all case insensitive.
+@samp{nan(n-char-sequence-opt)}, all case insensitive with the rules of
+the C locale.
 A @samp{n-char-sequence-opt} is a possibly empty string containing only digits,
 Latin letters and the underscore (0, 1, 2, @dots{}, 9, a, b, @dots{}, z,
 A, B, @dots{}, Z, _). Note: one has an optional sign for all data, even
@@ -1548,6 +1549,12 @@ NaN@.
 For example, @samp{-@@nAn@@(This_Is_Not_17)} is a valid representation for NaN
 in base 17.
 
+@c Note about the "case insensitive with the rules of the C locale":
+@c The reason is that in Turkish locales on some systems, the uppercase
+@c version of "i" is an "I" with a dot above, and the lowercase version
+@c of "I" is a dotless "i". We do not follow these rules here.
+@c See README.dev for additional information.
+
 @end deftypefun
 
 @deftypefun void mpfr_set_nan (mpfr_t @var{x})
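
The "case insensitive with the rules of the C locale" wording above can be illustrated with a small sketch. This is not MPFR's actual implementation; the helper names c_locale_tolower and c_locale_strcaseeq are hypothetical. The idea is to map letters using only the 26 Latin letters of the basic character set, so that the result does not depend on setlocale() and is the same in tr_TR.iso88599 and in a UTF-8 Turkish locale.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not taken from MPFR): return the lowercase form of c
   using only the 26 Latin letters of the basic character set. The current
   locale is never consulted, so 'I' always maps to 'i'. */
static int
c_locale_tolower (int c)
{
  static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
  const char *p;

  if (c == '\0')
    return c;  /* avoid matching the terminating null of upper[] */
  p = strchr (upper, c);
  return p != NULL ? lower[p - upper] : c;
}

/* Hypothetical helper: return 1 if s1 and s2 are equal when case is ignored
   with the rules above, 0 otherwise. */
static int
c_locale_strcaseeq (const char *s1, const char *s2)
{
  while (*s1 != '\0' && *s2 != '\0')
    {
      if (c_locale_tolower ((unsigned char) *s1)
          != c_locale_tolower ((unsigned char) *s2))
        return 0;
      s1++;
      s2++;
    }
  /* Equal only if both strings ended at the same position. */
  return *s1 == *s2;
}
```

With these helpers, c_locale_strcaseeq ("INF", "inf") returns 1 whatever locale has been selected, whereas a comparison built on tolower() could fail in a Turkish ISO-8859-9 locale, where the lowercase of "I" may be the dotless "ı".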