author Bruno Haible <bruno@clisp.org> 2009-04-05 16:12:24 +0200
committer Bruno Haible <bruno@clisp.org> 2009-04-05 16:12:24 +0200
commit d6b5eb017aef649be89aec6dcd2519844f2bd91a
tree 0ffe9ee2a24dc23aada781d4abf88f4965e47bd3 /doc
parent 720fc85a3daaf654e7edb05ae3c652e7d9d4b52c
Add index entries.
Diffstat (limited to 'doc')
-rw-r--r-- doc/libunistring.texi | 28
-rw-r--r-- doc/unicase.texi | 14
-rw-r--r-- doc/uniconv.texi | 2
-rw-r--r-- doc/unictype.texi | 28
-rw-r--r-- doc/unilbrk.texi | 3
-rw-r--r-- doc/uniname.texi | 1
-rw-r--r-- doc/uninorm.texi | 9
-rw-r--r-- doc/uniregex.texi | 1
-rw-r--r-- doc/unistdio.texi | 2
-rw-r--r-- doc/unistr.texi | 16
-rw-r--r-- doc/uniwbrk.texi | 2
-rw-r--r-- doc/uniwidth.texi | 5
12 files changed, 108 insertions, 3 deletions
diff --git a/doc/libunistring.texi b/doc/libunistring.texi
index 6c907de..d0eff27 100644
--- a/doc/libunistring.texi
+++ b/doc/libunistring.texi
@@ -248,6 +248,8 @@ case folding
regular expressions (not yet implemented)
@end table
+@cindex use cases
+@cindex value, of libunistring
libunistring is for you if your application involves non-trivial text
processing, such as upper/lower case conversions, line breaking, operations
on words, or more advanced analysis of text. Text provided by the user can,
@@ -274,6 +276,7 @@ internal in-memory representation.
@node Unicode
@section Unicode
+@cindex Unicode
Unicode is a standardized repertoire of characters that contains characters
from all scripts of the world, from Latin letters to Chinese ideographs
and Babylonian cuneiform glyphs. It also specifies how these characters
@@ -283,6 +286,10 @@ to behave on Unicode text.
Unicode also specifies three ways of storing sequences of Unicode
characters in a computer whose basic unit of data is an 8-bit byte:
+@cindex UTF-8
+@cindex UTF-16
+@cindex UTF-32
+@cindex UCS-4
@table @asis
@item UTF-8
Every character is represented as 1 to 4 bytes.
@@ -320,6 +327,7 @@ Markus Kuhn's UTF-8 and Unicode FAQ:
@node Unicode and i18n
@section Unicode and Internationalization
+@cindex internationalization
Internationalization is the process of changing the source code of a program
so that it can meet the expectations of users in any culture, if culture
specific data (translations, images etc.) are provided.
@@ -352,12 +360,14 @@ POSIX APIs and the implementation of locales in the GNU C library.
@node Locale encodings
@section Locale encodings
+@cindex locale
A locale is a set of cultural conventions. According to POSIX, for a program,
at any moment, there is one locale being designated as the ``current locale''.
(Actually, POSIX supports also one locale per thread, but this feature is not
-yet universally implemented and not widely used.) The locale is partitioned
-into several aspects, called the ``categories'' of the locale. The main
-various aspects are:
+yet universally implemented and not widely used.)
+@cindex locale categories
+The locale is partitioned into several aspects, called the ``categories''
+of the locale. The main various aspects are:
@itemize
@item
The character encoding and the character properties. This is the
@@ -377,6 +387,7 @@ category.
The formatting of date and time. This is the @code{LC_TIME} category.
@end itemize
+@cindex locale encoding
In particular, the @code{LC_CTYPE} category of the current locale determines
the character encoding. This is the encoding of @samp{char *} strings.
We also call it the ``locale encoding''. GNU libunistring has a function,
@@ -425,6 +436,7 @@ see @ref{The wchar_t mess}.
@node char * strings
@section @samp{char *} strings
+@cindex C string functions
The classical C strings, with its C library support standardized by
ISO C and POSIX, can be used in internationalized programs with some
precautions. The problem with this API is that many of the C library
@@ -432,6 +444,7 @@ functions for strings don't work correctly on strings in locale
encodings, leading to bugs that only people in some cultures of the
world will experience.
+@cindex locale, multibyte
The first problem with the C library API is the support of multibyte
locales. According to the locale encoding, in general, every character
is represented by one or more bytes (up to 4 bytes in practice --- but
@@ -442,6 +455,7 @@ to realize that the majority of Unix installations nowadays use UTF-8
or GB18030 as locale encoding; therefore, the majority of users are
using multibyte locales.
+@cindex char, type
The important fact to remember is:
@cartouche
@emph{A @samp{char} is a byte, not a character.}
@@ -552,6 +566,7 @@ This is implemented in this library, through the functions declared in @code{<un
@node The wchar_t mess
@section The @code{wchar_t} mess
+@cindex wchar_t, type
The ISO C and POSIX standard creators made an attempt to fix the first
problem mentioned in the previous section. They introduced
@itemize
@@ -604,6 +619,9 @@ the program to produce garbage or abort.
@section Unicode strings
libunistring supports Unicode strings in three representations:
+@cindex UTF-8, strings
+@cindex UTF-16, strings
+@cindex UTF-32, strings
@itemize
@item
UTF-8 strings, through the type @samp{uint8_t *}. The units are bytes
@@ -636,6 +654,7 @@ zero-valued unit used as ``end marker''.
This chapter explains conventions valid throughout the libunistring library.
+@cindex argument conventions
Variables of type @code{char *} denote C strings in locale encoding.
See @ref{Locale encodings}.
@@ -674,6 +693,7 @@ All parameters starting with @samp{str} and the parameters of
functions starting with @code{u8_str}/@code{u16_str}/@code{u32_str}
denote a NUL terminated string.
+@cindex return value conventions
Error values are always returned through the @code{errno} variable,
usually with a return value that indicates the presence of an error
(NULL for functions that return a pointer, or -1 for functions that
@@ -704,9 +724,11 @@ NULL is returned and @code{errno} is set.
@node More functionality
@chapter More advanced functionality
+@cindex bidirectional reordering
For bidirectional reordering of strings, we recommend the GNU FriBidi library:
@url{http://www.fribidi.org/}.
+@cindex rendering
For the rendering of Unicode strings outside of the context of a given toolkit
(KDE/Qt or GNOME/Gtk), we recommend the Pango library:
@url{http://www.pango.org/}.
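The cartouche in the libunistring.texi hunks above states that a char is a byte, not a character. The following is a minimal sketch of what that means for UTF-8 data, assuming the input is valid UTF-8. The helper name sketch_utf8_char_count is hypothetical and is not part of libunistring or the C library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libunistring: counts the characters
   (code points) in a NUL-terminated string that is assumed to be valid
   UTF-8.  Every byte that is not a continuation byte (bit pattern
   10xxxxxx) starts a new character.  */
static size_t
sketch_utf8_char_count (const char *s)
{
  size_t count = 0;

  assert (s != NULL);
  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      count++;
  return count;
}
```

For the string "h\xC3\xA4" "llo" (the word "hällo" in UTF-8), this helper counts 5 characters, whereas strlen counts 6 bytes, because U+00E4 occupies two bytes.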
diff --git a/doc/unicase.texi b/doc/unicase.texi
index 6fa86c7..e4ccdf8 100644
--- a/doc/unicase.texi
+++ b/doc/unicase.texi
@@ -19,6 +19,7 @@ Greek sigma and the Lithuanian i correctly.
@node Case mappings of characters
@section Case mappings of characters
+@cindex Unicode character, case mappings
The following functions implement case mappings on Unicode characters ---
for those cases only where the result of the mapping is again a single
Unicode character.
@@ -46,6 +47,10 @@ Returns the titlecase mapping of the Unicode character @var{uc}.
@node Case mappings of strings
@section Case mappings of strings
+@cindex case mappings
+@cindex uppercasing
+@cindex lowercasing
+@cindex titlecasing
Case mapping should always be performed on entire strings, not on individual
characters. The functions in this section do so.
@@ -56,6 +61,7 @@ a character, U+00C4 @sc{LATIN CAPITAL LETTER A WITH DIAERESIS} and
U+0041 @sc{LATIN CAPITAL LETTER A} U+0308 @sc{COMBINING DIAERESIS} the same.
The @var{nf} argument designates the normalization.
+@cindex locale language
These functions are locale dependent. The @var{iso639_language} argument
identifies the language (e.g. @code{"tr"} for Turkish). NULL means to use
locale independent case mappings.
@@ -95,6 +101,8 @@ case-mapping. It can also be NULL, for no normalization.
@node Case insensitive comparison
@section Case insensitive comparison
+@cindex comparing, ignoring case
+@cindex comparing, ignoring normalization and case
The following functions implement comparison that ignores differences in case
and normalization.
@@ -125,6 +133,10 @@ If successful, sets @code{*@var{resultp}} to -1 if @var{s1} < @var{s2},
Upon failure, returns -1 with @code{errno} set.
@end deftypefun
+@cindex comparing, ignoring case, with collation rules
+@cindex comparing, with collation rules, ignoring case
+@cindex comparing, ignoring normalization and case, with collation rules
+@cindex comparing, with collation rules, ignoring normalization and case
The following functions additionally take into account the sorting rules of the
current locale.
@@ -160,6 +172,8 @@ Upon failure, returns -1 with @code{errno} set.
@node Case detection
@section Case detection
+@cindex case detection
+@cindex detecting case
The following functions determine whether a Unicode string is entirely in
upper case, or entirely in lower case, or entirely in title case, or already
case-folded.
diff --git a/doc/uniconv.texi b/doc/uniconv.texi
index 8ff8ffa..08197c5 100644
--- a/doc/uniconv.texi
+++ b/doc/uniconv.texi
@@ -4,6 +4,7 @@
This include file declares functions for converting between Unicode strings
and @code{char *} strings in locale encoding or in other specified encodings.
+@cindex locale encoding
The following function returns the locale encoding.
@deftypefun {const char *} locale_charset ()
@@ -20,6 +21,7 @@ around the native @code{iconv_open} function. It may not work as an argument
to the native @code{iconv_open} function directly.
@end deftypefun
+@cindex converting
The following functions convert between strings in a specified encoding and
Unicode strings.
diff --git a/doc/unictype.texi b/doc/unictype.texi
index 26d0d6a..2c40d8f 100644
--- a/doc/unictype.texi
+++ b/doc/unictype.texi
@@ -29,6 +29,9 @@ in the presence of specific Unicode characters.
@node General category
@section General category
+@cindex general category
+@cindex Unicode character, general category
+@cindex Unicode character, classification
Every Unicode character or code point has a @emph{general category} assigned
to it. This classification is important for most algorithms that work on
Unicode text.
@@ -359,6 +362,8 @@ This function uses a big table comprising all general categories.
@node Canonical combining class
@section Canonical combining class
+@cindex canonical combining class
+@cindex Unicode character, canonical combining class
Every Unicode character or code point has a @emph{canonical combining class}
assigned to it.
@@ -461,6 +466,8 @@ Returns the canonical combining class of a Unicode character.
@node Bidirectional category
@section Bidirectional category
+@cindex bidirectional category
+@cindex Unicode character, bidirectional category
Every Unicode character or code point has a @emph{bidirectional category}
assigned to it.
@@ -569,6 +576,8 @@ Tests whether a Unicode character belongs to a given bidirectional category.
@node Decimal digit value
@section Decimal digit value
+@cindex value, of Unicode character
+@cindex Unicode character, value
Decimal digits (like the digits from @samp{0} to @samp{9}) exist in many
scripts. The following function converts a decimal digit character to its
numerical value.
@@ -582,6 +591,8 @@ do not represent a decimal digit.
@node Digit value
@section Digit value
+@cindex value, of Unicode character
+@cindex Unicode character, value
Digit characters are like decimal digit characters, possibly in special forms,
such as superscript, subscript, or circled. The following function converts a
digit character to its numerical value.
@@ -595,6 +606,8 @@ do not represent a digit.
@node Numeric value
@section Numeric value
+@cindex value, of Unicode character
+@cindex Unicode character, value
There are also characters that represent numbers without a digit system, like
the Roman numerals, and fractional numbers, like 1/4 or 3/4.
@@ -620,6 +633,8 @@ characters that do not represent a number.
@node Mirrored character
@section Mirrored character
+@cindex mirroring, of Unicode character
+@cindex Unicode character, mirroring
Character mirroring is used to associate the closing parenthesis character
to the opening parenthesis character, the closing brace character with the
opening brace character, and so on.
@@ -635,6 +650,8 @@ stores @var{uc} unmodified in @code{*@var{puc}} and returns @code{false}.
@node Properties
@section Properties
+@cindex properties, of Unicode character
+@cindex Unicode character, properties
This section defines boolean properties of Unicode characters. This
means, a character either has the given property or does not have it.
In other words, the property can be viewed as a subset of the set of
@@ -915,6 +932,7 @@ Other miscellaneous properties are:
@node Scripts
@section Scripts
+@cindex scripts
The Unicode characters are subdivided into scripts.
The following type is used to represent a script:
@@ -929,6 +947,7 @@ const char *name;
The @code{name} field contains the name of the script.
@end deftp
+@cindex Unicode character, script
The following functions look up a script.
@deftypefun {const uc_script_t *} uc_script (ucs4_t @var{uc})
@@ -957,6 +976,7 @@ Get the list of all scripts. Stores a pointer to an array of all scripts in
@node Blocks
@section Blocks
+@cindex block
The Unicode characters are subdivided into blocks. A block is an interval of
Unicode code points.
@@ -978,6 +998,7 @@ The @code{end} field is the last Unicode code point in the block.
The @code{name} field is the name of the block.
@end deftp
+@cindex Unicode character, block
The following function looks up a block.
@deftypefun {const uc_block_t *} uc_block (ucs4_t @var{uc})
@@ -1000,6 +1021,9 @@ Get the list of all blocks. Stores a pointer to an array of all blocks in
@node ISO C and Java syntax
@section ISO C and Java syntax
+@cindex C, programming language
+@cindex Java, programming language
+@cindex identifiers
The following properties are taken from language standards. The supported
language standards are ISO C 99 and Java.
@@ -1035,11 +1059,13 @@ This return value (only for Java) means that the given character is ignorable.
The following functions determine whether a given character can be a constituent
of an identifier in the given programming language.
+@cindex Unicode character, validity in C identifiers
@deftypefun int uc_c_ident_category (ucs4_t @var{uc})
Returns the categorization of a Unicode character with respect to the ISO C 99
identifier syntax.
@end deftypefun
+@cindex Unicode character, validity in Java identifiers
@deftypefun int uc_java_ident_category (ucs4_t @var{uc})
Returns the categorization of a Unicode character with respect to the Java
identifier syntax.
@@ -1048,6 +1074,8 @@ identifier syntax.
@node Classifications like in ISO C
@section Classifications like in ISO C
+@cindex C-like API
+@cindex Unicode character, classification like in C
The following character classifications mimic those declared in the ISO C
header files @code{<ctype.h>} and @code{<wctype.h>}. These functions are
deprecated, because this set of functions was designed with ASCII in mind and
diff --git a/doc/unilbrk.texi b/doc/unilbrk.texi
index 545bc3b..5441f31 100644
--- a/doc/unilbrk.texi
+++ b/doc/unilbrk.texi
@@ -1,6 +1,9 @@
@node unilbrk.h
@chapter Line breaking @code{<unilbrk.h>}
+@cindex line breaks
+@cindex breaks, line
+@cindex wrapping
This include file declares functions for determining where in a string
line breaks could or should be introduced, in order to make the displayed
string fit into a column of given width.
diff --git a/doc/uniname.texi b/doc/uniname.texi
index 4124158..b3d9a38 100644
--- a/doc/uniname.texi
+++ b/doc/uniname.texi
@@ -1,6 +1,7 @@
@node uniname.h
@chapter Names of Unicode characters @code{<uniname.h>}
+@cindex Unicode character, name
This include file implements the association between a Unicode character and
its name.
diff --git a/doc/uninorm.texi b/doc/uninorm.texi
index 4e476e4..2903c4c 100644
--- a/doc/uninorm.texi
+++ b/doc/uninorm.texi
@@ -1,6 +1,8 @@
@node uninorm.h
@chapter Normalization forms (composition and decomposition) @code{<uninorm.h>}
+@cindex normal forms
+@cindex normalizing
This include file defines functions for transforming Unicode strings to one
of the four normal forms, known as NFC, NFD, NFKC, NFKD. These
transformations involve decomposition and --- for NFC and NFKC --- composition
@@ -17,6 +19,7 @@ of Unicode characters.
@node Decomposition of characters
@section Decomposition of Unicode characters
+@cindex decomposing
The following enumerated values are the possible types of decomposition of a
Unicode character.
@@ -135,6 +138,8 @@ and @var{n} is returned. Otherwise -1 is returned.
@node Composition of characters
@section Composition of Unicode characters
+@cindex composing, Unicode characters
+@cindex combining, Unicode characters
The following function composes a Unicode character from two Unicode
characters.
@@ -204,6 +209,7 @@ Returns the specified normalization form of a string.
@node Normalizing comparisons
@section Normalizing comparisons
+@cindex comparing, ignoring normalization
The following functions compare Unicode strings, ignoring differences in
normalization.
@@ -219,6 +225,8 @@ If successful, sets @code{*@var{resultp}} to -1 if @var{s1} < @var{s2},
Upon failure, returns -1 with @code{errno} set.
@end deftypefun
+@cindex comparing, ignoring normalization, with collation rules
+@cindex comparing, with collation rules, ignoring normalization
@deftypefun {char *} u8_normxfrm (const uint8_t *@var{s}, size_t @var{n}, uninorm_t @var{nf}, char *@var{resultbuf}, size_t *@var{lengthp})
@deftypefunx {char *} u16_normxfrm (const uint16_t *@var{s}, size_t @var{n}, uninorm_t @var{nf}, char *@var{resultbuf}, size_t *@var{lengthp})
@deftypefunx {char *} u32_normxfrm (const uint32_t *@var{s}, size_t @var{n}, uninorm_t @var{nf}, char *@var{resultbuf}, size_t *@var{lengthp})
@@ -246,6 +254,7 @@ Upon failure, returns -1 with @code{errno} set.
@node Normalization of streams
@section Normalization of streams of Unicode characters
+@cindex stream, normalizing a
A ``stream of Unicode characters'' is essentially a function that accepts an
@code{ucs4_t} argument repeatedly, optionally combined with a function that
``flushes'' the stream.
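The last uninorm.texi hunk above describes a stream of Unicode characters as a function that is called once per character, optionally paired with a flush function. The following is a minimal sketch of that shape, not the libunistring implementation: it is a pass-through filter that only buffers characters and performs no normalization. All names (sketch_stream_func, sketch_filter, sketch_count_sink, SKETCH_BUFSIZE) are hypothetical, and uint32_t stands in for ucs4_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A stream is a function that accepts one character per call.
   It returns 0 on success and -1 on failure.  */
typedef int (*sketch_stream_func) (void *stream_data, uint32_t uc);

#define SKETCH_BUFSIZE 8

/* Hypothetical filter: holds characters back until it is flushed
   or its buffer is full, then forwards them to the next stream.  */
struct sketch_filter
{
  sketch_stream_func next;   /* downstream function */
  void *next_data;           /* opaque argument for 'next' */
  uint32_t buf[SKETCH_BUFSIZE];
  size_t buffered;           /* number of characters held back */
};

static void
sketch_filter_init (struct sketch_filter *f,
                    sketch_stream_func next, void *next_data)
{
  assert (f != NULL && next != NULL);
  f->next = next;
  f->next_data = next_data;
  f->buffered = 0;
}

/* Forwards all buffered characters downstream.  After a failure the
   filter state is unspecified in this sketch.  */
static int
sketch_filter_flush (struct sketch_filter *f)
{
  size_t i;

  for (i = 0; i < f->buffered; i++)
    if (f->next (f->next_data, f->buf[i]) < 0)
      return -1;
  f->buffered = 0;
  return 0;
}

/* Accepts one character; flushes first if the buffer is full.  */
static int
sketch_filter_write (struct sketch_filter *f, uint32_t uc)
{
  if (f->buffered == SKETCH_BUFSIZE && sketch_filter_flush (f) < 0)
    return -1;
  f->buf[f->buffered++] = uc;
  return 0;
}

/* Example downstream stream: counts the characters it receives.  */
static int
sketch_count_sink (void *stream_data, uint32_t uc)
{
  (void) uc;
  ++*(size_t *) stream_data;
  return 0;
}
```

A filter holds characters back because, in a real normalizer, a later combining character can change what must be emitted for an earlier one. That is why a separate flush step is needed at the end of the input.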
diff --git a/doc/uniregex.texi b/doc/uniregex.texi
index 2c2df74..ae290ff 100644
--- a/doc/uniregex.texi
+++ b/doc/uniregex.texi
@@ -1,4 +1,5 @@
@node uniregex.h
@chapter Regular expressions @code{<uniregex.h>}
+@cindex regular expression
This include file is not yet implemented.
diff --git a/doc/unistdio.texi b/doc/unistdio.texi
index 42a1eee..e1fb9cf 100644
--- a/doc/unistdio.texi
+++ b/doc/unistdio.texi
@@ -1,6 +1,8 @@
@node unistdio.h
@chapter Output with Unicode strings @code{<unistdio.h>}
+@cindex formatted output
+@cindex output, formatted
This include file declares functions for doing formatted output with Unicode
strings. It defines a set of functions similar to @code{fprintf} and
@code{sprintf}, which are declared in @code{<stdio.h>}.
diff --git a/doc/unistr.texi b/doc/unistr.texi
index 32e97e9..da95413 100644
--- a/doc/unistr.texi
+++ b/doc/unistr.texi
@@ -15,6 +15,8 @@ essentially the equivalent of what @code{<string.h>} is for C strings.
@node Elementary string checks
@section Elementary string checks
+@cindex validity
+@cindex verification
The following function is available to verify the integrity of a Unicode string.
@deftypefun {const uint8_t *} u8_check (const uint8_t *@var{s}, size_t @var{n})
@@ -27,6 +29,7 @@ It returns NULL if valid, or a pointer to the first invalid unit otherwise.
@node Elementary string conversions
@section Elementary string conversions
+@cindex converting
The following functions perform conversions between the different forms of Unicode strings.
@deftypefun {uint16_t *} u8_to_u16 (const uint8_t *@var{s}, size_t @var{n}, uint16_t *@var{resultbuf}, size_t *@var{lengthp})
@@ -56,6 +59,7 @@ Converts an UTF-32 string to an UTF-16 string.
@node Elementary string functions
@section Elementary string functions
+@cindex iterating
The following functions inspect and return details about the first character
in a Unicode string.
@@ -122,6 +126,7 @@ Unicode strings, @var{s} must not be NULL, and the argument @var{n} must be
specified.
@end deftypefun
+@cindex copying
The following functions copy Unicode strings in memory.
@deftypefun {uint8_t *} u8_cpy (uint8_t *@var{dest}, const uint8_t *@var{src}, size_t @var{n})
@@ -155,6 +160,7 @@ This function is similar to @posixfunc{memset}, except that it operates on
Unicode strings.
@end deftypefun
+@cindex comparing
The following function compares two Unicode strings of the same length.
@deftypefun int u8_cmp (const uint8_t *@var{s1}, const uint8_t *@var{s2}, size_t @var{n})
@@ -184,6 +190,7 @@ This function is similar to @func{memcmp2}, except that it operates on
Unicode strings.
@end deftypefun
+@cindex searching, for a character
The following function searches for a given Unicode character.
@deftypefun {uint8_t *} u8_chr (const uint8_t *@var{s}, size_t @var{n}, ucs4_t @var{uc})
@@ -197,6 +204,7 @@ This function is similar to @posixfunc{memchr}, except that it operates on
Unicode strings.
@end deftypefun
+@cindex counting
The following function counts the number of Unicode characters.
@deftypefun size_t u8_mbsnlen (const uint8_t *@var{s}, size_t @var{n})
@@ -212,6 +220,7 @@ it operates on Unicode strings.
@node Elementary string functions with memory allocation
@section Elementary string functions with memory allocation
+@cindex duplicating
The following function copies a Unicode string.
@deftypefun {uint8_t *} u8_cpy_alloc (const uint8_t *@var{s}, size_t @var{n})
@@ -233,6 +242,7 @@ Returns the length (number of units) of the first character in @var{s}.
Returns 0 if it is the NUL character. Returns -1 upon failure.
@end deftypefun
+@cindex iterating
@deftypefun int u8_strmbtouc (ucs4_t *@var{puc}, const uint8_t *@var{s})
@deftypefunx int u16_strmbtouc (ucs4_t *@var{puc}, const uint16_t *@var{s})
@deftypefunx int u32_strmbtouc (ucs4_t *@var{puc}, const uint32_t *@var{s})
@@ -280,6 +290,7 @@ This function is similar to @posixfunc{strnlen} and @posixfunc{wcsnlen}, except
that it operates on Unicode strings.
@end deftypefun
+@cindex copying
The following functions copy portions of Unicode strings in memory.
@deftypefun {uint8_t *} u8_strcpy (uint8_t *@var{dest}, const uint8_t *@var{src})
@@ -338,6 +349,7 @@ This function is similar to @posixfunc{strncat} and @posixfunc{wcsncat}, except
that it operates on Unicode strings.
@end deftypefun
+@cindex comparing
The following functions compare two Unicode strings.
@deftypefun int u8_strcmp (const uint8_t *@var{s1}, const uint8_t *@var{s2})
@@ -352,6 +364,7 @@ This function is similar to @posixfunc{strcmp} and @posixfunc{wcscmp}, except
that it operates on Unicode strings.
@end deftypefun
+@cindex comparing, with collation rules
@deftypefun int u8_strcoll (const uint8_t *@var{s1}, const uint8_t *@var{s2})
@deftypefunx int u16_strcoll (const uint16_t *@var{s1}, const uint16_t *@var{s2})
@deftypefunx int u32_strcoll (const uint32_t *@var{s1}, const uint32_t *@var{s2})
@@ -377,6 +390,7 @@ This function is similar to @posixfunc{strncmp} and @posixfunc{wcsncmp}, except
that it operates on Unicode strings.
@end deftypefun
+@cindex duplicating
The following function allocates a duplicate of a Unicode string.
@deftypefun {uint8_t *} u8_strdup (const uint8_t *@var{s})
@@ -388,6 +402,7 @@ This function is similar to @posixfunc{strdup} and @posixfunc{wcsdup}, except
that it operates on Unicode strings.
@end deftypefun
+@cindex searching, for a character
The following functions search for a given Unicode character.
@deftypefun {uint8_t *} u8_strchr (const uint8_t *@var{str}, ucs4_t @var{uc})
@@ -440,6 +455,7 @@ This function is similar to @posixfunc{strpbrk} and @posixfunc{wcspbrk}, except
that it operates on Unicode strings.
@end deftypefun
+@cindex searching, for a substring
The following functions search whether a given Unicode string is a substring
of another Unicode string.
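The unistr.texi hunks above mention conversions between the UTF-8, UTF-16, and UTF-32 forms of Unicode strings. As a rough illustration of the per-character step behind a UTF-32 to UTF-8 conversion, here is a minimal sketch that encodes one code point as 1 to 4 bytes. The name sketch_utf8_encode is hypothetical; this is not the libunistring code, and uint32_t stands in for ucs4_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring: stores the UTF-8
   encoding of the code point UC in BUF, which must have room for
   4 bytes.  Returns the number of bytes written, or -1 if UC is a
   surrogate or lies beyond U+10FFFF.  */
static int
sketch_utf8_encode (uint8_t *buf, uint32_t uc)
{
  assert (buf != NULL);
  if (uc < 0x80)
    {
      /* ASCII: one byte, 0xxxxxxx.  */
      buf[0] = (uint8_t) uc;
      return 1;
    }
  if (uc < 0x800)
    {
      /* Two bytes: 110xxxxx 10xxxxxx.  */
      buf[0] = (uint8_t) (0xC0 | (uc >> 6));
      buf[1] = (uint8_t) (0x80 | (uc & 0x3F));
      return 2;
    }
  if (uc < 0x10000)
    {
      /* Surrogates U+D800..U+DFFF are not valid characters.  */
      if (uc >= 0xD800 && uc < 0xE000)
        return -1;
      /* Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.  */
      buf[0] = (uint8_t) (0xE0 | (uc >> 12));
      buf[1] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      buf[2] = (uint8_t) (0x80 | (uc & 0x3F));
      return 3;
    }
  if (uc < 0x110000)
    {
      /* Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.  */
      buf[0] = (uint8_t) (0xF0 | (uc >> 18));
      buf[1] = (uint8_t) (0x80 | ((uc >> 12) & 0x3F));
      buf[2] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      buf[3] = (uint8_t) (0x80 | (uc & 0x3F));
      return 4;
    }
  return -1;
}
```

For example, U+20AC EURO SIGN is encoded as the three bytes E2 82 AC, and U+00E4 as the two bytes C3 A4.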
diff --git a/doc/uniwbrk.texi b/doc/uniwbrk.texi
index 4c1a2a1..7b081fb 100644
--- a/doc/uniwbrk.texi
+++ b/doc/uniwbrk.texi
@@ -1,6 +1,8 @@
@node uniwbrk.h
@chapter Word breaks in strings @code{<uniwbrk.h>}
+@cindex word breaks
+@cindex breaks, word
This include file declares functions for determining where in a string
``words'' start and end. Here ``words'' are not necessarily the same as
entities that can be looked up in dictionaries, but rather groups of
diff --git a/doc/uniwidth.texi b/doc/uniwidth.texi
index 8c53d04..a05d101 100644
--- a/doc/uniwidth.texi
+++ b/doc/uniwidth.texi
@@ -1,10 +1,12 @@
@node uniwidth.h
@chapter Display width @code{<uniwidth.h>}
+@cindex width
This include file declares functions that return the display width, measured
in columns, of characters or strings, when output to a device that uses
non-proportional fonts.
+@cindex ambiguous width
Note that for some rarely used characters the actual fonts or terminal
emulators can use a different width. There is no mechanism for communicating
the display width of characters across a Unix pseudo-terminal (tty). Also,
@@ -16,6 +18,9 @@ most characters but can fail to represent the actual display width.
These functions are locale dependent. The @var{encoding} argument identifies
the encoding (e.g@. @code{"ISO-8859-2"} for Polish).
+@cindex Unicode character, width
+@cindex halfwidth
+@cindex fullwidth
@deftypefun int uc_width (ucs4_t @var{uc}, const char *@var{encoding})
Determines and returns the number of column positions required for @var{uc}.
Returns -1 if @var{uc} is a control character that has an influence on the