author | Bruno Haible <bruno@clisp.org> | 2009-04-05 16:12:24 +0200 |
---|---|---|
committer | Bruno Haible <bruno@clisp.org> | 2009-04-05 16:12:24 +0200 |
commit | d6b5eb017aef649be89aec6dcd2519844f2bd91a | |
tree | 0ffe9ee2a24dc23aada781d4abf88f4965e47bd3 /doc | |
parent | 720fc85a3daaf654e7edb05ae3c652e7d9d4b52c | |
download | libunistring-d6b5eb017aef649be89aec6dcd2519844f2bd91a.tar.gz |
Add index entries.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/libunistring.texi | 28 |
-rw-r--r-- | doc/unicase.texi | 14 |
-rw-r--r-- | doc/uniconv.texi | 2 |
-rw-r--r-- | doc/unictype.texi | 28 |
-rw-r--r-- | doc/unilbrk.texi | 3 |
-rw-r--r-- | doc/uniname.texi | 1 |
-rw-r--r-- | doc/uninorm.texi | 9 |
-rw-r--r-- | doc/uniregex.texi | 1 |
-rw-r--r-- | doc/unistdio.texi | 2 |
-rw-r--r-- | doc/unistr.texi | 16 |
-rw-r--r-- | doc/uniwbrk.texi | 2 |
-rw-r--r-- | doc/uniwidth.texi | 5 |
12 files changed, 108 insertions, 3 deletions
diff --git a/doc/libunistring.texi b/doc/libunistring.texi index 6c907de..d0eff27 100644 --- a/doc/libunistring.texi +++ b/doc/libunistring.texi @@ -248,6 +248,8 @@ case folding regular expressions (not yet implemented) @end table +@cindex use cases +@cindex value, of libunistring libunistring is for you if your application involves non-trivial text processing, such as upper/lower case conversions, line breaking, operations on words, or more advanced analysis of text. Text provided by the user can, @@ -274,6 +276,7 @@ internal in-memory representation. @node Unicode @section Unicode +@cindex Unicode Unicode is a standardized repertoire of characters that contains characters from all scripts of the world, from Latin letters to Chinese ideographs and Babylonian cuneiform glyphs. It also specifies how these characters @@ -283,6 +286,10 @@ to behave on Unicode text. Unicode also specifies three ways of storing sequences of Unicode characters in a computer whose basic unit of data is an 8-bit byte: +@cindex UTF-8 +@cindex UTF-16 +@cindex UTF-32 +@cindex UCS-4 @table @asis @item UTF-8 Every character is represented as 1 to 4 bytes. @@ -320,6 +327,7 @@ Markus Kuhn's UTF-8 and Unicode FAQ: @node Unicode and i18n @section Unicode and Internationalization +@cindex internationalization Internationalization is the process of changing the source code of a program so that it can meet the expectations of users in any culture, if culture specific data (translations, images etc.) are provided. @@ -352,12 +360,14 @@ POSIX APIs and the implementation of locales in the GNU C library. @node Locale encodings @section Locale encodings +@cindex locale A locale is a set of cultural conventions. According to POSIX, for a program, at any moment, there is one locale being designated as the ``current locale''. (Actually, POSIX supports also one locale per thread, but this feature is not -yet universally implemented and not widely used.) 
The locale is partitioned -into several aspects, called the ``categories'' of the locale. The main -various aspects are: +yet universally implemented and not widely used.) +@cindex locale categories +The locale is partitioned into several aspects, called the ``categories'' +of the locale. The main various aspects are: @itemize @item The character encoding and the character properties. This is the @@ -377,6 +387,7 @@ category. The formatting of date and time. This is the @code{LC_TIME} category. @end itemize +@cindex locale encoding In particular, the @code{LC_CTYPE} category of the current locale determines the character encoding. This is the encoding of @samp{char *} strings. We also call it the ``locale encoding''. GNU libunistring has a function, @@ -425,6 +436,7 @@ see @ref{The wchar_t mess}. @node char * strings @section @samp{char *} strings +@cindex C string functions The classical C strings, with its C library support standardized by ISO C and POSIX, can be used in internationalized programs with some precautions. The problem with this API is that many of the C library @@ -432,6 +444,7 @@ functions for strings don't work correctly on strings in locale encodings, leading to bugs that only people in some cultures of the world will experience. +@cindex locale, multibyte The first problem with the C library API is the support of multibyte locales. According to the locale encoding, in general, every character is represented by one or more bytes (up to 4 bytes in practice --- but @@ -442,6 +455,7 @@ to realize that the majority of Unix installations nowadays use UTF-8 or GB18030 as locale encoding; therefore, the majority of users are using multibyte locales. 
+@cindex char, type The important fact to remember is: @cartouche @emph{A @samp{char} is a byte, not a character.} @@ -552,6 +566,7 @@ This is implemented in this library, through the functions declared in @code{<un @node The wchar_t mess @section The @code{wchar_t} mess +@cindex wchar_t, type The ISO C and POSIX standard creators made an attempt to fix the first problem mentioned in the previous section. They introduced @itemize @@ -604,6 +619,9 @@ the program to produce garbage or abort. @section Unicode strings libunistring supports Unicode strings in three representations: +@cindex UTF-8, strings +@cindex UTF-16, strings +@cindex UTF-32, strings @itemize @item UTF-8 strings, through the type @samp{uint8_t *}. The units are bytes @@ -636,6 +654,7 @@ zero-valued unit used as ``end marker''. This chapter explains conventions valid throughout the libunistring library. +@cindex argument conventions Variables of type @code{char *} denote C strings in locale encoding. See @ref{Locale encodings}. @@ -674,6 +693,7 @@ All parameters starting with @samp{str} and the parameters of functions starting with @code{u8_str}/@code{u16_str}/@code{u32_str} denote a NUL terminated string. +@cindex return value conventions Error values are always returned through the @code{errno} variable, usually with a return value that indicates the presence of an error (NULL for functions that return an pointer, or -1 for functions that @@ -704,9 +724,11 @@ NULL is returned and @code{errno} is set. @node More functionality @chapter More advanced functionality +@cindex bidirectional reordering For bidirectional reordering of strings, we recommend the GNU FriBidi library: @url{http://www.fribidi.org/}. +@cindex rendering For the rendering of Unicode strings outside of the context of a given toolkit (KDE/Qt or GNOME/Gtk), we recommend the Pango library: @url{http://www.pango.org/}. 
diff --git a/doc/unicase.texi b/doc/unicase.texi index 6fa86c7..e4ccdf8 100644 --- a/doc/unicase.texi +++ b/doc/unicase.texi @@ -19,6 +19,7 @@ Greek sigma and the Lithuanian i correctly. @node Case mappings of characters @section Case mappings of characters +@cindex Unicode character, case mappings The following functions implement case mappings on Unicode characters --- for those cases only where the result of the mapping is a again a single Unicode character. @@ -46,6 +47,10 @@ Returns the titlecase mapping of the Unicode character @var{uc}. @node Case mappings of strings @section Case mappings of strings +@cindex case mappings +@cindex uppercasing +@cindex lowercasing +@cindex titlecasing Case mapping should always be performed on entire strings, not on individual characters. The functions in this sections do so. @@ -56,6 +61,7 @@ a character, U+00C4 @sc{LATIN CAPITAL LETTER A WITH DIAERESIS} and U+0041 @sc{LATIN CAPITAL LETTER A} U+0308 @sc{COMBINING DIAERESIS} the same. The @var{nf} argument designates the normalization. +@cindex locale language These functions are locale dependent. The @var{iso639_language} argument identifies the language (e.g. @code{"tr"} for Turkish). NULL means to use locale independent case mappings. @@ -95,6 +101,8 @@ case-mapping. It can also be NULL, for no normalization. @node Case insensitive comparison @section Case insensitive comparison +@cindex comparing, ignoring case +@cindex comparing, ignoring normalization and case The following functions implement comparison that ignores differences in case and normalization. @@ -125,6 +133,10 @@ If successful, sets @code{*@var{resultp}} to -1 if @var{s1} < @var{s2}, Upon failure, returns -1 with @code{errno} set. 
@end deftypefun +@cindex comparing, ignoring case, with collation rules +@cindex comparing, with collation rules, ignoring case +@cindex comparing, ignoring normalization and case, with collation rules +@cindex comparing, with collation rules, ignoring normalization and case The following functions additionally take into account the sorting rules of the current locale. @@ -160,6 +172,8 @@ Upon failure, returns -1 with @code{errno} set. @node Case detection @section Case detection +@cindex case detection +@cindex detecting case The following functions determine whether a Unicode string is entirely in upper case. or entirely in lower case, or entirely in title case, or already case-folded. diff --git a/doc/uniconv.texi b/doc/uniconv.texi index 8ff8ffa..08197c5 100644 --- a/doc/uniconv.texi +++ b/doc/uniconv.texi @@ -4,6 +4,7 @@ This include file declares functions for converting between Unicode strings and @code{char *} strings in locale encoding or in other specified encodings. +@cindex locale encoding The following function returns the locale encoding. @deftypefun {const char *} locale_charset () @@ -20,6 +21,7 @@ around the native @code{iconv_open} function. It may not work as an argument to the native @code{iconv_open} function directly. @end deftypefun +@cindex converting The following functions convert between strings in a specified encoding and Unicode strings. diff --git a/doc/unictype.texi b/doc/unictype.texi index 26d0d6a..2c40d8f 100644 --- a/doc/unictype.texi +++ b/doc/unictype.texi @@ -29,6 +29,9 @@ in the presence of specific Unicode characters. @node General category @section General category +@cindex general category +@cindex Unicode character, general category +@cindex Unicode character, classification Every Unicode character or code point has a @emph{general category} assigned to it. This classification is important for most algorithms that work on Unicode text. @@ -359,6 +362,8 @@ This function uses a big table comprising all general categories. 
@node Canonical combining class @section Canonical combining class +@cindex canonical combining class +@cindex Unicode character, canonical combining class Every Unicode character or code point has a @emph{canonical combining class} assigned to it. @@ -461,6 +466,8 @@ Returns the canonical combining class of a Unicode character. @node Bidirectional category @section Bidirectional category +@cindex bidirectional category +@cindex Unicode character, bidirectional category Every Unicode character or code point has a @emph{bidirectional category} assigned to it. @@ -569,6 +576,8 @@ Tests whether a Unicode character belongs to a given bidirectional category. @node Decimal digit value @section Decimal digit value +@cindex value, of Unicode character +@cindex Unicode character, value Decimal digits (like the digits from @samp{0} to @samp{9}) exist in many scripts. The following function converts a decimal digit character to its numerical value. @@ -582,6 +591,8 @@ do not represent a decimal digit. @node Digit value @section Digit value +@cindex value, of Unicode character +@cindex Unicode character, value Digit characters are like decimal digit characters, possibly in special forms, like as superscript, subscript, or circled. The following function converts a digit character to its numerical value. @@ -595,6 +606,8 @@ do not represent a digit. @node Numeric value @section Numeric value +@cindex value, of Unicode character +@cindex Unicode character, value There are also characters that represent numbers without a digit system, like the Roman numerals, and fractional numbers, like 1/4 or 3/4. @@ -620,6 +633,8 @@ characters that do not represent a number. @node Mirrored character @section Mirrored character +@cindex mirroring, of Unicode character +@cindex Unicode character, mirroring Character mirroring is used to associate the closing parenthesis character to the opening parenthesis character, the closing brace character with the closing brace character, and so on. 
@@ -635,6 +650,8 @@ stores @var{uc} unmodified in @code{*@var{puc}} and returns @code{false}. @node Properties @section Properties +@cindex properties, of Unicode character +@cindex Unicode character, properties This section defines boolean properties of Unicode characters. This means, a character either has the given property or does not have it. In other words, the property can be viewed as a subset of the set of @@ -915,6 +932,7 @@ Other miscellaneous properties are: @node Scripts @section Scripts +@cindex scripts The Unicode characters are subdivided into scripts. The following type is used to represent a script: @@ -929,6 +947,7 @@ const char *name; The @code{name} field contains the name of the script. @end deftp +@cindex Unicode character, script The following functions look up a script. @deftypefun {const uc_script_t *} uc_script (ucs4_t @var{uc}) @@ -957,6 +976,7 @@ Get the list of all scripts. Stores a pointer to an array of all scripts in @node Blocks @section Blocks +@cindex block The Unicode characters are subdivided into blocks. A block is an interval of Unicode code points. @@ -978,6 +998,7 @@ The @code{end} field is the last Unicode code point in the block. The @code{name} field is the name of the block. @end deftp +@cindex Unicode character, block The following function looks up a block. @deftypefun {const uc_block_t *} uc_block (ucs4_t @var{uc}) @@ -1000,6 +1021,9 @@ Get the list of all blocks. Stores a pointer to an array of all blocks in @node ISO C and Java syntax @section ISO C and Java syntax +@cindex C, programming language +@cindex Java, programming language +@cindex identifiers The following properties are taken from language standards. The supported language standards are ISO C 99 and Java. @@ -1035,11 +1059,13 @@ This return value (only for Java) means that the given character is ignorable. The following function determine whether a given character can be a constituent of an identifier in the given programming language. 
+@cindex Unicode character, validity in C identifiers @deftypefun int uc_c_ident_category (ucs4_t @var{uc}) Returns the categorization of a Unicode character with respect to the ISO C 99 identifier syntax. @end deftypefun +@cindex Unicode character, validity in Java identifiers @deftypefun int uc_java_ident_category (ucs4_t @var{uc}) Returns the categorization of a Unicode character with respect to the Java identifier syntax. @@ -1048,6 +1074,8 @@ identifier syntax. @node Classifications like in ISO C @section Classifications like in ISO C +@cindex C-like API +@cindex Unicode character, classification like in C The following character classifications mimic those declared in the ISO C header files @code{<ctype.h>} and @code{<wctype.h>}. These functions are deprecated, because this set of functions was designed with ASCII in mind and diff --git a/doc/unilbrk.texi b/doc/unilbrk.texi index 545bc3b..5441f31 100644 --- a/doc/unilbrk.texi +++ b/doc/unilbrk.texi @@ -1,6 +1,9 @@ @node unilbrk.h @chapter Line breaking @code{<unilbrk.h>} +@cindex line breaks +@cindex breaks, line +@cindex wrapping This include file declares functions for determining where in a string line breaks could or should be introduced, in order to make the displayed string fit into a column of given width. diff --git a/doc/uniname.texi b/doc/uniname.texi index 4124158..b3d9a38 100644 --- a/doc/uniname.texi +++ b/doc/uniname.texi @@ -1,6 +1,7 @@ @node uniname.h @chapter Names of Unicode characters @code{<uniname.h>} +@cindex Unicode character, name This include file implements the association between a Unicode character and its name. 
diff --git a/doc/uninorm.texi b/doc/uninorm.texi index 4e476e4..2903c4c 100644 --- a/doc/uninorm.texi +++ b/doc/uninorm.texi @@ -1,6 +1,8 @@ @node uninorm.h @chapter Normalization forms (composition and decomposition) @code{<uninorm.h>} +@cindex normal forms +@cindex normalizing This include file defines functions for transforming Unicode strings to one of the four normal forms, known as NFC, NFD, NKFC, NFKD. These transformations involve decomposition and --- for NFC and NFKC --- composition @@ -17,6 +19,7 @@ of Unicode characters. @node Decomposition of characters @section Decomposition of Unicode characters +@cindex decomposing The following enumerated values are the possible types of decomposition of a Unicode character. @@ -135,6 +138,8 @@ and @var{n} is returned. Otherwise -1 is returned. @node Composition of characters @section Composition of Unicode characters +@cindex composing, Unicode characters +@cindex combining, Unicode characters The following function composes a Unicode character from two Unicode characters. @@ -204,6 +209,7 @@ Returns the specified normalization form of a string. @node Normalizing comparisons @section Normalizing comparisons +@cindex comparing, ignoring normalization The following functions compare Unicode string, ignoring differences in normalization. @@ -219,6 +225,8 @@ If successful, sets @code{*@var{resultp}} to -1 if @var{s1} < @var{s2}, Upon failure, returns -1 with @code{errno} set. 
@end deftypefun +@cindex comparing, ignoring normalization, with collation rules +@cindex comparing, with collation rules, ignoring normalization @deftypefun {char *} u8_normxfrm (const uint8_t *@var{s}, size_t @var{n}, uninorm_t @var{nf}, char *@var{resultbuf}, size_t *@var{lengthp}) @deftypefunx {char *} u16_normxfrm (const uint16_t *@var{s}, size_t @var{n}, uninorm_t @var{nf}, char *@var{resultbuf}, size_t *@var{lengthp}) @deftypefunx {char *} u32_normxfrm (const uint32_t *@var{s}, size_t @var{n}, uninorm_t @var{nf}, char *@var{resultbuf}, size_t *@var{lengthp}) @@ -246,6 +254,7 @@ Upon failure, returns -1 with @code{errno} set. @node Normalization of streams @section Normalization of streams of Unicode characters +@cindex stream, normalizing a A ``stream of Unicode characters'' is essentially a function that accepts an @code{ucs4_t} argument repeatedly, optionally combined with a function that ``flushes'' the stream. diff --git a/doc/uniregex.texi b/doc/uniregex.texi index 2c2df74..ae290ff 100644 --- a/doc/uniregex.texi +++ b/doc/uniregex.texi @@ -1,4 +1,5 @@ @node uniregex.h @chapter Regular expressions @code{<uniregex.h>} +@cindex regular expression This include file is not yet implemented. diff --git a/doc/unistdio.texi b/doc/unistdio.texi index 42a1eee..e1fb9cf 100644 --- a/doc/unistdio.texi +++ b/doc/unistdio.texi @@ -1,6 +1,8 @@ @node unistdio.h @chapter Output with Unicode strings @code{<unistdio.h>} +@cindex formatted output +@cindex output, formatted This include file declares functions for doing formatted output with Unicode strings. It defines a set of functions similar to @code{fprintf} and @code{sprintf}, which are declared in @code{<stdio.h>}. diff --git a/doc/unistr.texi b/doc/unistr.texi index 32e97e9..da95413 100644 --- a/doc/unistr.texi +++ b/doc/unistr.texi @@ -15,6 +15,8 @@ essentially the equivalent of what @code{<string.h>} is for C strings. 
@node Elementary string checks @section Elementary string checks +@cindex validity +@cindex verification The following function is available to verify the integrity of a Unicode string. @deftypefun {const uint8_t *} u8_check (const uint8_t *@var{s}, size_t @var{n}) @@ -27,6 +29,7 @@ It returns NULL if valid, or a pointer to the first invalid unit otherwise. @node Elementary string conversions @section Elementary string conversions +@cindex converting The following functions perform conversions between the different forms of Unicode strings. @deftypefun {uint16_t *} u8_to_u16 (const uint8_t *@var{s}, size_t @var{n}, uint16_t *@var{resultbuf}, size_t *@var{lengthp}) @@ -56,6 +59,7 @@ Converts an UTF-32 string to an UTF-16 string. @node Elementary string functions @section Elementary string functions +@cindex iterating The following functions inspect and return details about the first character in a Unicode string. @@ -122,6 +126,7 @@ Unicode strings, @var{s} must not be NULL, and the argument @var{n} must be specified. @end deftypefun +@cindex copying The following functions copy Unicode strings in memory. @deftypefun {uint8_t *} u8_cpy (uint8_t *@var{dest}, const uint8_t *@var{src}, size_t @var{n}) @@ -155,6 +160,7 @@ This function is similar to @posixfunc{memset}, except that it operates on Unicode strings. @end deftypefun +@cindex comparing The following function compares two Unicode strings of the same length. @deftypefun int u8_cmp (const uint8_t *@var{s1}, const uint8_t *@var{s2}, size_t @var{n}) @@ -184,6 +190,7 @@ This function is similar to @func{memcmp2}, except that it operates on Unicode strings. @end deftypefun +@cindex searching, for a character The following function searches for a given Unicode character. @deftypefun {uint8_t *} u8_chr (const uint8_t *@var{s}, size_t @var{n}, ucs4_t @var{uc}) @@ -197,6 +204,7 @@ This function is similar to @posixfunc{memchr}, except that it operates on Unicode strings. 
@end deftypefun +@cindex counting The following function counts the number of Unicode characters. @deftypefun size_t u8_mbsnlen (const uint8_t *@var{s}, size_t @var{n}) @@ -212,6 +220,7 @@ it operates on Unicode strings. @node Elementary string functions with memory allocation @section Elementary string functions with memory allocation +@cindex duplicating The following function copies a Unicode string. @deftypefun {uint8_t *} u8_cpy_alloc (const uint8_t *@var{s}, size_t @var{n}) @@ -233,6 +242,7 @@ Returns the length (number of units) of the first character in @var{s}. Returns 0 if it is the NUL character. Returns -1 upon failure. @end deftypefun +@cindex iterating @deftypefun int u8_strmbtouc (ucs4_t *@var{puc}, const uint8_t *@var{s}) @deftypefunx int u16_strmbtouc (ucs4_t *@var{puc}, const uint16_t *@var{s}) @deftypefunx int u32_strmbtouc (ucs4_t *@var{puc}, const uint32_t *@var{s}) @@ -280,6 +290,7 @@ This function is similar to @posixfunc{strnlen} and @posixfunc{wcsnlen}, except that it operates on Unicode strings. @end deftypefun +@cindex copying The following functions copy portions of Unicode strings in memory. @deftypefun {uint8_t *} u8_strcpy (uint8_t *@var{dest}, const uint8_t *@var{src}) @@ -338,6 +349,7 @@ This function is similar to @posixfunc{strncat} and @posixfunc{wcsncat}, except that it operates on Unicode strings. @end deftypefun +@cindex comparing The following functions compare two Unicode strings. @deftypefun int u8_strcmp (const uint8_t *@var{s1}, const uint8_t *@var{s2}) @@ -352,6 +364,7 @@ This function is similar to @posixfunc{strcmp} and @posixfunc{wcscmp}, except that it operates on Unicode strings. 
@end deftypefun +@cindex comparing, with collation rules @deftypefun int u8_strcoll (const uint8_t *@var{s1}, const uint8_t *@var{s2}) @deftypefunx int u16_strcoll (const uint16_t *@var{s1}, const uint16_t *@var{s2}) @deftypefunx int u32_strcoll (const uint32_t *@var{s1}, const uint32_t *@var{s2}) @@ -377,6 +390,7 @@ This function is similar to @posixfunc{strncmp} and @posixfunc{wcsncmp}, except that it operates on Unicode strings. @end deftypefun +@cindex duplicating The following function allocates a duplicate of a Unicode string. @deftypefun {uint8_t *} u8_strdup (const uint8_t *@var{s}) @@ -388,6 +402,7 @@ This function is similar to @posixfunc{strdup} and @posixfunc{wcsdup}, except that it operates on Unicode strings. @end deftypefun +@cindex searching, for a character The following functions search for a given Unicode character. @deftypefun {uint8_t *} u8_strchr (const uint8_t *@var{str}, ucs4_t @var{uc}) @@ -440,6 +455,7 @@ This function is similar to @posixfunc{strpbrk} and @posixfunc{wcspbrk}, except that it operates on Unicode strings. @end deftypefun +@cindex searching, for a substring The following functions search whether a given Unicode string is a substring of another Unicode string. diff --git a/doc/uniwbrk.texi b/doc/uniwbrk.texi index 4c1a2a1..7b081fb 100644 --- a/doc/uniwbrk.texi +++ b/doc/uniwbrk.texi @@ -1,6 +1,8 @@ @node uniwbrk.h @chapter Word breaks in strings @code{<uniwbrk.h>} +@cindex word breaks +@cindex breaks, word This include file declares functions for determining where in a string ``words'' start and end. 
Here ``words'' are not necessarily the same as entities that can be looked up in dictionaries, but rather groups of diff --git a/doc/uniwidth.texi b/doc/uniwidth.texi index 8c53d04..a05d101 100644 --- a/doc/uniwidth.texi +++ b/doc/uniwidth.texi @@ -1,10 +1,12 @@ @node uniwidth.h @chapter Display width @code{<uniwidth.h>} +@cindex width This include file declares functions that return the display width, measured in columns, of characters or strings, when output to a device that uses non-proportional fonts. +@cindex ambiguous width Note that for some rarely used characters the actual fonts or terminal emulators can use a different width. There is no mechanism for communicating the display width of characters across a Unix pseudo-terminal (tty). Also, @@ -16,6 +18,9 @@ most characters but can fail to represent the actual display width. These functions are locale dependent. The @var{encoding} argument identifies the encoding (e.g@. @code{"ISO-8859-2"} for Polish). +@cindex Unicode character, width +@cindex halfwidth +@cindex fullwidth @deftypefun int uc_width (ucs4_t @var{uc}, const char *@var{encoding}) Determines and returns the number of column positions required for @var{uc}. Returns -1 if @var{uc} is a control character that has an influence on the |