@node The wchar_t mess @appendix The @code{wchar_t} mess @cindex wchar_t, type The ISO C and POSIX standard creators made an attempt to fix the first problem mentioned in the section @ref{char * strings}. They introduced @itemize @bullet @item a type @samp{wchar_t}, designed to encapsulate an entire character, @item a ``wide string'' type @samp{wchar_t *}, and @item functions declared in @posixheader{wctype.h} that were meant to supplant the ones in @posixheader{ctype.h}. @end itemize Unfortunately, this API and its implementation has numerous problems: @itemize @bullet @item On AIX and Windows platforms, @code{wchar_t} is a 16-bit type. This means that it can never accommodate an entire Unicode character. Either the @code{wchar_t *} strings are limited to characters in UCS-2 (the ``Basic Multilingual Plane'' of Unicode), or --- if @code{wchar_t *} strings are encoded in UTF-16 --- a @code{wchar_t} represents only half of a character in the worst case, making the @posixheader{wctype.h} functions pointless. @item On Solaris and FreeBSD, the @code{wchar_t} encoding is locale dependent and undocumented. This means, if you want to know any property of a @code{wchar_t} character, other than the properties defined by @posixheader{wctype.h} --- such as whether it's a dash, currency symbol, paragraph separator, or similar ---, you have to convert it to @code{char *} encoding first, by use of the function @posixfunc{wctomb}. @item When you read a stream of wide characters, through the functions @posixfunc{fgetwc} and @posixfunc{fgetws}, and when the input stream/file is not in the expected encoding, you have no way to determine the invalid byte sequence and do some corrective action. If you use these functions, your program becomes ``garbage in - more garbage out'' or ``garbage in - abort''. @end itemize As a consequence, it is better to use multibyte strings, as explained in the section @ref{char * strings}. Such multibyte strings can bypass limitations of the @code{wchar_t} type, if you use functions defined in gnulib and libunistring for text processing. They can also faithfully transport malformed characters that were present in the input, without requiring the program to produce garbage or abort.