summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorTaylor R Campbell <campbell+cairographics.org@mumble.net>2022-04-05 12:51:33 +0000
committerTaylor R Campbell <campbell+cairographics.org@mumble.net>2023-03-29 09:04:46 +0000
commitb1ccd521835abd2af2c9833f2b155e32d3ee282d (patch)
treefd0341f090a935da89ff5c0a4a6111e676742b9b
parentd2b44eccb7d72fd8a2c0c511682ea3cc7f572179 (diff)
downloadcairo-b1ccd521835abd2af2c9833f2b155e32d3ee282d.tar.gz
Avoid misuse of ctype(3) functions
The ctype(3) character classification and mapping functions have a peculiarly limited definition (C11, Sec. 7.4 `Character handling <ctype.h>', p. 200): `The header <ctype.h> declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.' In other words, in the most common case of 8-bit char and EOF = -1, the domain of the 257 allowed arguments is: -1, 0, 1, 2, ..., 254, 255 The ctype(3) functions are designed for use with stdio functions like getchar and fgetc which return int values in the same domain. In an ABI where char is signed (e.g., x86 SysV ABI used by most Unixish operating systems), passing an argument of type char as is can go wrong in two ways: 1. The value of a non-EOF input octet interpreted as `char' may coincide, as an integer, with the value of EOF, leading to wrong answers for some non-EOF inputs. E.g., if EOF = 1, and an input octet has all bits set, i.e., 255 as an unsigned char, then as a char the value is -1, which will be confused with EOF. In the ISO-8859-1 locale, the code point 255 is (in Unicode terminology) LATIN SMALL LETTER Y WITH DIAERESIS, for which isprint, isalpha, &c., are true. But isprint, isalpha, &c., are false for EOF. So if char *s points to a string with that character, isprint(*s) will return false when it should return true. 2. Passing a negative char whose value does not coincide with EOF is undefined behaviour. This isn't purely theoretical: often the functions are implemented by an array lookup, #define isprint(c) (ctypetab[c] & ISPRINT). If c is out of range (e.g., 192, ISO-8859-1 for LATIN CAPITAL LETTER A WITH GRAVE, which convers to (signed) char as -64), then you can get garbage answers by reading uninitialized memory or application crashes with SIGSEGV if the page preceding the table is unmapped. If what you have is an arbitrary char (e.g., from a char * string pointing at user input), then the only correct way to use the ctype(3) functions is by converting to unsigned char first -- e.g., isprint((unsigned char)*s). (If the functions were defined as macros that convert to unsigned char first, they would then spuriously interpret EOF as a non-EOF, so they can't do that themselves.) It is possible, in some cases, to prove that the input always actually lies in {0, 1, 2, ..., 127}, so the conversion to unsigned char is not necessary. I didn't check whether this was the case -- it's safer to just adopt the habit of always casting char to unsigned char first before using the ctype(3) macros, which satisfies a compiler warning on some systems designed to detect this class of application errors at compile-time.
-rw-r--r--perf/cairo-analyse-trace.c4
-rw-r--r--perf/cairo-perf-trace.c4
-rw-r--r--test/cairo-test-trace.c4
3 files changed, 6 insertions, 6 deletions
diff --git a/perf/cairo-analyse-trace.c b/perf/cairo-analyse-trace.c
index 6dbe7cf4b..dda33a2e0 100644
--- a/perf/cairo-analyse-trace.c
+++ b/perf/cairo-analyse-trace.c
@@ -290,11 +290,11 @@ read_excludes (cairo_perf_t *perf,
/* whitespace delimits */
s = line;
- while (*s != '\0' && isspace (*s))
+ while (*s != '\0' && isspace ((unsigned char)*s))
s++;
t = s;
- while (*t != '\0' && ! isspace (*t))
+ while (*t != '\0' && ! isspace ((unsigned char)*t))
t++;
if (s != t) {
diff --git a/perf/cairo-perf-trace.c b/perf/cairo-perf-trace.c
index cfabcaad0..991a8a5e0 100644
--- a/perf/cairo-perf-trace.c
+++ b/perf/cairo-perf-trace.c
@@ -407,11 +407,11 @@ read_excludes (cairo_perf_t *perf,
/* whitespace delimits */
s = line;
- while (*s != '\0' && isspace (*s))
+ while (*s != '\0' && isspace ((unsigned char)*s))
s++;
t = s;
- while (*t != '\0' && ! isspace (*t))
+ while (*t != '\0' && ! isspace ((unsigned char)*t))
t++;
if (s != t) {
diff --git a/test/cairo-test-trace.c b/test/cairo-test-trace.c
index 3ca82c4b7..78e822cf1 100644
--- a/test/cairo-test-trace.c
+++ b/test/cairo-test-trace.c
@@ -1515,11 +1515,11 @@ read_excludes (test_trace_t *test, const char *filename)
/* whitespace delimits */
s = line;
- while (*s != '\0' && isspace (*s))
+ while (*s != '\0' && isspace ((unsigned char)*s))
s++;
t = s;
- while (*t != '\0' && ! isspace (*t))
+ while (*t != '\0' && ! isspace ((unsigned char)*t))
t++;
if (s != t) {