author | neil <neil@138bc75d-0d04-0410-961f-82ee72b054a4> | 2000-07-15 02:56:01 +0000 |
---|---|---|
committer | neil <neil@138bc75d-0d04-0410-961f-82ee72b054a4> | 2000-07-15 02:56:01 +0000 |
commit | e15e66e7390aef4e38d576f63daf977dcc6b8665 (patch) | |
tree | 65b2f5fe89915a36b0f4ab7637d521a1ca87bb88 /gcc/README.Portability | |
parent | e6ac7206b9535a4d1fb77151b91285e32d72cc56 (diff) | |
download | gcc-e15e66e7390aef4e38d576f63daf977dcc6b8665.tar.gz |
* README.Portability: New file.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@35039 138bc75d-0d04-0410-961f-82ee72b054a4
Diffstat (limited to 'gcc/README.Portability')
-rw-r--r-- | gcc/README.Portability | 369 |
1 files changed, 369 insertions, 0 deletions
diff --git a/gcc/README.Portability b/gcc/README.Portability
new file mode 100644
index 00000000000..1c2df58edad
--- /dev/null
+++ b/gcc/README.Portability
@@ -0,0 +1,369 @@
+Copyright (C) 2000 Free Software Foundation, Inc.
+
+This file is intended to contain a few notes about writing C code
+within GCC so that it compiles without error on the full range of
+compilers GCC needs to be able to compile on.
+
+The problem is that many ISO-standard constructs are not accepted by
+either old or buggy compilers, and we keep getting bitten by them.
+This knowledge until now has been sparsely spread around, so I
+thought I'd collect it in one useful place. Please add and correct
+any problems as you come across them.
+
+I'm going to start from a base of the ISO C89 standard, since that is
+probably what most people code to naturally. Obviously using
+constructs introduced after that is not a good idea.
+
+The first section of this file deals strictly with portability issues,
+the second with common coding pitfalls.
+
+
+                          Portability Issues
+                          ==================
+
+Unary +
+-------
+
+K+R C compilers and preprocessors have no notion of unary '+'. Thus
+the following code snippet contains 2 portability problems.
+
+int x = +2;  /* int x = 2; */
+#if +1       /* #if 1 */
+#endif
+
+
+Pointers to void
+----------------
+
+K+R C compilers did not have a void pointer, and used char * as the
+pointer to anything. The macro PTR is defined as either void * or
+char * depending on whether you have a standards compliant compiler or
+a K+R one. Thus
+
+  free ((void *) h->value.expansion);
+
+should be written
+
+  free ((PTR) h->value.expansion);
+
+
+String literals
+---------------
+
+K+R C did not allow concatenation of string literals like
+
+  "This is a " "single string literal".
+
+Moreover, some compilers like MSVC++ have fairly low limits on the
+maximum length of a string literal; 509 is the lowest we've come
+across. You may need to break up a long printf statement into many
+smaller ones.
+
+
+Empty macro arguments
+---------------------
+
+ISO C (6.8.3 in the 1990 standard) specifies the following:
+
+If (before argument substitution) any argument consists of no
+preprocessing tokens, the behavior is undefined.
+
+This was relaxed by ISO C99, but some older compilers emit an error,
+so code like
+
+#define foo(x, y) x y
+foo (bar, )
+
+needs to be coded in some other way.
+
+
+signed keyword
+--------------
+
+The signed keyword did not exist in K+R compilers, it was introduced in
+ISO C89, so you cannot use it. In both K+R and standard C,
+unqualified char and bitfields may be signed or unsigned. There is no
+way to portably declare signed chars or signed bitfields.
+
+All other arithmetic types are signed unless you use the 'unsigned'
+qualifier. For instance, it is safe to write
+
+  short paramc;
+
+instead of
+
+  signed short paramc;
+
+If you have an algorithm that depends on signed char or signed
+bitfields, you must find another way to write it before it can be
+integrated into GCC.
+
+
+Function prototypes
+-------------------
+
+You need to provide a function prototype for every function before you
+use it, and functions must be defined K+R style. The function
+prototype should use the PARAMS macro, which takes a single argument.
+Therefore the parameter list must be enclosed in parentheses. For
+example,
+
+int myfunc PARAMS ((double, int *));
+
+int
+myfunc (var1, var2)
+     double var1;
+     int *var2;
+{
+  ...
+}
+
+You also need to use PARAMS when referring to function prototypes in
+other circumstances, for example see "Calling functions through
+pointers to functions" below.
+
+Variable-argument functions are best described by example:-
+
+void cpp_ice PARAMS ((cpp_reader *, const char *msgid, ...));
+
+void
+cpp_ice VPARAMS ((cpp_reader *pfile, const char *msgid, ...))
+{
+#ifndef ANSI_PROTOTYPES
+  cpp_reader *pfile;
+  const char *msgid;
+#endif
+  va_list ap;
+
+  VA_START (ap, msgid);
+
+#ifndef ANSI_PROTOTYPES
+  pfile = va_arg (ap, cpp_reader *);
+  msgid = va_arg (ap, const char *);
+#endif
+
+  ...
+  va_end (ap);
+}
+
+For the curious, here are the definitions of the above macros. See
+ansidecl.h for the definitions of the above macros and more.
+
+#define PARAMS(paramlist)  paramlist  /* ISO C. */
+#define VPARAMS(args)   args
+
+#define PARAMS(paramlist)  ()    /* K+R C. */
+#define VPARAMS(args)   (va_alist) va_dcl
+
+
+Calling functions through pointers to functions
+-----------------------------------------------
+
+K+R C compilers require brackets around the dereferenced pointer
+variable. For example
+
+typedef void (* cl_directive_handler) PARAMS ((cpp_reader *, const char *));
+      p->handler (pfile, p->arg);
+
+needs to become
+
+      (p->handler) (pfile, p->arg);
+
+
+Macros
+------
+
+The rules under K+R C and ISO C for achieving stringification and
+token pasting are quite different. Therefore some macros have been
+defined which will get it right depending upon the compiler.
+
+  CONCAT2(a,b) CONCAT3(a,b,c) and CONCAT4(a,b,c,d)
+
+will paste the tokens passed as arguments. You must not leave any
+space around the commas. Also,
+
+  STRINGX(x)
+
+will stringify an argument; to get the same result on K+R and ISO
+compilers x should not have spaces around it.
+
+
+Enums
+-----
+
+In K+R C, you have to cast enum types to use them as integers, and
+some compilers in particular give lots of warnings for using an enum
+as an array index.
+
+Bitfields
+---------
+
+See also "signed keyword" above. In K+R C only unsigned int bitfields
+were defined (i.e. unsigned char, unsigned short, unsigned long.
+Using plain int/short/long was not allowed).
+
+
+free and realloc
+----------------
+
+Some implementations crash upon attempts to free or realloc the null
+pointer. Thus if mem might be null, you need to write
+
+  if (mem)
+    free (mem);
+
+
+Reserved Keywords
+-----------------
+
+K+R C has "entry" as a reserved keyword, so you should not use it for
+your variable names.
+
+
+Type promotions
+---------------
+
+K+R used unsigned-preserving rules for arithmetic expressions, while
+ISO uses value-preserving. This means an unsigned char compared to an
+int is done as an unsigned comparison in K+R (since unsigned char
+promotes to unsigned) while it is signed in ISO (since all of the
+values in unsigned char fit in an int, it promotes to int).
+
+** Not having any argument whose type is a short type (char, short,
+float of any flavor) and subject to promotion. **
+
+Trigraphs
+---------
+
+You weren't going to use them anyway, but trigraphs were not defined
+in K+R C, and some otherwise ISO C compliant compilers do not accept
+them.
+
+
+Suffixes on Integer Constants
+-----------------------------
+
+**Using a 'u' suffix on integer constants.**
+
+
+errno
+-----
+
+errno might be declared as a macro.
+
+
+                        Common Coding Pitfalls
+                        ======================
+Implicit int
+------------
+
+In C, the 'int' keyword can often be omitted from type declarations.
+For instance, you can write
+
+  unsigned variable;
+
+as shorthand for
+
+  unsigned int variable;
+
+There are several places where this can cause trouble. First, suppose
+'variable' is a long; then you might think
+
+  (unsigned) variable
+
+would convert it to unsigned long. It does not. It converts to
+unsigned int. This mostly causes problems on 64-bit platforms, where
+long and int are not the same size.
+
+Second, if you write a function definition with no return type at
+all:
+
+  operate(a, b)
+      int a, b;
+  {
+    ...
+  }
+
+that function is expected to return int, *not* void. GCC will warn
+about this. K+R C has no problem with 'void' as a return type, so you
+need not worry about that.
+
+Implicit function declarations always have return type int. So if you
+correct the above definition to
+
+  void
+  operate(a, b)
+      int a, b;
+  ...
+
+but operate() is called above its definition, you will get an error
+about a "type mismatch with previous implicit declaration". The cure
+is to prototype all functions at the top of the file, or in an
+appropriate header.
+
+Char vs unsigned char vs int
+----------------------------
+
+In C, unqualified 'char' may be either signed or unsigned; it is the
+implementation's choice. When you are processing 7-bit ASCII, it does
+not matter. But when your program must handle arbitrary binary data,
+or fully 8-bit character sets, you have a problem. The most obvious
+issue is if you have a look-up table indexed by characters.
+
+For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
+WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be
+true. But if you read '\341' from a file and store it in a plain
+char, isalpha(c) may look up character 225, or it may look up
+character -31. And the ctype table has no entry at offset -31, so
+your program will crash. (If you're lucky.)
+
+It is wise to use unsigned char everywhere you possibly can. This
+avoids all these problems. Unfortunately, the routines in <string.h>
+take plain char arguments, so you have to remember to cast them back
+and forth - or avoid the use of strxxx() functions, which is probably
+a good idea anyway.
+
+Another common mistake is to use either char or unsigned char to
+receive the result of getc() or related stdio functions. They may
+return EOF, which is outside the range of values representable by
+char. If you use char, some legal character value may be confused
+with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
+The correct choice is int.
+
+A more subtle version of the same mistake might look like this:
+
+  unsigned char pushback[NPUSHBACK];
+  int pbidx;
+  #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
+  #define get(c) (pbidx ? pushback[--pbidx] : getchar())
+  ...
+  unget(EOF);
+
+which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
+WITH UMLAUT.
+
+
+Other common pitfalls
+---------------------
+
+o Expecting 'plain' char to be either sign or unsigned extending
+
+o Shifting an item by a negative amount or by greater than or equal to
+  the number of bits in a type (expecting shifts by 32 to be sensible
+  has caused quite a number of bugs at least in the early days).
+
+o Expecting ints shifted right to be sign extended.
+
+o Modifying the same value twice within one sequence point.
+
+o Host vs. target floating point representation, including emitting NaNs
+  and Infinities in a form that the assembler handles.
+
+o qsort being an unstable sort function (unstable in the sense that
+  multiple items that sort the same may be sorted in different orders
+  by different qsort functions).
+
+o Passing incorrect types to fprintf and friends.
+
+o Adding a function declaration for a module declared in another file to
+  a .c file instead of to a .h file.
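
The pushback pitfall described in "Char vs unsigned char vs int" can be
sketched as follows. This is only an illustration, not code from the
commit: the names unget_char, get_char and eof_through_unsigned_char are
hypothetical, and the sketch uses ISO C prototypes for brevity rather
than the PARAMS and K+R conventions the README asks for inside GCC. The
one change that matters is that the buffer holds int, so EOF survives a
round trip.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 4

/* Hypothetical pushback stack.  Elements are int, not unsigned char,
   because EOF is a negative int that no unsigned char can hold.  */
static int pushback[NPUSHBACK];
static int pbidx;

/* Push C (a character value or EOF) back onto the stack.  */
static void
unget_char (int c)
{
  assert (pbidx < NPUSHBACK);
  pushback[pbidx++] = c;
}

/* Return the most recently pushed-back value, or read from FP if the
   stack is empty.  The result is int so EOF stays distinguishable
   from every valid character value.  */
static int
get_char (FILE *fp)
{
  return pbidx ? pushback[--pbidx] : getc (fp);
}

/* For comparison: what the buggy unsigned char buffer would hand back
   after EOF has been stored in it.  The result is never negative, so
   it can never compare equal to EOF.  */
static int
eof_through_unsigned_char (void)
{
  unsigned char slot = (unsigned char) EOF;
  return slot;
}
```

With the int buffer, pushing EOF and reading it back yields EOF again.
The unsigned char version instead returns a non-negative character
value (255 on hosts where EOF is -1 and char is 8 bits wide), which is
the SMALL LETTER Y WITH UMLAUT the README mentions.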