Copyright (C) 2000, 2003 Free Software Foundation, Inc.

This file is intended to contain a few notes about writing C code
within GCC so that it compiles without error on the full range of
compilers GCC needs to be able to compile on.

The problem is that many ISO-standard constructs are not accepted by
either old or buggy compilers, and we keep getting bitten by them.
Until now this knowledge has been sparsely spread around, so I
thought I'd collect it in one useful place.  Please add to it and
correct any problems as you come across them.

I'm going to start from a base of the ISO C90 standard, since that is
probably what most people code to naturally.  Obviously using
constructs introduced after that is not a good idea.

For the complete coding style conventions used in GCC, please read
http://gcc.gnu.org/codingconventions.html


String literals
---------------

Irix6 "cc -n32" and OSF4 "cc" have problems with constant string
initializers that have parens around them, e.g.

  const char string[] = ("A string");

This is unfortunate since this is what the GNU gettext macro N_
produces.  You need to find a different way to code it.

Some compilers like MSVC++ have fairly low limits on the maximum
length of a string literal; 509 is the lowest we've come across.  You
may need to break up a long printf statement into many smaller ones.
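
As a rough illustration of that last point, a long message can be
emitted as several short literals instead of one large one.  The
function name print_usage and the message text are made up for this
sketch; they are not taken from GCC.

```c
#include <stdio.h>

/* Hypothetical helper: print a usage message as several short
   string literals, each well under the 509-character limit, rather
   than as one huge literal.  Returns 0 on success, -1 on a write
   error.  */
int
print_usage (FILE *stream)
{
  if (fputs ("Usage: prog [options] file...\n", stream) == EOF)
    return -1;
  if (fputs ("  -o FILE   write output to FILE\n", stream) == EOF)
    return -1;
  if (fputs ("  -v        print verbose diagnostics\n", stream) == EOF)
    return -1;
  return 0;
}
```

A caller would simply write print_usage (stderr).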


Empty macro arguments
---------------------

ISO C (6.8.3 in the 1990 standard) specifies the following:

  If (before argument substitution) any argument consists of no
  preprocessing tokens, the behavior is undefined.

This was relaxed by ISO C99, but some older compilers emit an error,
so code like

  #define foo(x, y) x y
  foo (bar, )

needs to be coded in some other way.
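
One possible way around it, sketched here with invented macro names,
is to provide a separate one-argument macro for the case where the
second argument would have been empty, so that no macro is ever
invoked with an empty argument.

```c
/* Two-argument form, in the spirit of foo above: expands to
   "qual type".  The names are hypothetical.  */
#define QUALIFIED(qual, type) qual type

/* One-argument form for use when there is no qualifier to pass.  */
#define UNQUALIFIED(type) type

/* Both invocations supply every argument, so they are valid C90.  */
QUALIFIED (const, int) limit = 10;
UNQUALIFIED (int) counter = 3;
```

Whether this is practical depends on how many call sites need the
empty form; it is only one of several ways to restructure such code.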


free and realloc
----------------

Some implementations crash upon attempts to free or realloc the null
pointer.  Thus if mem might be null, you need to write

  if (mem)
    free (mem);
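
The same guard can be wrapped up once and reused.  The sketch below
assumes two helper names, safe_free and safe_realloc, which are
invented for this note; the realloc wrapper falls back to malloc when
the pointer is null.

```c
#include <stdlib.h>

/* Hypothetical wrapper: free MEM unless it is null.  */
void
safe_free (void *mem)
{
  if (mem)
    free (mem);
}

/* Hypothetical wrapper: resize MEM to SIZE bytes.  When MEM is null,
   call malloc instead, so realloc never sees a null pointer.  */
void *
safe_realloc (void *mem, size_t size)
{
  if (mem)
    return realloc (mem, size);
  return malloc (size);
}
```

As with plain realloc, the caller still has to check the result for
null before using it.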


Trigraphs
---------

You weren't going to use them anyway, but some otherwise ISO C
compliant compilers do not accept trigraphs.


Suffixes on Integer Constants
-----------------------------

You should never use an 'l' suffix on integer constants ('L' is fine),
since it can easily be confused with the number '1'.


			Common Coding Pitfalls
			======================

errno
-----

errno might be declared as a macro, so include <errno.h> rather than
declaring it yourself (for instance with 'extern int errno;').
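
A minimal sketch of code that works whether or not errno is a macro
follows.  The helper parse_long is hypothetical; the only point is
that errno is obtained from <errno.h> and never declared by hand.

```c
#include <errno.h>
#include <stdlib.h>

/* Do not write "extern int errno;" here: if <errno.h> defines errno
   as a macro, such a declaration may fail to compile or may refer
   to the wrong object.  */

/* Hypothetical helper: parse TEXT as a decimal long.  Returns 0 and
   stores the value in *RESULT on success; returns -1 if TEXT has no
   digits or the value is out of range.  */
int
parse_long (const char *text, long *result)
{
  char *end;
  long value;

  errno = 0;
  value = strtol (text, &end, 10);
  if (errno == ERANGE || end == text)
    return -1;
  *result = value;
  return 0;
}
```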


Implicit int
------------

In C, the 'int' keyword can often be omitted from type declarations.
For instance, you can write

  unsigned variable;

as shorthand for

  unsigned int variable;

There are several places where this can cause trouble.  First, suppose
'variable' is a long; then you might think

  (unsigned) variable

would convert it to unsigned long.  It does not.  It converts to
unsigned int.  This mostly causes problems on 64-bit platforms, where
long and int are not the same size.
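
To make the difference concrete, here is a small sketch with two
made-up functions.  On a platform where long is wider than int they
return different results for large or negative inputs; where the two
types have the same size they agree.

```c
/* Narrowing cast: '(unsigned)' means '(unsigned int)', so the high
   bits of VALUE are dropped wherever long is wider than int.  */
unsigned long
cast_unsigned (long value)
{
  return (unsigned) value;
}

/* Intended cast: the full type is spelled out, so no bits are
   lost.  */
unsigned long
cast_unsigned_long (long value)
{
  return (unsigned long) value;
}
```

For example, cast_unsigned (-1L) yields the largest unsigned int,
while cast_unsigned_long (-1L) yields the largest unsigned long.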

Second, if you write a function definition with no return type at
all:

  operate (int a, int b)
  {
    ...
  }

that function is expected to return int, *not* void.  GCC will warn
about this.

Implicit function declarations always have return type int.  So if you
correct the above definition to

  void
  operate (int a, int b)
  ...

but operate() is called above its definition, you will get an error
about a "type mismatch with previous implicit declaration".  The cure
is to prototype all functions at the top of the file, or in an
appropriate header.
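
A sketch of that cure, with an invented body for operate and an
invented caller run_operate, might look like this:

```c
/* Prototype first, so every call below sees the real return type
   instead of an implicit 'int' declaration.  */
static void operate (int a, int b);

/* Hypothetical state, only here so the sketch does something.  */
static int last_result;

/* Calls operate() above its definition; this is fine because of the
   prototype at the top of the file.  */
int
run_operate (int a, int b)
{
  operate (a, b);
  return last_result;
}

static void
operate (int a, int b)
{
  last_result = a + b;
}
```

For a function used from more than one file, the prototype would go
in a header instead of at the top of the .c file.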


Char vs unsigned char vs int
----------------------------

In C, unqualified 'char' may be either signed or unsigned; it is the
implementation's choice.  When you are processing 7-bit ASCII, it does
not matter.  But when your program must handle arbitrary binary data,
or fully 8-bit character sets, you have a problem.  The most obvious
issue is if you have a look-up table indexed by characters.

For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
WITH ACUTE ACCENT.  In the proper locale, isalpha('\341') will be
true.  But if you read '\341' from a file and store it in a plain
char, isalpha(c) may look up character 225, or it may look up
character -31.  And the ctype table has no entry at offset -31, so
your program will crash.  (If you're lucky.)

It is wise to use unsigned char everywhere you possibly can.  This
avoids all these problems.  Unfortunately, the routines in <string.h>
take plain char arguments, so you have to remember to cast them back
and forth - or avoid the use of strxxx() functions, which is probably
a good idea anyway.
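
When the data does arrive as plain char, the usual remedy is to cast
to unsigned char at the point of the ctype call.  The function
count_alpha below is a made-up example of that pattern.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the alphabetic bytes among the first
   LEN bytes of BUF.  The cast guarantees that isalpha never sees a
   negative value, whatever the signedness of plain char.  */
size_t
count_alpha (const char *buf, size_t len)
{
  size_t i;
  size_t n = 0;

  for (i = 0; i < len; i++)
    if (isalpha ((unsigned char) buf[i]))
      n++;
  return n;
}
```

Whether a byte such as '\341' is counted depends on the current
locale, but with the cast the call is at least well defined.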

Another common mistake is to use either char or unsigned char to
receive the result of getc() or related stdio functions.  They may
return EOF, which is outside the range of values representable by
char.  If you use char, some legal character value may be confused
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
If you use unsigned char, the result can never compare equal to EOF,
so end of file goes undetected.  The correct choice is int.
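
A minimal read loop that gets this right is sketched below; the
function name count_bytes is invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes remaining in STREAM.  C is
   an int, so EOF stays distinct from every byte value, including
   '\377'.  */
long
count_bytes (FILE *stream)
{
  int c;
  long n = 0;

  while ((c = getc (stream)) != EOF)
    n++;
  return n;
}
```

With 'char c' instead, the loop could stop early at a '\377' byte;
with 'unsigned char c' it could fail to stop at all.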

A more subtle version of the same mistake might look like this:

  unsigned char pushback[NPUSHBACK];
  int pbidx;
  #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
  #define get(c) (pbidx ? pushback[--pbidx] : getchar())
  ...
  unget(EOF);

which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
WITH UMLAUT.
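
One way to repair it is to store the pushed-back values as int, as in
this sketch.  The value of NPUSHBACK is arbitrary here, and get is
written without its unused parameter so that it is never invoked with
an empty argument.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 4   /* Arbitrary size chosen for this sketch.  */

/* int, not unsigned char, so that a pushed-back EOF is preserved.  */
static int pushback[NPUSHBACK];
static int pbidx;

#define unget(c) (assert (pbidx < NPUSHBACK), pushback[pbidx++] = (c))
#define get() (pbidx ? pushback[--pbidx] : getchar ())
```

After unget (EOF), the next get () now returns EOF rather than 255.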


Other common pitfalls
---------------------

o Expecting plain 'char' to be either signed or unsigned, and so to
  sign-extend or zero-extend when widened.

o Shifting an item by a negative amount or by greater than or equal to
  the number of bits in a type (expecting shifts by 32 to be sensible
  has caused quite a number of bugs at least in the early days).

o Expecting negative ints shifted right to be sign-extended.

o Modifying the same object twice between two sequence points.

o Host vs. target floating point representation, including emitting NaNs
  and Infinities in a form that the assembler handles.

o qsort being an unstable sort function (unstable in the sense that
  multiple items that sort the same may be sorted in different orders
  by different qsort functions).

o Passing incorrect types to fprintf and friends.

o Adding a declaration for a function defined in another file to a
  .c file instead of to a .h file.
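
For the qsort item above, one common workaround is to make the
comparison function a total order, so that no two distinct elements
ever compare equal.  The sketch below assumes a made-up record type
whose 'seq' field holds each element's original position.

```c
#include <stdlib.h>

/* Hypothetical record: KEY is the sort key, SEQ is the element's
   original position, filled in by the caller before sorting.  */
struct item
{
  int key;
  int seq;
};

/* Ties on KEY are broken by SEQ, so every qsort implementation
   produces the same order.  */
static int
compare_items (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;

  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  if (a->seq != b->seq)
    return a->seq < b->seq ? -1 : 1;
  return 0;
}

/* Sort the N elements of ITEMS by key, keeping equal keys in their
   original relative order.  */
void
sort_items (struct item *items, size_t n)
{
  qsort (items, n, sizeof (struct item), compare_items);
}
```

The comparisons are written out rather than computed by subtraction,
which could overflow for keys of large magnitude.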