underscores and undefined behavior
May 11, 2011
As everyone should know, leading underscores in C identifiers are not cool, as they can cause undefined behavior per section 7.1.3 of the C standard:
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use. [...] If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.
Yet, they are widely used everywhere. Here are some examples:
inclusion guards in GLib:
internal Python functions:
various macros in APT:
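A minimal sketch of these three patterns, for illustration only. The names (`__MYLIB_H__`, `__mylib_max`, `_MyLib_largest`) are hypothetical and merely written in the style such projects use; they are not taken from GLib, Python or APT. Compilers will usually accept this without complaint, which is part of the problem.

```c
#include <assert.h>

/* Inclusion guard: begins with two underscores, so the name is reserved. */
#ifndef __MYLIB_H__
#define __MYLIB_H__

/* Macro: also begins with two underscores, so defining it is undefined behavior. */
#define __mylib_max(a, b) ((a) > (b) ? (a) : (b))

/* Internal function: underscore followed by an uppercase letter, reserved too. */
static int _MyLib_largest(const int *values, int count)
{
    assert(count > 0);
    int best = values[0];
    for (int i = 1; i < count; i++)
        best = __mylib_max(best, values[i]);
    return best;
}

#endif /* __MYLIB_H__ */
```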
All of this triggers undefined behavior and is thus uncool. Of course in APT, it’s most stupid, as we do not have any namespace and are thus much more likely than the other two to end up redefining things we should not.
But why were those solutions chosen in the first place, and what is the alternative? I cannot answer the first question, but for the second one, the obvious alternative is to use trailing underscores:
inclusion guards, defined behavior:
internal functions, defined behavior:
various macros, defined behavior:
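As a sketch of the trailing-underscore alternative, here are the same three kinds of names again, this time with the underscore moved to the end. The names (`MYLIB_H_`, `mylib_max_`, `mylib_largest_`) are hypothetical and only meant to show the shape of the convention.

```c
#include <assert.h>

/* Inclusion guard: no leading underscore, so the name is not reserved. */
#ifndef MYLIB_H_
#define MYLIB_H_

/* Macro: the trailing underscore marks it as internal. */
#define mylib_max_(a, b) ((a) > (b) ? (a) : (b))

/* Internal function: same idea, the underscore goes at the end. */
static int mylib_largest_(const int *values, int count)
{
    assert(count > 0);
    int best = values[0];
    for (int i = 1; i < count; i++)
        best = mylib_max_(best, values[i]);
    return best;
}

#endif /* MYLIB_H_ */
```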
Then there is another class of reserved identifiers with underscores:
All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
Meaning that everything that starts with an underscore is reserved, except for parameters, local variables and members of structs/unions. So, if you happen to create a global variable _mylibrary_debug_flag, you trigger undefined behavior as well. And while we’re at it, do not think you can create a type ending in _t: POSIX reserves all identifiers ending in _t for its own use.
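To make the distinction concrete, here is a small sketch of where a leading underscore is still permitted and where it is not. All names (`mylibrary_debug_flag_`, `mylib_buffer`, `mylib_buffer_len`, `mylib_debug_enabled`) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* int _mylibrary_debug_flag;  would be reserved: leading underscore at file scope */
static int mylibrary_debug_flag_ = 0;

/* typedef struct { ... } mylib_buffer_t;  would collide with the POSIX _t reservation */
typedef struct mylib_buffer {
    size_t _len; /* fine: struct members are not file-scope identifiers */
} mylib_buffer;

static size_t mylib_buffer_len(const mylib_buffer *_buf) /* fine: parameter */
{
    assert(_buf != NULL);
    size_t _result = _buf->_len; /* fine: local variable */
    return _result;
}

static int mylib_debug_enabled(void)
{
    return mylibrary_debug_flag_;
}
```

The leading underscores on the member, parameter and local variable are only there to show what the rule permits. Plain names avoid the question entirely.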
In summary, whenever you write C and want to be 100% safe from undefined-behavior-because-of-naming, do not start any identifier with an underscore and do not end any identifier with