Jens Gustedt's Blog

November 7, 2010

Don’t use NULL

Filed under: C99, integers, syntax — Jens Gustedt @ 23:36

I always thought that using NULL whenever I wanted to assign a null pointer value to a pointer was a good thing, but today I learned the contrary.

The standard, 6.3.2.3, says, seemingly innocently:

An integer constant expression with the value 0, or such an expression cast to type void*, is called a null pointer constant.

An integer constant expression is an expression that is evaluated at compile time and has an integer type. Integer types are all the standard integer types that we all know, including plain char and _Bool, and enumeration types. In particular, such expressions may contain casts to such types.

So far this all sounds relatively innocent if you think of the primary use of null pointer constants, namely to be assigned to pointer variables, but see below.
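To make the definition concrete, here is a minimal sketch in which pointers of several types are initialized from differently typed zero constants. The function name count_null_pointers is made up for this illustration. Since every initializer is an integer constant expression with the value 0 (or such an expression cast to void*), each pointer should end up as a null pointer.

```c
#include <assert.h>

/* Initializes pointers of several types from differently typed zero
   constants and returns how many of them ended up as null pointers. */
int count_null_pointers(void)
{
    int    *from_int   = 0;          /* constant of type int */
    char   *from_uint  = 0U;         /* constant of type unsigned */
    double *from_long  = 0L;         /* constant of type long */
    long   *from_ull   = 0ULL;       /* constant of type unsigned long long */
    void   *from_voidp = (void*)0;   /* integer constant cast to void* */

    const void *all[] = { from_int, from_uint, from_long, from_ull, from_voidp };
    int count = 0;
    for (unsigned i = 0; i < sizeof all / sizeof all[0]; ++i)
        count += (all[i] == 0);      /* comparing with 0 tests for a null pointer */

    assert(count == 5);              /* all five are null pointer constants (6.3.2.3) */
    return count;
}
```

In all five initializations the compiler knows that a pointer is wanted, so the type of the constant does not matter.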

Later, in 7.17, the standard defines the macro

NULL which expands to an implementation-defined null pointer constant

This means that NULL may result in at least 12 different standard integer types, each with several expressions that yield a zero value of that type:

(_Bool)0
(char)0
(unsigned char)0
(signed char)0
(unsigned short)0
(signed short)0
(unsigned)0 == 0U
(signed)0 == 0 == '\0' == enumeration_constant_of_value_0
(unsigned long)0 == 0UL
(signed long)0 == 0L
(unsigned long long)0 == 0ULL
(signed long long)0 == 0LL

and in

(void*)0

and for any enumeration type enum etag or any typedef tdef of integer type in

(enum etag)0, (tdef)0.

What is important here is that most of these expressions are only interpreted as null pointer constants when they appear in what some people call a pointer context, that is, in a context where the compiler expects a pointer value. But strictly speaking, C doesn’t know such a thing as a pointer or integer context.
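A small sketch can show what happens where no pointer is expected: the constants simply keep their own types. The function names below are invented for this illustration. What sizeof(NULL) yields is left to the implementation, so no particular value is claimed for it.

```c
#include <assert.h>
#include <stddef.h>

/* Used as plain operands, the different spellings of a null pointer
   constant keep their own types, and therefore their own sizes. */
void check_zero_constant_sizes(void)
{
    assert(sizeof(0)        == sizeof(int));        /* plain 0 is an int */
    assert(sizeof((char)0)  == 1);                  /* a char is one byte */
    assert(sizeof(0LL)      == sizeof(long long));  /* 0LL is a long long */
    assert(sizeof((void*)0) == sizeof(void*));      /* only this one is a pointer */
}

/* The size of whatever NULL expands to on this implementation. */
size_t size_of_NULL(void)
{
    return sizeof(NULL);
}
```

On a platform where int and void* differ in width, size_of_NULL() is one way to find out which kind of constant the implementation picked.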

This becomes harmful when you pass arguments to functions with a variable argument list (va_arg functions). Here most of the arguments are just placed on the stack as they are; only narrow integer types are promoted to int (and float to double). Now consider a harmless function such as printf that at some point expects a void* pointer:

printf("%d (%p) %d", 1, NULL, 2);

Looks fine? But it isn’t at all. As we have seen, depending on the mood of your compiler provider, NULL here may be many things: something that most people expect, namely (void*)0, but also 0L or 0LL. The call to printf might then simply look for the three arguments in the wrong places on the stack and crash your program.

Now people seriously propose the following workaround for that:

printf("%d (%p) %d", 1, (void*)NULL, 2);

But wasn’t the only justification of the macro NULL that it should remind us that a pointer type is expected at a certain place? This fact is already clear from the cast to void*. Why not just use

printf("%d (%p) %d", 1, (void*)0, 2);

that has the same message?
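The last variant can be wrapped in a small helper to make it checkable. This is only a sketch; the name format_with_null_pointer is hypothetical. The text produced by %p is implementation-defined, so the only thing to rely on is that the two integers around it come out intact.

```c
#include <assert.h>
#include <stdio.h>

/* Formats the example from the text into buf and returns the result of
   snprintf. The cast guarantees that the second variadic argument really
   has type void*, which is what %p requires. */
int format_with_null_pointer(char *buf, size_t size)
{
    assert(buf != 0 && size > 0);
    return snprintf(buf, size, "%d (%p) %d", 1, (void*)0, 2);
}
```

With a buffer of, say, 64 bytes, the result should start with "1 (" and end with ") 2", whatever the platform prints for the pointer in between.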

In summary,

  • NULL obfuscates an expression that can be either of integer type or of pointer type.
  • Whenever you want to initialize a pointer with a null value or assign one to it, use plain 0; it serves well.
  • Whenever you pass a null value in a function call for a parameter that is not covered by a prototype, use (T*)0.
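The last rule matters most for null pointers used as a sentinel at the end of a variable argument list. Here is a minimal sketch, assuming a made-up function count_strings that reads char* arguments until it finds a null pointer. The names count_strings and count_strings_demo are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts the strings passed before the terminating
   null pointer. Every variadic argument is read back as a char*. */
size_t count_strings(char const *first, ...)
{
    size_t count = 0;
    va_list ap;
    va_start(ap, first);
    for (char const *s = first; s; s = va_arg(ap, char*))
        ++count;
    va_end(ap);
    return count;
}

/* The sentinel is spelled (char*)0, so an object of the pointer type that
   va_arg reads is passed, whatever NULL expands to on the platform. */
void count_strings_demo(void)
{
    assert(count_strings("one", "two", "three", (char*)0) == 3);
    assert(count_strings((char*)0) == 0);
}
```

A bare 0 as the last argument would be passed as an int, and reading it back as a char* is exactly the mismatch described above.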

Ah, and yes, a 0 cast to an enumeration type should also be interpreted as a null pointer constant:

enum awk { a };
void* q = (enum awk)0;

clang accepts this correctly. gcc and opencc don’t.
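Whether a given compiler accepts the initialization above is one question; whether (enum awk)0 is an integer constant expression is another. The following sketch only addresses the second one by using the expression in three places where the language requires an integer constant expression. The names from_cast, table, is_awk_zero and awk_zero_demo are invented for this illustration, and the expectation that common compilers accept all three uses is an assumption.

```c
#include <assert.h>

enum awk { a };

/* Each of the following uses requires an integer constant expression. */
enum { from_cast = (enum awk)0 };     /* value of an enumeration constant */
int table[(enum awk)0 + 1];           /* array size at file scope */

int is_awk_zero(int x)
{
    switch (x) {
    case (enum awk)0: return 1;       /* case label */
    default:          return 0;
    }
}

void awk_zero_demo(void)
{
    assert(from_cast == 0);
    assert(sizeof table / sizeof table[0] == 1);
    assert(is_awk_zero(a));
}
```

This says nothing about the pointer initialization itself; it only shows the cast expression behaving as a compile-time integer constant.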


6 Comments »

  1. 0 is not that good either: http://mina86.com/2010/10/24/0-is-ambiguous/

    Comment by Thomas — November 26, 2010 @ 08:23

    • No, I don’t agree with the blog that you link to. For a start, most of it applies to C++ and not to C. But I think that nullptr as such is not a bad idea. I agree that the cavalier manner of imposing yet another conflicting keyword is really annoying. If they were nice guys who cared about the rest of the world, they would use _Nullptr as a keyword and put a macro nullptr in something like “stdnull.h”.

      But for 0 itself, I don’t agree. According to the C99 standard, 0 is always an integer constant. (Yes, it is an octal constant, but who cares :) It may serve as a “null pointer constant” only because it is an integer constant expression of value 0. But it never is a “null pointer” by itself.

      Now the example that is given there: printf("%p\n", 0). Integer promotion rules define what has to happen: it pushes an int on the stack. printf expects a void*. This is clearly undefined behavior as soon as both types differ in width, which they do on many platforms nowadays.

      Comment by Jens Gustedt — November 26, 2010 @ 09:09

  2. (enum awk)0 is not an “integer constant expression with the value 0, or such an expression cast to type void*”, so if clang accepts it without a diagnostic, then that’s a compiler extension. gcc behaviour is correct.

    See 6.4.4 for the definition of integer constant.

    Comment by anonymous — June 10, 2012 @ 14:31

    • Usually I wouldn’t allow for anonymous comments, but this one is important so I’ll correct it.

      (enum awk)0 is not an “integer constant”, yes, but confusingly it is an “integer constant expression” as defined by the standard. The relevant explanation for that is in 6.6p6:

      An integer constant expression shall have integer type and shall only have operands
      that are integer constants, enumeration constants, character constants, sizeof
      expressions whose results are integer constants, and floating constants that are the
      immediate operands of casts.

      Integer types include enumerations, 0 is an integer constant and the expression is an immediate operand of a cast.

      Comment by Jens Gustedt — June 10, 2012 @ 14:50

  3. I do agree with what you’ve said in that zero will suffice in every situation where a null pointer value is required.

    However, I do still believe that there is significant value in expressing it as “NULL” instead of “0”, as did the inventors of the language, since they said so in their first very concise description of the language. Quoting from the 1978 1st edition of K&R, pages 97 and 98:

    “We write NULL instead of zero, however, to indicate more clearly that this is a special value for a pointer.”

    I just wish NULL had been defined as an anonymous enum with the value of zero, instead of as a preprocessor macro. (It is very sad that a couple of rather important features of C, including enums, were not implemented until a few months after the first edition of the book had already gone to press. If the internet had existed at the time then others studying and re-implementing the language might have known about these features far sooner. Unfortunately the Nov. 15, 1978 memorandum describing them did not reach anywhere near as wide an audience as the book did.) I also really despise the idiots in the standards committees who have allowed the alternative definition of NULL as “(void *) 0”. Perhaps it was well meant, but in the end I believe it has caused no end of confusion and brain damage to endless numbers of programmers.

    As you’ve said in your more recent essay arguing against casts one should never cast NULL to the type of the variable in an assignment — rather one should always use implicit casts in true _assignments_.

    However that doesn’t mean one will never have to cast either NULL or zero to a specific pointer type in some situations.

    It is very important to keep in mind that pointers may have different representations depending on what they are used to point to (even though the language explicitly requires all pointer types to have the same representation of a null pointer, and that the width of a null void pointer should be as wide as the widest pointer type).

    So, as you point out it becomes critical to also use a cast when you’re using either NULL or zero as a function parameter value for a parameter which is expected to be a pointer value, and especially when this is done in a call to a function taking variable numbers of parameters which will be interpreted at runtime (though this is equally important when a prototype is not in the scope of the call).

    While “(void *) 0” does obviously mean that the value is to be expressed as a pointer and that it is an expression of the representation of a null pointer, I would suggest that the idiom of using “NULL” in C is strong, widespread, and important enough that it is a good idea to continue that tradition, and so I argue against the advice in your essay above.

    Comment by Greg A. Woods — April 19, 2014 @ 21:02

    • I basically agree with your analysis that it would be nice to have NULL as a special null pointer constant to clearly indicate the semantics. I just think that it does more harm than good. And I am also aware that it is a widely used idiom.

      In my opinion C++ went the right way in switching to a new concept, namely nullptr. To repair this mess, C could easily evolve by adding two things:

      • add nullptr to the language
      • add a new default argument promotion rule for all pointer types that converts them to void*

      These two rules together would allow nullptr to be just a define for ((void*)0).

      • This would fix the problem that occurs when null pointers are used as sentinel for va_arg functions.
      • Functions that pass other pointer values into va_arg functions should still continue to work. Their arguments would just be converted to void* and then, inside the function, back to the desired type.

      Comment by Jens Gustedt — April 19, 2014 @ 22:23

