testing compile time constness and null pointers with C11’s _Generic

Sometimes in C it is useful to distinguish if an expression is an “integral constant expression” or a “null pointer constant”. E.g for an object that is allocated statically, only such expressions are valid initializers. Usually we are able to determine that directly when writing an initializer, but if we want to initialize a more complicated struct with a function like initializer macro, with earlier versions of C we have the choice:

  • Use a compiler extension such as gcc’s __builtin_constant_p
  • We’d have to write two different versions of such a macro, one for static allocation and one for automatic.

In the following I will explain how to achieve such a goal with C11’s _Generic feature. I am not aware of a C++ feature that provides the same possibilities. Also, this uses the ternary operator (notably different in C and C++), so readers that merely come from that community should read the following with precaution.

Continue reading “testing compile time constness and null pointers with C11’s _Generic”


surprising occurrence of identifiers in header files

I remember being stuck sometime ago because a system header at the time on the platform that I was using defined the undocumented identifier barrier. IIRC this even was a macro, which made the bug really hard to track, seemingly harmless code simply exploded.

Hopefully nowadays platform implementors are a bit more careful in not polluting the namespace, but still avoiding naming conflicts is not so easy. E.g inline functions are a useful tool when you want to expose small functions to all compilation units of a program. There is one pitfall, though, when it comes to naming conventions for their parameter names and local variables. If you get the name wrong, as in this simple example

inline double my_sin(double PHI) { return sinf(PHI); } 

other users of your code might encounter random problems if they define a macro PHI.
Continue reading “surprising occurrence of identifiers in header files”

Emulating C11 compiler features with gcc: _Atomic

The new support for atomic operations in C11 is probably the most useful one. Support for atomic instructions is present in all commodity processors since at least 20 years, but a standardized interface in one of the main programming languages was really missing. Up to now you always had to implement stubs for these operations in assembler. This by itself was not as difficult for a given platform, but writing platform independent code quickly became tedious.

P99 now has an emulation of parts of these features that allow you to use them in a preview for C11. This implementation mainly uses (again) extensions of the gcc family. It should even work for older versions of gcc that don’t implement their __sync_.. built-in.

Continue reading “Emulating C11 compiler features with gcc: _Atomic”

Handling control flow inside macros

When people program macros that they want to be executed anywhere where a statement could be, they often use the

do {...} while(0)

construct. This construct has a big disadvantage in that it may change control flow in an unexpected way when you’d use it as a generic macro tool to collect statements:
Continue reading “Handling control flow inside macros”

P99 is released

P99 – Preprocessor macros and functions for C99

P99 is a toolbox of macro and function definitions that ease the programming in modern C, aka C99. By using new tools from C99 it implements default arguments for functions, scope bound resource management, transparent allocation and initialization, …

The complexity of the tools ranges from very simple (but convenient) macros such as P99_INIT to relatively complex ones such as P99_UNWIND_PROTECT.

P99 is not a library but just a set of include files. You may include the whole by just using “p99.h” or cherry pick individual parts to your needs. You will not have to link against a special library the “only” prerequisite is that your compiler supports modern C, aka C99, to a wide extent.

So far I have tested P99

  • on linux systems
  • with INTEL 32 / 64 bit and ARM processors
  • with four different compilers: gcc, clang, opencc and icc
  • with code from an internal project.

If you are developing for another setting I would be very much curious to hear of your experience with P99.

P99 can be downloaded at p99.gforge.inria.fr. It is licensed under the QPL.

Macros versus inline functions

Functions (whether inline or not) and macros fulfill different purposes. Their difference should not be seen as ideological as some seem to take it, and what is even more important, they may work nicely together.

Macros are text replacement that is done at compile time and they can do things like

#define P99_ISSIGNED(T) ((T)-1 < (T)0)

which gives you a compile time expression of whether or not an integral type is signed or not. That is, they are ideally used when the type of an expression is not known (at the definition) and you want to do something about it. On the other hand, the pitfall with macros is that their arguments may be evaluated several times, which is bad because of side effects.

Functions on the other hand are typed, which makes them more strict or, phrased negatively, less flexible. Consider the functions

uintmax_t absU(uintmax_t a) {
  return a;
uintmax_t absS(uintmax_t a) {
   return (-a < a) ? -a : a;

The first implements the trivial abs function for an unsigned integral type. The second implements it for a signed type. (Yes, it takes an unsigned as argument, this is for purpose.)

We may use these with any integral type. But, the return type will always be of the largest width and there is a certain difficulty on knowing how do choose between the two.

Now with the following macro

#define ABS(T, A) ((T)(P99_ISSIGNED(T) ? absS : absU)(A))

we have implemented a

  • family of functions
  • that works for any integral type
  • that evaluates its argument only once
  • for which any recent and decent compiler will create optimal code

Well, I admit that doing this with abs is a bit artificial, but I hope you get the picture.

Scope Bound Resource Management with for Scopes

Resource management can be tedious in C. E.g to protect a critical block from simultaneous execution in a threaded environment you’d have to place a lock / unlock pair before and after that block:

pthread_mutex_t guard = PTHREAD_MUTEX_INTIALIZER;

// critical block comes here

This is very much error prone since you have to provide such calls every time you have such a block. If the block is longer than some lines it is difficult to keep track of that, since the lock / unlock calls are spread on the same level as the other code.

Within C99 (and equally in C++, BTW) it is possible to extend the language of some sorts such that you may make this easier visible and guarantee that your lock / unlock calls are matching. Below , we will give an example of a macro that will help us to write something like

    pthread_mutex_unlock(&guard)) {
       // critical block comes here

If we want to make this even a bit more comfortable for cases that we still need to know the mutex variable we may have something like:

       // critical block comes here

The macro P99_PROTECTED_BLOCK can be defined as follows:

#define P99_PROTECTED_BLOCK(BEFORE, AFTER)                         \
for (int _one1_ = 1;                                               \
     /* be sure to execute BEFORE only at the first evaluation */  \
     (_one1_ ? ((void)(BEFORE), _one1_) : _one1_);                 \
     /* run AFTER exactly once */                                  \
     ((void)(AFTER), _one1_ = 0))                                  \
  /* Ensure that a `break' will still execute AFTER */             \ 
  for (; _one1_; _one1_ = 0)

As you may see, this uses two for statements. The first defines an auxiliary variable _one1_ that is used to control that the dependent code is only executed exactly once. The arguments BEFORE and AFTER are then placed such that they will be executed before and after the dependent code, respectively.

The second for is just there to make sure that AFTER is even executed when the dependent code executes a break statement. For other preliminary exits such as continue, return or exit there is unfortunately no such cure. When programming the dependent statement we have to be careful about these, but this problem is just the same as it had been in the “plain” C version.

Generally there is no run time performance cost for using such a macro. Any decent compiler will detect that the dependent code is executed exactly once, and thus optimize out all the control that has to do with our variable _one1_.

The GUARDED_BLOCK macro could now be realized as:

#define GUARDED_BLOCK(NAME)        \
P99_PROTECTED_BLOCK(               \
    pthread_mutex_lock(&(NAME)),   \

Now, to have more specific control about the mutex variable we may use the following:

for (int _one1_ = 1; _one1_; _one1_ = 0)                             \
  for (T NAME = (INITIAL);                                           \
       /* be sure to execute BEFORE only at the first evaluation */  \
       (_one1_ ? ((void)(BEFORE), _one1_) : _one1_);                 \
       /* run AFTER exactly once */                                  \
       ((void)(AFTER), _one1_ = 0))                                  \
    /* Ensure that a `break' will still execute AFTER */             \
    for (; _one1_; _one1_ = 0)

This is a bit more complex than the previous one because in addition it declares a local variable NAME of type T and initializes it.

Unfortunately, the use of static for the declaration of a for-scope variable is not allowed by the standard. To implement a simple macro for a critical section in programs that would not depend on any argument, we have to do a bit more than this.

Other block macros that can be implemented with such a technique:

  • pre- and postconditions
  • make sure that some dynamic initialization of a static variable is performed exactly once
  • code instrumentation

P99 now has a lot of examples that use this feature.

va_arg functions and macros

Traditionally C has functions with a variable length argument list, so-called variadic functions. The handling of such arguments is done with the va_list data type from stdarg.h and the corresponding macros. I see two pitfalls with this type of approach that usually make it relatively difficult to use, even in cases where the arguments are supposed to be all the same type T.

  • There is no indication by these macros how long the list that is passed as argument is.
  • There is an implicit conversion of small integer arguments to signed or unsigned int according to the integer promotion rules. These types only have an implementation defined width.

Then first pitfalls requires that usually we need to apply one of the following techniques to handle the list:

  • Terminate the list at each call by a special value. This convention has the disadvantage that each caller has to follow this rule and that failing to do so might produce errors that are hard to track.
  • Provide a count of the arguments as an extra parameter that precedes the list. Whereas here also the calling side must do something for each call, at least the convention can be determined from the prototype of the function.
  • As a variation of this gives a format string of how the arguments are to be interpreted. The printf family of functions uses this approach,

Since C99 we now have macros with variable length argument lists. These can be used to interface functions that obtain a length parameter and an array of type T and that then are much easier to use on the calling side. Suppose that we have a function varArrFunc and a macro varListMacro as follows (for and explanation of the implementation see below)

   #define P99_CALL_VA_ARG(NAME, TYPE, ...)  (NAME(P99_NARG(__VA_ARG__), (TYPE[]){ __VA_ARG__ }))

   void varArrFunc(size_t len, T* A);
   #define varListMacro(...)  P99_CALL_VA_ARG(varArrFunc, T,  __VA_ARG__ )

Such a macro/function pair may then just be called as varListMacro(78, 7, 9, 99) or varListMacro("a", "toto"), if for the first example we assume that T is compatible with int or for the second that it is with char*. As we can see this avoids both pitfalls

  • There is no need to have a calling side convention to handle the length of the argument list.
  • All argument conversion is to a type T that we specify clearly in the definition of varListMacro. If e.g we specify T to be uint64_t we will always know which value the function varArrFunc will see if we feed in (signed char)-1 as an argument.

How does this work? First we need a macro P99_NARG(...) that provides us with the number of arguments that it receives. We showed how to implement such a macro in a earlier post. Then in its second part the macro P99_CALL_VA_ARG uses a compound literal to pass an array of base type T with our arguments as initial values to the function varArrFunc.

Such an implementation is at least as efficient as would be an implementation of varArrFunc itself as a variadic function.

  • The length of the array is computed at compile time. It is known there, so the information should not get lost.
  • As for the variadic function approach at run time each individual argument is only evaluated once.
  • Where the variadic function approach would implement the argument list on the stack of the callee, here the array is implemented on the stack of the caller. In any case it is on the stack. For any of the calling conventions that we mentioned above we would either need an extra terminating argument or an extra parameter, so our use of a length parameter to varArrFunc is as efficient as that.
  • As an extra bonus, the call to varArrFunc may even be inlined, if we specify it with inline. This then may lead to optimizations that generally are more difficult to achieve for the variadic function approach:
    • The handling the array A of parameters inside varArrFunc will usually be done with a simple for-loop.
    • This loop then has known bounds for each call and the compiler may do loop unrolling
    • Once unrolled, the compiler might even avoid the whole generation of the array and use the parameter expressions directly.

Associativity of ##, double constants and preprocessor tokens

You might one day be confronted to the need to compose double constants by using the preprocessor. This is a tricky affair, since already the first naive try like this doesn’t work:


Why is that so? For the preprocessor the dot and the following parameter are separate tokens. Thus called e.g as FRACTIONAL_WRONG(1) something like ‘. 1’ would be produced a stray dot followed by a blank and a number. This is nowhere a valid token sequence for the C compiler. And obviously the following macro, meant to produce a fractional number is wrong for the same reasons:


Ok, we all know, to glue together tokens there is the
## operator in the preprocessor. The following actually


/* using it */

But we will see below that this is somehow just be coincidence.

Let us now try to generalize our idea to produce general doubles, including an exponent. One could be tempted to try something like this:


That is, we would try to first write an analogous macro that composes the exponent and then try to combine the two parts into one global macro. For this seemingly innocent declaration

static double b = DOUBLE_WRONG(-, 4, 01, +, 5);

My preprocessor says something weird like

error_paste.c:27:1: error: pasting "E" and "+" does not give a valid preprocessing token
error_paste.c:27:1: error: pasting "+" and "5" does not give a valid preprocessing token

And yours should say something similar, if it is standard compliant. The problem is that a preprocessor token that starts with an alphabetic character may only contain alphanumeric characters (plus underscore). Our example for FRACTIONAL only worked, because by chance a `dot’ followed by numbers is a valid token by itself, namely a floating point number.

A more direct approach would be to have a macro that pastes 6 tokens together

#define PASTE6_NOTSOGOOD(a, b, c, d, e, f) a ## b ## c ## d ## e ## f

and then hoping that something like the following would work:


static double b = DOUBLE_NOTSOGOOD(-, 4, 01, +, 5);

An for most preprocessors it would: glued together from left to right each intermediate step would always consist of a valid preprocessor token. The actual rules of the preprocessor that allow for this are a bit more complicated, but basically in addition to alphanumeric tokens all starting parts of double constants (without prefix sign) are valid preprocessor tokens. ouff…

… you think. But there is a last subtlety which is the associativity of the ## operator. It is not specified whether or not it is from left to right. If we fall upon one that does it from right to left, we are screwed. So if we want to be portable, we have to go even further.

#define PASTE2(a, b) a ## b
#define _PASTE2(a, b) PASTE2(a, b)
#define PASTE3(a, b, c) _PASTE2(PASTE2(a, b), c)
#define PASTE4(a, b, c, d) _PASTE2(PASTE3(a, b, c), d)
#define PASTE5(a, b, c, d, e) _PASTE2(PASTE4(a, b, c, d), e)
#define PASTE6(a, b, c, d, e, f) _PASTE2(PASTE5(a, b, c, d, e), f)

static double b = PASTE6(4, ., 01, E, +, 7);