Semaphores and Mutexes aren’t so similar

A common misconception I read about the two POSIX types sem_t and pthread_mutex_t are that that they’d be similar, almost interchangeable it seems. They have indeed some things in common:

  • both are control structures that can be placed in memory that is shared between threads (or processes, we will concentrate on threads, here)
  • both can be used to block a thread to wait for another one
  • both have three different variants for such a wait function
    1. Unconditional, blocking wait, sem_wait and pthread_mutex_lock.
    2. Non-blocking wait, sem_trywait and pthread_mutex_trylock.
    3. Time dependent wait, sem_timedwait and pthread_mutex_timedlock.

But in other aspects they are quite different. The main difference, already on a conceptual level that I see is the following:

  • pthread_mutex_t is thread oriented. If a mutex is hold by a particular thread it can only be unlocked by this particular thread alone.
  • sem_t is token oriented. If a thread is blocked on a semaphore, any active thread may post on the semaphore.

This already reserves certain surprises to the unaware, and each of these concepts as it own sets of pitfalls that have to be taken care of.

But then, the special specification for the mutex and semaphore concepts as they are given by POSIX adds a particular oddity that makes their use a bit tricky for starters:

  • The semaphore calls can be interrupted at any time, e.g by IO that is delivered or if a process child terminates. In such a case errno is set to EINTR.
  • A thread A that is blocked in pthread_mutex_lock can receive as much signals as it would. It will only return to the application once the thread B that holds the lock releases it. If B returns, crashes, idles for whatever reason, A will be blocked.

The reason for that is just that sem_t is designed to be used in signal handlers, so it must be sensible to the fact that an interrupt arrived. phtread_mutex_t is better not used in interrupt handlers, unless you really know what you are doing.

In subsequent blogs, I will try to show the particular advantages or pitfalls of the two different interfaces


Integer type confusion

Many people still use int as the integer type to be used in C. Don’t do that. This is misleading, non portable (yes!) and error prone.

  • Valid use of int is for the return code of a system function or errno.
  • Valid use of char is for the character in a string.
  • Valid use of char* is byte arithmetic on pointers.

That’s it really.

Traditionally, integer types in C are a mess, already syntactically. You can have combinations such as long long unsigned int etc that make the code really hard to read. For the following we will use the 10 different real primitive types that C99 provides and for 6 of them use typedef to have an abbreviation for each of them

typedef signed char schar;
typedef unsigned char uchar;
typedef unsigned short ushort;
typedef unsigned long ulong;
typedef signed long long sllong;
typedef unsigned long long ullong;

To a signed int we will just refer as signed and to an unsigned int as unsigned and we will use short and long for the other two. All these 10 refer to different types for the compiler but may not necessarily be of a distinct width, or of just the width you suppose them to be. No, int is not always 32 bit wide, no, long is not always 64 bit wide, be carefull.

There is another integer type in C99 that doesn’t get too much attention, _Bool (or bool if you include “stdbool.h”). This blog entry here doesn’t go much into the details of Boolean logic, but you should use bool whenever it is appropriate for a variable that has the semantics of a truth value.

Pitfall Number One: array indexing

Arrays in C are indexed from 0 up to n-1, where n is the number of elements in the array, you all know that. So indexing an array with a negative number is something odd, that should only happen under very controlled circumstances.

Never use a signed value to index an array or to store its bounds

unless you are certain of what you are doing. But just as a reflex, don’t use int, signed or something similar.

Use the unsigned integer type of your system that has the correct width to address any size of object that is possible on your system. This type has a name: size_t. Use it, it is as simple as that.

Use size_t to index an array or to store its bounds

By that your code becomes portable since this is well defined on every modern system and you are much less vulnerable to overflow problems.

Pitfall Number Two: type width

First, the width of a type (even if it is unsigned) is not forcibly in direct relation with its size as returned by the sizeof operator: there may unused bits, so-called padding bits.

Then, don’t make false assumptions on the width of any of the 10 integer types. Such assumptions are the main source for 32 versus 64 bit portability problems. The only reasonable assumptions that you can make are for the size

1 = sizeof(schar) <= sizeof(short) <= sizeof(signed) <= sizeof(long) <= sizeof(sllong)
1 = sizeof(uchar) <= sizeof(ushort) <= sizeof(unsigned) <= sizeof(ulong) <= sizeof(ullong)

There are some other properties about the minimum width of these types, but if you stick to the following you wouldn’t need them.

Whenever you use an integer as bitfield think twice of what you assume about the number of bits of the underlying type. Same holds, obviously, if you assume something about the maximum or minimum value of the type. The [u]intX_t family of typedef that is guaranteed to exist for C99 is there for you. Here the X stands for the width that you want, typical values for X are 8, 16, 32, and 64. E.g uint64_t is an unsigned integer type of exactly 64 bits. And so you see also there are 4 such commonly used width but 5 different types. So usually there are at least two basic types with the same width.

Use [u]intX_t whenever you make an assumption about the width of a type

Edit: Well, the [u]intX_t types are only required by POSIX, C99 leaves them optional. C99 only requires the types [u]int_fastX_t to exist. On most commodity architectures [u]intX_t will exist nowadays, but if you want to be sure that your code also is valid on embedded platforms and alike, use [u]int_fastX_t.

If you need the maximum value of an unsigned integer type T, (T)-1 will always do the trick, no need to know about magic constants. If you need to know if a type is signed or unsigned, ((T)-1 < (T)0) will do. Since these are a bit ugly, I use the following macros

#define P99_0(T) ((T)0)
#define P99_M1(T) ((T)-1)
#define P99_ISSIGNED(T) (P99_M1(T) < P99_0(T))

Theses are compile time constant expressions so you may use them even in type-declarations or static initializations as it pleases.

Pitfall Number Three: signedness

Don’t mix signed and unsigned types in comparisons. The promotion rules are complicated and do not only involve the signedness but also whether or not the type is long. In particular, the resulting real comparison may be a comparison of signed values or of unsigned values. Don’t rely on that.

When comparing values of different signedness, use a cast to promote one of them the signedness of the other.

This makes your intentions explicit and easier to follow.

BTW, the correct signed type for size_t is ssize_t, not ptrdiff_t.

Pitfall Number Four: char arithmetic

Signedness of char is simply a mess, don’t rely on it.

When you need a `small’ integer type don’t use char but use [u]int8_t.

Use char as the base type for strings, and char* as a generic pointer type whenever you have to do byte arithmetic on pointers. All other uses of char should be banned.

Pitfall Number Five: file offsets

File offsets (and also block sizes) are yet another value that are independent from pointer or integer width. You may well have a 32 bit system (maximal addressable space 4GiB) with a 64bit file system: think of a large disk partition or of a mount over NFS. The correct type to access file offsets is off_t, which is a signed type. Usually, there are predefined macros that force your system to use the 64 bit variant, if it is available.

Pitfall Number Six: pointer conversions and differences

Generally it is a bad idea to convert a pointer to an integer. It really is. Think of it, twice. If it is unavoidable, int is certainly the wrong type. The reason for that is that the most convenient integer type of the system may be of a certain width x and the width of pointers may be just another value y.

Don’t suppose that pointers and integers have the same width

Use the correct types to transform a pointer to an integer. These have predefined names, too, intptr_t and uintptr_t, if they exist. If they don’t exist on your system, well really don’t do it. They know why they don’t provide it. On one system these might be signed and unsigned on another long and ulong.

When you do pointer arithmetic, use ptrdiff_t for differences between pointers. This is guaranteed to have the correct width. But… even then there might be extreme cases where you have an overflow. If you assume for a moment that pointers are 32 bit and you take the difference of one pointer on the stack (often numbers of the form 0xFFFF….) and another pointer deep down that difference might need the highest bit to code the number and thus if ptrdiff_t is also 32 bit wide, the number might overflow. Be carefull.

Obfuscation or inventing a new operator tends to operator -->

I found a really nice one in this discussion here. Basically the idea is that you may format operators a bit to show this code, which is valid C99 and C++:

for (unsigned x = 10; x --> 0;)
     printf("%u\n", x);

Ain’t that cute?

Now I was thinking that in C++ we could really obfuscate that much better by inventing some helper class. I came up with the following class Heron that `converges’, (written as aHeron --> eps) towards the square root of the initial value.


#include <iostream>
#include <math.h>

using std::cout;
using std::endl;

struct Heron {
  double const a;
  double x;
  Heron(double _a) : a(_a), x(_a) { }
  operator double(void){ return x; }

class Heron_tmp;

Heron_tmp operator--(Heron& h, int);

class Heron_tmp {
  friend class Heron;
  friend Heron_tmp operator--(Heron& h, int);
  Heron* here;
  Heron_tmp(Heron& h) : here(&h) { }
  inline int operator>(double err) const;

Heron_tmp operator--(Heron& here, int) {
  return here;

int Heron_tmp::operator>(double err) const {
  double& x = here->x;
  double const& a = here->a;
  x = (x + a/x) * 0.5;
  return fabs((x*x - a)/a) > err;

int main(void) {
  Heron aHeron(2.0);
  while (aHeron --> 1E-15)
    cout << (double)aHeron << endl;

initialization of dynamically allocated struct

Other than static initialization, for stack or global variables C does not foresee much for storage that is allocated through malloc. I have set up an easy convention for myself to deal with this. It resembles much constructors as we know them from C++.

typedef struct {
 int a;
 double b;
} A;
#define A_INITIALIZER { .a = 1, .b = 0.5 }

static inline
A* A_init(A *t) {
 if (t) *t = (A)A_INITIALIZER;
 return t;

That is a function that receives a pointer of my struct as first argument, initializes the storage it points to, and then returns the same pointer it received as an argument. In this example the initialization is done by assigning a compound literal, but we could in fact place any code we need to initialize a variable of type A here.

Depending on the type, such an `init’ function may receive more arguments

typedef struct {
 int a;
 double b;
} B;
#define B_INITIALIZER(F) { .a = 1, .b = F }

static inline
B* B_init(B *t, double f) {
  if (t) *t = (B)B_INITIALIZER(f);
  return t;

The return value comes handy when I want to allocate and initialize with just one expression. With the above types we can do

A* anA = A_init(malloc(sizeof(A)));
B* aB = B_init(malloc(sizeof(B)), 0.7);

Observe that there is no cast of the return from malloc. Such a cast could hide a missing prototype for malloc and as a result expose you to weird portability bugs. Also observe that I have made the “init” function robust against null pointers, so if the malloc doesn’t work, you’d still have to check anA or aB for null.

We may then use all of that to construct a C++ like NEW macro that allocates and initializes.

#define _NEW_ARGS(T, ...) T ## _init(malloc(sizeof(T)), __VA_ARGS__)
#define _NEW(T) T ## _init(malloc(sizeof(T)))
#define NEW(...) IF_1_ELSE(__VA_ARGS__)(_NEW(__VA_ARGS__))(_NEW_ARGS(__VA_ARGS__))

A* anotherA = NEW(A);
B* anotherB = NEW(B, 99.0);

The inner part of the macro comes in two flavors, one for the case that NEW is called with just the type as an argument, the other one if there are other arguments. These then are just passed through to the corresponding init function.

Note that all this only works by naming convention and typedef. A typedef is necessary to have the name of the type in one token. Something like struct B or unsigned long would make it impossible to define the init functions as we do. If we have several typedef for the same type, we have to declare an init function for each of these names.

Edit: In P99, now there is P99_NEW, a macro that may receive a variable argument list, and thus is able to handle default arguments to the “init” function.

Consistent initialization of static and auto variables

Lack of initialization, especially of pointer fields, is one of the most sources of bugs in C programs that easily lead to crashes and other undefined behavior. C99 give us a feature that helps at least a little bit with this, which are named initializers.

    typdedef struct {
      int a;
      double d;
    } A;
    typedef struct {
      A b;
      char* name;
    } B;

With C89 you would do something like this to initialize variables of these types:

    A anA = { -1, 1.0 };
    B aB = { { -1, 1.0 }, NULL };

This has several down points. For anA you have to know that field a is actually the first in type A. Whenever you add a field to the definition of A, you will have to check all initializations of all variables of that type if they are still doing the right thing. This might be just infeasible on a large code base and difficult to handle if you have fields of the same type but that should get different values.

In C99 you may write

    A anA = { .a = -1, .d = 1.0 };

which is much clearer, and stable with respect to ordering of the fields or addition of other fields. In the same spirit for aB we should do

    B aB = { .b = { .a = -1, .d = 1.0 }, .name = NULL };

In many cases this is quite inconvenient to type and often repetitive (thus error prone). Many structures have some `default’ or `standard’ way how they are to be initialized. For myself I have a simple naming convention for that where I would do

#define A_INITIALIZER  { .a = -1, .d = 1.0 }
#define B_INITIALIZER  { .b = A_INITIALIZER, .name = NULL }


If this makes sense for the type under consideration you might even give arguments to these defines. We could e.g make the following change for both types such that the field name in both always reflects how the corresponding variable had been named.

    typdedef struct {
      int a;
      double d;
      char const*const name;
    } A;
    typedef struct {
      A b;
      char const*const name;
    } B;

#define A_INITIALIZER(NAME)  { .a = -1, .d = 1.0, .name = #NAME }
#define B_INITIALIZER(NAME)  { .b = A_INITIALIZER(NAME.b), .name = #NAME }

    A anA = A_INITIALIZER(anA);

Associativity of ##, double constants and preprocessor tokens

You might one day be confronted to the need to compose double constants by using the preprocessor. This is a tricky affair, since already the first naive try like this doesn’t work:


Why is that so? For the preprocessor the dot and the following parameter are separate tokens. Thus called e.g as FRACTIONAL_WRONG(1) something like ‘. 1’ would be produced a stray dot followed by a blank and a number. This is nowhere a valid token sequence for the C compiler. And obviously the following macro, meant to produce a fractional number is wrong for the same reasons:


Ok, we all know, to glue together tokens there is the
## operator in the preprocessor. The following actually


/* using it */

But we will see below that this is somehow just be coincidence.

Let us now try to generalize our idea to produce general doubles, including an exponent. One could be tempted to try something like this:


That is, we would try to first write an analogous macro that composes the exponent and then try to combine the two parts into one global macro. For this seemingly innocent declaration

static double b = DOUBLE_WRONG(-, 4, 01, +, 5);

My preprocessor says something weird like

error_paste.c:27:1: error: pasting "E" and "+" does not give a valid preprocessing token
error_paste.c:27:1: error: pasting "+" and "5" does not give a valid preprocessing token

And yours should say something similar, if it is standard compliant. The problem is that a preprocessor token that starts with an alphabetic character may only contain alphanumeric characters (plus underscore). Our example for FRACTIONAL only worked, because by chance a `dot’ followed by numbers is a valid token by itself, namely a floating point number.

A more direct approach would be to have a macro that pastes 6 tokens together

#define PASTE6_NOTSOGOOD(a, b, c, d, e, f) a ## b ## c ## d ## e ## f

and then hoping that something like the following would work:


static double b = DOUBLE_NOTSOGOOD(-, 4, 01, +, 5);

An for most preprocessors it would: glued together from left to right each intermediate step would always consist of a valid preprocessor token. The actual rules of the preprocessor that allow for this are a bit more complicated, but basically in addition to alphanumeric tokens all starting parts of double constants (without prefix sign) are valid preprocessor tokens. ouff…

… you think. But there is a last subtlety which is the associativity of the ## operator. It is not specified whether or not it is from left to right. If we fall upon one that does it from right to left, we are screwed. So if we want to be portable, we have to go even further.

#define PASTE2(a, b) a ## b
#define _PASTE2(a, b) PASTE2(a, b)
#define PASTE3(a, b, c) _PASTE2(PASTE2(a, b), c)
#define PASTE4(a, b, c, d) _PASTE2(PASTE3(a, b, c), d)
#define PASTE5(a, b, c, d, e) _PASTE2(PASTE4(a, b, c, d), e)
#define PASTE6(a, b, c, d, e, f) _PASTE2(PASTE5(a, b, c, d, e), f)

static double b = PASTE6(4, ., 01, E, +, 7);

Detect empty macro arguments

The macro NARG2 that we introduced in post still has a major disadvantage, it will not be able to detect an empty argument list. This is due to a fundamental difference between C and its preprocessor. For C a parenthesis () is empty and contains no argument. For the preprocessor it contains just one argument, and this argument is the empty token.

So in fact NARG2 is cheating. It doesn’t count the number of arguments that it receives, but returns the number of commas plus one. In particular, even if it receives an empty argument list it will return 1. The following two macros better expresses this property:

#define _ARG16(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, ...) _15
#define HAS_COMMA(...) _ARG16(__VA_ARGS__, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)

As before, these are constrained to a maximum number of arguments (here 16), but you will easily work out how to extend it to larger maximum values.

Now, when we want to write a macro that detects an empty argument we will be using the feature that a function macro that is not followed by an open parenthesis will be left alone. This we will do with the following macro that just transforms the following parenthesis and contents into a comma.

#define _TRIGGER_PARENTHESIS_(...) ,

The idea is to put the arguments that we want to test between the macro and its parenthesis, such that the macro only triggers if the arguments are empty:


To do so we, must first check for some corner cases where special properties of the argument might mimic the behavior that we want to test.

#define ISEMPTY(...)                                                    \
_ISEMPTY(                                                               \
          /* test if there is just one argument, eventually an empty    \
             one */                                                     \
          HAS_COMMA(__VA_ARGS__),                                       \
          /* test if _TRIGGER_PARENTHESIS_ together with the argument   \
             adds a comma */                                            \
          HAS_COMMA(_TRIGGER_PARENTHESIS_ __VA_ARGS__),                 \
          /* test if the argument together with a parenthesis           \
             adds a comma */                                            \
          HAS_COMMA(__VA_ARGS__ (/*empty*/)),                           \
          /* test if placing it between _TRIGGER_PARENTHESIS_ and the   \
             parenthesis adds a comma */                                \
          HAS_COMMA(_TRIGGER_PARENTHESIS_ __VA_ARGS__ (/*empty*/))      \

Here we distinguish four different cases, of which the last is just the main idea as exposed above. The first helps to exclude the trivial case, that __VA_ARGS__ already contains a comma by itself. The two others test if the two possible combinations of _TRIGGER_PARENTHESIS_, __VA_ARGS__ and (/*empty*/) also already trigger a comma to their output.

Now the outcome of this will be calling the macro _ISEMPTY with four different 0-1-values according to different cases that __VA_ARGS__ presents. In particular, the case that __VA_ARGS__ is empty corresponds exactly to the outcome _ISEMPTY(0, 0, 0, 1). All other outcomes will indicate that it was non-empty. We will detect this case with the following helper macro.

#define _IS_EMPTY_CASE_0001 ,

and we leave all the other 15 cases undefined. Now with

#define PASTE5(_0, _1, _2, _3, _4) _0 ## _1 ## _2 ## _3 ## _4
#define _ISEMPTY(_0, _1, _2, _3) HAS_COMMA(PASTE5(_IS_EMPTY_CASE_, _0, _1, _2, _3))

we will exactly detect the case we are interested in.

As a test here comes all of that together in a block. This is not a reasonable C program but just something to run through the preprocessor to test the validity of the approach.

#define _ARG16(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, ...) _15
#define HAS_COMMA(...) _ARG16(__VA_ARGS__, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)
#define _TRIGGER_PARENTHESIS_(...) ,

#define ISEMPTY(...)                                                    \
_ISEMPTY(                                                               \
          /* test if there is just one argument, eventually an empty    \
             one */                                                     \
          HAS_COMMA(__VA_ARGS__),                                       \
          /* test if _TRIGGER_PARENTHESIS_ together with the argument   \
             adds a comma */                                            \
          HAS_COMMA(_TRIGGER_PARENTHESIS_ __VA_ARGS__),                 \
          /* test if the argument together with a parenthesis           \
             adds a comma */                                            \
          HAS_COMMA(__VA_ARGS__ (/*empty*/)),                           \
          /* test if placing it between _TRIGGER_PARENTHESIS_ and the   \
             parenthesis adds a comma */                                \
          HAS_COMMA(_TRIGGER_PARENTHESIS_ __VA_ARGS__ (/*empty*/))      \

#define PASTE5(_0, _1, _2, _3, _4) _0 ## _1 ## _2 ## _3 ## _4
#define _ISEMPTY(_0, _1, _2, _3) HAS_COMMA(PASTE5(_IS_EMPTY_CASE_, _0, _1, _2, _3))
#define _IS_EMPTY_CASE_0001 ,

#define EATER0(...)
#define EATER1(...) ,
#define EATER2(...) (/*empty*/)
#define EATER3(...) (/*empty*/),
#define EATER4(...) EATER1
#define EATER5(...) EATER2
#define MAC0() ()
#define MAC1(x) ()
#define MACV(...) ()
#define MAC2(x,y) whatever
ISEMPTY(a, b, c)
ISEMPTY(a, b, c, d)
ISEMPTY(a, b, c, d, e)
ISEMPTY((void), b, c, d)
/* This one will fail because MAC2 is not called correctly */

Edit: What is presented here is a version with minor improvement to capture the case of MAC0. Notice the restriction on the argument of ISEMPTY that is apparent with MAC2. In fact ISEMPTY should work when it is called with macros as argument that expect 0, 1 or a variable list of arguments. If called with a macro X as an argument that itself expects more than one argument (such as MAC2) the expansion leads to an invalid use of that macro X.

Default arguments for C99

Provide a way to have default arguments to functions in C99.

From C++ we know the concept of default arguments for functions. Briefly stated, this concept allows to call a function with less arguments than it is specified and provide default values for the missing ones. As an example lets take the following prototype:

      void one_or_two(int a, int b = 5);

Here one_or_two may be called with one or two arguments and in the case of one, the value 5 is provided.

The goal here is to have the same effect in C, C99 to be precise, by using the macro preprocessor and inline functions. To do that will use the following elements/features

  1. Macros that hide a function
  2. Counting macro arguments
  3. Choose the right version of a function call
  4. inline functions for default arguments

Macros that hide a function

An important property of the C preprocessor is its lack of recursion. At a first glance it looks that this is merely a restriction and not an advantage. But here the fact that using a macro name inside the expansion of that same macro is just left `as is’ is quite helpful. But even more than that, a macro that is defined to receive arguments, but that is found without following parenthesis, is also left alone. The global view of our macro will be something like the following:

#define one_or_two(...) ONE_OR_TWO_ARGS(one_or_two, __VA_ARGS__)

So here, when calling the macro in a program the macro definition will be inserted and the token one_or_two which is found there will be left as such. Then, the remaining part of the arguments is expanded and if ONE_OR_TWO_ARGS is itself a macro, it will be expanded.

Here we also use a feature that is only normalized since C99, variable macro arguments. This is indicated by the ... in the definition of one_or_two. By that, one_or_two may be called with any number of arguments and the token __VA_ARGS__ in the definition is replaced by the arguments, including the commas between them.

Counting macro arguments

To implement the macro ONE_OR_TWO_ARGS we will need to determine how many arguments it receives. This can be achieved with something like the following:

#define _ARG2(_0, _1, _2, ...) _2
#define NARG2(...) _ARG2(__VA_ARGS__, 2, 1, 0)

If NARG2 is called with two arguments the 2 of its expansion is in third position and we will see this 2. If it is called with just one argument the 1 will be in that place and thus be the result of the expansion. You probably easily imagine an extension of that macro to treat say 64 or 128 arguments.

Choose the right version of a function call

Finally we want two different versions of ONE_OR_TWO_ARGS, one if it is called with one argument and one for two:

#define _ONE_OR_TWO_ARGS_1(NAME, a) a, NAME ## _default_arg_1()
#define _ONE_OR_TWO_ARGS_2(NAME, a, b) a, b

Both macros receive in NAME the name of the function that this all about. But only _ONE_OR_TWO_ARGS_1 uses it to produce the name of a default function to call. If NAME would be one_or_two the function call produced by the preprocessor would be one_or_two_default_arg_1(). This is achieved by the ## operator of the preprocessor that glues two adjacent tokens into one.

A mechanism to choose between the two now could look like this

#define __ONE_OR_TWO_ARGS(NAME, N, ...) _ONE_OR_TWO_ARGS_ ## N (NAME, __VA_ARGS__)

__ONE_OR_TWO_ARGS is supposed to receive N the actual number of arguments in __VA_ARGS__ as it second argument and constructs a call to either _ONE_OR_TWO_ARGS_1 or _ONE_OR_TWO_ARGS_2.

_ONE_OR_TWO_ARGS is just an intermediate step that ensures that all arguments are evaluated sufficiently often such that really the value of the call to NARG2 is put in place. Otherwise _ONE_OR_TWO_ARGS_ and NARG2 would be glued into a single (sensless) token _ONE_OR_TWO_ARGS_NARG2.

Finally, ONE_OR_TWO_ARGS places the name of the function, determines the number of arguments it has received and calls _ONE_OR_TWO_ARGS.

inline functions for default arguments

Remains to define the function for the default argument. It might look like the following:

static inline
int one_or_two_default_arg_1(void) {  return 5; }

Here the inline keyword (stolen from C++ and new in C99) suggests that the function body should be `substituted’ in place where a call to it appears. So at a first glance it looks that this defines something similar to a macro, but actually an important difference is the point where evaluation takes place; an inline function itself is evaluated in the context in which it is defined. Only its arguments are evaluated at the place it is called. On the other hand, a macro is not evaluated at the point of its definition but at the point of its call.

To see that take the case that a default function is not just evaluating a constant expression but doing some real work.

static inline
int inline_counter_default_arg_1(void) { ++myCounter, return myCounter; }
#define macro_counter_default_arg_1(void) (++myCounter)

Here inline_counter_default_arg_1 uses the global counter variable myCounter that is visible at the point of its definition. If there is none, this results in an error. macro_counter_default_arg_1 evaluates myCounter in the context of the caller, and this might actually refer to different variables at different places. The first results in the rule which C++ implements for default arguments: they are evaluated in the scope of definition. The second one is a different model for which C++ has no equivalent.

Finally a complete example.

#include <stdio.h>
#define _ARG2(_0, _1, _2, ...) _2
#define NARG2(...) _ARG2(__VA_ARGS__, 2, 1, 0)
#define _ONE_OR_TWO_ARGS_1(NAME, a) a, NAME ## _default_arg_1()
#define _ONE_OR_TWO_ARGS_2(NAME, a, b) a, b
#define __ONE_OR_TWO_ARGS(NAME, N, ...) _ONE_OR_TWO_ARGS_ ## N (NAME, __VA_ARGS__)
#define one_or_two(...) ONE_OR_TWO_ARGS(one_or_two, __VA_ARGS__)

// function definition, also calls the macro, but you wouldn't notice
void one_or_two(int a, int b) { printf("%s seeing a=%d and b=%d\n", __func__, a, b); }

static inline int one_or_two_default_arg_1(void) {  return 5; }

int main (void) {
  // call with default argument
  // call with default argument
  one_or_two(6, 0);
  // taking a function pointer still works
  void (*func_pointer)(int, int) = one_or_two;
  // But this pointer may only be called with the complete set of
  // arguments
  func_pointer(3, 4);

Right shift on signed types is not well defined

The shift operators (<< and >>) shift the bits in a word to the left or the right. From such an explanation it doesn’t follow directly what should happen with the bits at the word boundaries. There are several commonly used strategies

  • logical:
    Bits that go beyond the word boundary are dropped and the new positions are filled with zeroes.
  • ones:
    Bits that go beyond the word boundary are dropped and the new positions are filled with ones.
  • arithmetic:
    1. Shift is `logical’ for positive values.
    2. For negative values right shift is `ones’ and
    3. left shift is `logical’ but always sets the highest order bit (sign bit) to 1.
  • circular: Bits that go beyond the word boundary are reinserted at the other end.

`Arithmetic’ shift has its name from the fact that it implements an integer multiplication or division by a power of two.

For unsigned integer types C prescribes that the shift operators are `logical’ . So e.g (~0U >> 1) results in a word of all ones but for the highest order bit which is 0. The picture darkens when it comes to signed types. Here the compiler implementor may choose between a `logical’ and an `arithmetic’ shift. Basically this means that the use of the right shift operator on signed values is not portable unless very special care is taken. We can detect which shift is implemented by the simple expression ((~0 >> 1) < 0)

  • If the shift is `logical’ the highest order bit of the left side of the comparison is 0 so the result is positive.
  • If the shift is `arithmetic’ the highest order bit of the left side is 1 so the result is negative.

Observe in particular that in case of an arithmetic shift (~0 >> 1) == ~0. So this operator has two fixed points in that case, 0 and -1. If we want a portable shift we may choose the following operations

#define LOGSHIFTR(x,c) (((x) >> (c)) &amp; ~(~0 << (sizeof(int)*CHAR_BIT - (c)))

This produces a mask with the correct number of 1’s in the low order bits and performs a bitwise and with the result of the compiler shift. Observe

  • This supposes that x is of type int, a type independent definition would be much more complicated.
  • c is evaluated twice so don’t use side effects here.

Here is a C99 program to test your compiler.

#include <limits.h>
#include <stdio.h>

int logshiftr(int x, unsigned c);

int arishiftr(int x, unsigned c);

#define HIGHONES(c) ((signed)(~(unsigned)0 << (sizeof(signed)*CHAR_BIT - (c))))
#define HIGHZEROS(c) (~HIGHONES(c))

int logshiftr(int x, unsigned c) {
  return (x >> c) &amp; HIGHZEROS(c);

int arishiftr(int x, unsigned c) {
  return logshiftr(x, c) ^ (x < 0 ? HIGHONES(c) : 0);

int main(int argc) {
  int b = argc > 1 ? argc : 0;
  int val[11u] = { b, b + 1, b - 1, b + 2, b - 2, b + 3, b - 3, b + 4, b - 4, b + 5, b - 5};
  for (unsigned sh = 1; sh < 3; ++sh)
    for (unsigned i = 0; i < 11u; ++i)
             (val[i] >> sh),
             logshiftr(val[i], sh),
             arishiftr(val[i], sh));

How many bits has a byte?

I recently stumbled about this seemingly silly question when trying to write a C macro that depends on the width of a type.

So everybody knows the short answer, 8, as is also expressed in the commonly used French word for byte: `octet’. But surprisingly for me the long answer is more complicated than that: it depends on the historical context, the norms that you apply, and even then you have to dig a lot of text to come to it.

C has a definition of the type char, and the language specification basically uses the terms char and  byte interchangeably.

Historically, in times there have been platforms with chars (and thus bytes) that had a width different from 8, in particular some early computers coded printable characters with only 6 bits and had a word size of 36. And later other constructors found it more convenient to have words of 16 bits to be the least addressable unit. C90 didn’t wanted to exclude such platforms and just stated

The number of bits in a char is defined in the macro CHAR_BIT
CHAR_BIT can be any value but must be at least 8

and even C99 still just states:

A byte contains CHAR_BIT bits, and the values of type unsigned char range from 0 to (2^CHAR_BIT) – 1.

But then, on the page for the include file stdint.h it states

The typedef name int N _t designates a signed integer type with width N, no  padding  bits,  and  a  two’s-complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits.

So far so good, if there is an int8_t we can deduce that sizeof(int8_t) must be 1 and CHAR_BIT must be 8. But then the POSIX standard says

The following types are required:

Which forces CHAR_BIT to be 8, and basically also implies that at least for small width types the representation must be two’s-complement on any POSIX compatible platform.

More reading:

Some forum discussion
The POSIX specification of stdint.h limits.h