Demystify undefined behavior

In discussions about C or other standards (e.g POSIX) you may often have come across the term undefined behavior, UB for short. Unfortunately this term is sometimes mystified and even used to scare people unnecessarily, claiming that arbitrary things even outside the scope of your program can happen if a program “encounters undefined behavior”. (And I probably contributed to this at some point.)

First, this is just crappy language. How can something “encounter” behavior?  It can’t. What is simple meant by such slang is that such a program has no defined behavior, or stated yet otherwise, the standard doesn’t define what to do if such a program reaches the state in question. The C standard defines undefined behavior

behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements

That’s it.

Simple examples of undefined behavior in C are out-of-bounds access to arrays or overflow of signed integer types. More complicated cases arise when violating aliasing rules or when access to data races between different threads.

Now we often hear statements like “UB may format your hard drive” or “UB may steal all your money from your bank account” implying somehow that a program that is completely unrelated to my disk administration or to my online banking, could by some mysterious force have these evil effects. This is (almost) complete nonsense.

If UB in a simple program of yours formats your hard drive, you shouldn’t blame your program. No simple application run by an unprivileged user should have such devastating consequences, this would be completely inappropriate. If such things happen, it is your system which is at fault, get you another one.

As an analogy from every-day life, take the idea of locking your house at night, which seems to be the rule in some societies. Sure, if you don’t do that, you make it easier for somebody to sneak into your house and shoot you.   But still, that person would be a murderer, to be punished for that by the law, no sane legal system would acquit such a person or see this as mitigating circumstances.

Now things are in fact different when there is a direct relation between the object of your negligence and the ill-effect that it causes. If you leave your purse on the table in the pub when going to the bathroom, you should take a share in responsibility if it is stolen. Or to come back to the programming context, if you are programming a security sensible application (e.g handling passwords or bank credentials) then you should be extremely careful and stay on well-defined grounds.

When programming in C there are different kinds of constructs or data for which the program behavior then is undefined.

  • Behavior can be explicitly declared undefined or implicitly left undefined.
  • The error can be detectable or undetectable at compile time.
  • The error can be detectable or undetectable during execution.
  • Behavior of certain constructs can be undefined by a specific standard to allow extensions.

Unfortunately, people often use UB as an excuse to evacuate questions about certain behavior of their code (or compiler). This is just rude behavior by itself, and you usually shouldn’t accept this. An implementation should have reasons and policies how it deals with UB, otherwise it is just bad, and you should switch to something more friendly, if you may.

There is no reason not to be friendly, and there are many to be.

That is, in our context, once a (your!) code detects that it has a problem it must handle it. Possible strategies are

  • abort the compilation or the running program
  • return an error code
  • report the problem
  • define and document the behavior in your own terms

For the first you should always have in mind that the program should be debugable, so you really should use #error or static_assert for compile time errors and assert and abort for run time errors. Also, willingly aborting the program is not the same as having the program crash, see below.

Obviously, the second is only possible if the function in question has an established error return convention and if you can expect users of the function to check for that convention. POSIX has many such cases where the documentation says something like “may” return a certain error code. A C (or POSIX) library implementation that detects such a “may” case and doesn’t react accordingly, is of bad quality.

Reporting detected errors is an important alternative and some compiler implementors have chosen this as their default answer to problems. Perhaps you already have seen gcc’s famous “diagnostic” message

dereferencing pointer ‘bla’ does break strict-aliasing rules

This message supposes that you know what aliasing rules are, and that you also know why it came to that. Observe that it says “does” so the compiler claims to have proven that the aliasing rules are violated, and that the behavior is thus undefined. What the message doesn’t tell, and that is bad, is how it resolves the problem.

The last variant, defining your own stuff, should be handled with extreme care. Not only that what you define must be sensible, you also make a commitment in doing so. You promise your clients that you will follow these new rules in the future and that they may suppose that you will take care of the underlying problem. In most cases, you should leave the definition of extensions to the big players, platform designers or other standards. E.g POSIX defines a lot of cases that are UB for C as such.

As a last alternative, when

  • the error is undetectable
  • the detection of the error would be too expensive

you should simply let the program crash. The best strategy I know of is to always initialize all variables and always treat 0 as a special value. This may, under rare circumstance deal a tiny bit of performance against security, because on some rare occasions your compiler might not be able to optimize an unused initialization. If you do this, most errors that you will see are dereferences of null pointers.

Dereferencing a null pointer has UB. But modern architecture no how to handle this: they raise a  “segmentation fault” error and terminate the program. This is the nicest failure path that you can offer to your clients, failing anytime, anywhere.

P99 futexes: non-blocking integer valued condition variables

A while ago I already have written about Linux futexes as a really nice concept for a control data structure that goes beyond the ones that we learn or teach in school (mutex, semaphore, condition variable…). I have now gone one step further and integrated futexes into P99; if used on Linux this will evidently use the corresponding Linux feature under the hood, on other platforms a C11 thread implementation using mutexes and condition variables can be used.

One of the real disadvantages of most of the control structures is that they have two very different kinds of events: user events (e.g a call to cnd_signal) and system events, often called “spurious wakeups”. Unless we program system code, these spurious wakeups are just an annoyance. They are easily forgotten during development and lead to subtle bugs that only appear on heavy load or when changing the platform and handling them often makes the user code overly complex.

p99_futex are designed to work around this type of problems, by still providing a close integration of the control structure into the system and by efficiently distinguishing a “fast path” for operations from a “slow path” where we handle congestion. They provide a counter similar to a conditional variable that allows atomic increments and to wait for it, just as the Linux system call does. (Only that for ideological reasons the base type is an unsigned, instead of an int as in Linux.)

Continue reading “P99 futexes: non-blocking integer valued condition variables”

C11 defects: C threads are not realizable with POSIX threads

This post is mainly identical to a defect report that hopefully will be discussed by the C standards committee on their next meeting. I found that problem that this raises needs to be better known before people start using this interface more widely, so I decided to also publish it here.

The thread interfaces as they are declared in the threads.h header are largely underspecified, such that interpreting them is often just guess-work and leaves room for a wide range of interpretations. This is particularly irritating since there already is an ISO standard about threads that is quite elaborated and mature, namely ISO/IEC 9945:2009, commonly know as POSIX 2010. C11 mentions ISO/IEC 9945:2009, but completely misses to technically relate to it on the thread interface. The semantic specification of C11 threads is in parts so loose, that a stringent implementation of C11 threads on top of POSIX doesn’t seem possible.

Other platforms that are less formalized than POSIX have their own technical restrictions that should additionally be taken into account. The “other platform” for threads that clearly had been targeted by the committee are threads on Microsoft Windows platforms. Most other widely used commodity operating systems are POSIX compatible (from mainframes down to Android phones). But we should not underestimate the potential of the C threads interface. Because it has a reduced interface it might be suitable for a larger range of platforms than we can foresee today. Because C threads don’t enforce a complete share of the address space, such platforms could e.g be accelerators (providing a portable thread interface on GPU?) or networks on chips. The only memory that must be shared by C threads are objects with static storage duration and objects allocated through malloc and friends. Thus freestanding environments without malloc would only be required to shared statically allocated objects.

In the following I only give an incomplete list of the defects as I noticed them, I suspect that there might be a lot of others.

Continue reading “C11 defects: C threads are not realizable with POSIX threads”

emulating C11 threads through POSIX threads

This month a new version of the C standard, C11, has been published. It brings much less changes to C than C99 and hopefully it will be adopted more easily than that. But it will probably still take some time until the major compilers implement it.

Some of the changes that it brings are minor and can already be simulated within C99 or by extensions that some compilers implement already. Some of them already are emulated by P99. I’ll probably post about them one of these days.

The most important to me seem to be two optional additions to the C library: threads and atomic operations.

Continue reading “emulating C11 threads through POSIX threads”

socket programming and C99

Network socket programming (BSD sockets) is one of the dark arts, the point where legacy interfaces can become a real burden to the programmer. In this post I will propose some hints and techniques that might this a bit easier, provided your compiler is complying to C99. Since it seems that there has been some misunderstanding about this post: this is not a tutorial or introduction to socket programming. This is to show how C99 can help to deal with a badly designed legacy interface.

If you are interested in socket programming as such, you’d probably better look elsewhere. There is a lot of information available on the web. On the other hand if you don’t know much about sockets (of even nothing at all) you can just take this as an example of a type of interface with which you might be confronted when programming in C.
Continue reading “socket programming and C99”

linux futexes: non-blocking integer valued condition variables

POSIX’ condition variables pthread_cond_t unfortunately have some drawbacks:

  • They require the use of at least two separate other objects that are only loosely coupled with the condition variable itself:

    • A plain variable, say X of type int, that actually holds the value of which the condition depends.
    • A mutex that is just to regulate the access to X and to the condition variable.
  • Generally they are not lock free. To access X we have to

    • lock the mutex
    • inspect X
    • eventually wait for the condition to become true
    • do changes
    • eventually signal other waiters about the changes we made
    • unlock the mutex

Linux’ concept of futexes allows to associate an equivalent concept directly to the variable X and to access this variable without taking locks directly in most of the cases.
Continue reading “linux futexes: non-blocking integer valued condition variables”

Why sem_t is not suited as an atomic counter

C has itself no primitive for atomic access to variables. Whenever we want to deal with some sort of counter in a threaded environment (e.g for reference counting) we have thus to be careful not to erase data that other threads might have written or be just in the course of reading.

sem_t as a counter

At a first glance, sem_t seems to fit into that need. It has an increment function sem_post, a decrement sem_trywait and a value test with sem_getvalue. If we are careful about interrupt handling (see this post) this should give us access to a “good” counting device, one might think.

First if we would like to do this an implementation of a reference counter would be relatively tedious. sem_t is meant to be waited for if the counter is 0, but here we would need it the other way round: we would need to wait until the counter falls to 0. Nevertheless this is doable, not with one semaphore but with several.

Side effects

What is more annoying is a another defect of sem_t its ill-conceived API, namely its tendency to have side effects to the system call. The sem_trywait call can be used to decrement a counter without waiting, and in particular to know once the counter has reached 0. If the counter has been decremented, so it was not 0 before, sem_trywait returns 0, otherwise it returns -1.

But the later case has a side effect: errno is set to an error code. So in that case, the caller has not only to do what he has to do but also has to do repair work, namely reset errno to 0. This may sound harmless, but it isn’t: errno is not a simple variable but an

expression which expands to a modifiable lvalue that has type int

This must be so, since errno is guaranteed to be different per thread, one thread’s errno wouldn’t influence the one of another thread. C has no concept of thread-only variables, therefore the system usually as to deploy quite a machinery to achieve a suitable mechanism.

So generally the use of sem_t might turn out to be quite inefficient for such a purpose.

Alternatives

If the API for the function would have gone a different way, namely in returning the precise error code instead of -1, this would be much simpler to handle. No access to pseudo-global variables would be necessary, no side effects would take place.

This strategy has been chosen for the other POSIX lock primitives such as pthread_mutex_t or pthread_rwlock_t.
Both are suitable for the purpose. An implementation with one pthread_mutex_t, one pthread_cond_t and an unsigned should be relatively obvious.

pthread_rwlock_t is even directly suitable for a locking structure for resource control. Just have all users of a resource issue a pthread_rwlock_rdlock when they start to use it and a pthread_rwlock_unlock when they cease to do so. If the resource is unused may then simply tested by a call to pthread_rwlock_wrlock.

Avoiding name conflics for libraries

C programs usually use quite a lot of libraries with a lot of predefined names. These libraries may be coming from different origins, they may be part of the C runtime or your operating system (I’ll assume that this is POSIX), from third parties or be written by yourself or your collaborators. Conflicting names are a nasty thing to track down and may require big changes if they are detected too late.

Reserved Identifiers

All systems define a whole bunch of names that go far beyond just the keywords of the language. See the links that are provided below if you have a doubt for an individual name. Nevertheless there are systematic patterns that you should avoid for portability to other (future?) systems that you don’t have your hand on.

External Symbols in File Scope

All identifiers in file scope that start with an underscore _ are reserved for the runtime. So don’t use an identifier that could conflict with that rule, in particular

  • No globally visible identifier should start with an underscore. This concerns functions, variables, typedefs and enum constants.
  • Avoid names in tag space (struct union or enum) with an underscore. Although they are not in direct conflict with external symbols it might be difficult to find errors because of that.
  • Don’t #define names starting with an underscore. They might overwrite system symbols.

Reserved future keywords and preprocessor macros

Names starting with an underscore followed by a capital letter or a second underscore are reserved for future use as keywords and macros. This is e.g what happened during the extension of C from C89 to C99, where the keyword _Bool was included. Since this type of name was reserved beforehand, the standards body was free to chose this name for the new Boolean data type. The typedef bool that most people use is only defined in the standard header file stdbool.h. Only when including this header a conflict with that name (that had not been reserved previously) could occur.

Thus, in the inside of functions or as components of struct or union names with underscores would be ok, in general, as long as they don’t have a capital letter or a second underscore in second position.

Reserved future types

POSIX reserves all other names that end in _t for future use as types. Don’t use them, they are ugly anyhow.

Chose a prefix

If you expect/estimate/dream that the library that you are designing will be use a lot by others stick to a simple naming convention: chose a prefix, like demo_. This makes identifier conflicts easy to track and avoids potential conflicts with macro definitions that somebody else might use.

  struct demo_coucou {
     unsigned demo_age;
     double demo_weight;
  };

Observe that this also uses the prefix for individual fields in the struct. If we would just use age or weight they might conflict with some code that uses this as names of #define

Compatablity of C headers in C++

In C++ struct and union tags can be used as type identifiers iff they are not used in the identifier namespace. Using tag names and identifiers for different kinds of objects/concepts is a bad idea anyhow, so avoid that. I think the best way to do so is to typedef all struct and union types to the same name.

typedef struct demo_coucou demo_coucou;

For C, this allows you to do a forward declaration of struct demo_coucou and of demo_coucou in just one line. And for C++ this is valid, too, it just makes their convention of the implicit typedef explicit. As an additional advantage it hinders you to accidentally declare another identifier of that name yourself, which would much perturb C++ when it sees your declaration.

Further reading

how to make sem_t interrupt safe

The POSIX semaphore wait calls

int sem_wait(sem_t *sem);
int sem_trywait(sem_t *sem);
int sem_timedwait(sem_t *sem, const struct timespec *abs_timeout);

can be interrupted at any time, e.g by IO that is delivered or if a process child terminates. In such a case errno is set to EINTR:

ERRORS
EINTR The call was interrupted by a signal handler; see signal(7).

NOTES
A signal handler always interrupts a blocked call to one of these functions, regardless of the use
of the sigaction(2) SA_RESTART flag.

(Side note, sem_post is always atomic in that sense and will never return EINTR.)

So if we don’t use sem_t in a signal handler context and don’t check for EINTR on return of the wait, the whole semaphore approach becomes error prone. In particular there are good chances that with small test cases during development everything runs fine, but that then during production byzantine errors occur once in a while that will be very hard to track.

To test for EINTR systematically it is easy to write wrappers that just re-issue the corresponding wait whence the return indicates that the call was interrupted.

static inline
int sem_wait_nointr(sem_t *sem) {
  while (sem_wait(sem))
    if (errno == EINTR) errno = 0;
    else return -1;
  return 0;
}

static inline
int sem_trywait_nointr(sem_t *sem) {
  while (sem_trywait(sem))
    if (errno == EINTR) errno = 0;
    else return -1;
  return 0;
}

static inline
int sem_timedwait_nointr(sem_t *sem, const struct timespec *abs_timeout) {
  while (sem_timedwait(sem, abs_timeout))
    if (errno == EINTR) errno = 0;
    else return -1;
  return 0;
}

Since they use the inline keyword, the variants given here are only suitable for C99, but it should be easy for you to adapt them for other contexts. The inline here has the advantage that the call can be inlined efficiently at the caller side, basically resulting in the call of the real POSIX function in question plus some conditional jumps.

BTW, the evaluation of errno here has some thread magic. The POSIX standard guarantees that it must behave as if it were a local variable for each thread, so we don’t have to worry about concurrent access to it.