Today, Rich Felker has published the next release of musl, the lightweight, standards-conforming C library. He says:
October 14, 2014
April 2, 2014
I recently reviewed a document on security recommendations and was baffled that its code examples were sprinkled with casts all over the place. I had thought that people concerned with software security would adhere to one of the most important rules in C programming:
Casts are harmful and evil.
The “evil” here is to be read as a reference to black magic. Most uses of casts are merely done in the spirit of “casting a spell” by people who try to quieten their compiler. The sorcerer’s apprentice approach: if I don’t see the evil, it isn’t there.
For me it is evident that every cast punches a hole in C’s type system. So, if we are concerned with code security, we should avoid them as much as possible. But this view doesn’t yet seem to be shared (meaning that it is not so evident 🙂), and I decided to explain things here in more detail.
Casts (explicit conversions) in C come in three different flavors, depending on the cast-to and cast-from types:
- pointer to pointer
- pointer to integer or vice versa
- integer to integer
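As a rough illustration of these three flavors, here is a minimal sketch showing what a cast hides in each case, and a cast-free alternative where one exists. The function names are hypothetical and chosen only for this example; `float_bits` assumes that `float` is a 32-bit IEEE 754 type and that the optional types `uint32_t` and `uintptr_t` are available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* integer to integer: the cast silences any warning about narrowing;
   the value is silently reduced modulo UCHAR_MAX + 1 */
unsigned char narrow_with_cast(unsigned value) {
    return (unsigned char)value;
}

/* pointer to pointer: *(uint32_t*)&f would compile thanks to the cast,
   but it violates the aliasing rules; copying the bytes needs no cast */
uint32_t float_bits(float f) {
    uint32_t bits = 0;
    static_assert(sizeof bits == sizeof f, "sketch assumes a 32-bit float");
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* pointer to integer and back: there is no implicit conversion, so the
   casts are unavoidable here; uintptr_t is the type meant for this */
int roundtrips(void* p) {
    uintptr_t as_integer = (uintptr_t)p;
    return (void*)as_integer == p;
}
```

With an 8-bit `unsigned char`, `narrow_with_cast(300)` quietly yields 44; without the cast, a compiler with conversion warnings enabled would at least have a chance to complain.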
October 28, 2013
Let’s take the occasion of the change back from DST here in Europe (but not yet in the US) to look at how times are handled in C.
The C standard proposes a large variety of types for representing times:
double and textual representations as char. It is a bit complicated to find out what the proper type for a particular purpose is, so let me try to explain this.
The first class of “times” can be classified as calendar times: times with the granularity and range that would typically appear in a human calendar, as for appointments, birthdays and so on. Some of the functions that manipulate these in C99 are a bit dangerous because they operate on global state. Let us have a look at how these interact:
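To make the global state concrete, here is a minimal sketch (the function names are hypothetical). The standard allows gmtime and localtime to return a pointer to a single statically allocated struct tm, so a later call to either may overwrite an earlier result. Copying the result immediately is one way to limit the damage.

```c
#include <time.h>

/* Copy the broken-down time out of the shared static object right away,
   so that later calls to gmtime or localtime cannot change it. */
struct tm utc_fields(time_t t) {
    struct tm const* shared = gmtime(&t);
    struct tm copy = { 0 };
    if (shared) copy = *shared;   /* gmtime returns a null pointer on failure */
    return copy;
}

/* The hazard that the copy avoids: both pointers may refer to the same
   object, so *first may no longer describe t0 after the second call. */
int results_may_alias(time_t t0, time_t t1) {
    struct tm* first = gmtime(&t0);
    struct tm* second = gmtime(&t1);
    return first == second;
}
```

Whether `results_may_alias` actually returns 1 depends on the C library. Note also that the copy does not make the call thread safe, it only shortens the window during which the shared object is used.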
August 22, 2013
Sometimes in C it is useful to distinguish if an expression is an “integral constant expression” or a “null pointer constant”. E.g., for an object that is allocated statically, only such expressions are valid initializers. Usually we are able to determine that directly when writing an initializer, but if we want to initialize a more complicated struct with a function-like initializer macro, with earlier versions of C we have the choice:
- Use a compiler extension such as gcc’s
- Write two different versions of such a macro, one for static allocation and one for automatic.
In the following I will explain how to achieve such a goal with C11’s _Generic feature. I am not aware of a C++ feature that provides the same possibilities. Also, this uses the ternary operator (notably different in C and C++), so readers who come only from that community should read the following with caution.
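One way such a test can be built from _Generic and the ternary operator is sketched below. The macro name IS_ICE is hypothetical, and the sketch only handles operands of integer type. It relies on the rule that an integer constant expression of value 0 cast to void* is a null pointer constant, which changes the type of the conditional expression.

```c
#include <assert.h>

/* If X is an integer constant expression, (void*)((X)*0) is a null
   pointer constant and the conditional has type int*. Otherwise it is an
   ordinary void* value and the conditional has type void*. */
#define IS_ICE(X) \
    _Generic((1 ? (void*)((X)*0) : (int*)0), int*: 1, void*: 0)

enum { buffer_size = 16 };

/* Both of these are decided at compile time. */
static_assert(IS_ICE(42), "a literal is an integer constant expression");
static_assert(IS_ICE(buffer_size + 1), "so is an enumeration constant");

/* A function parameter is never an integer constant expression. */
int is_ice_of_parameter(int n) {
    return IS_ICE(n);
}
```

The controlling expression of _Generic is not evaluated, so X has no side effects here, whatever it is.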
July 15, 2013
Again I had a discussion with someone from a C++ background who claimed that one should use signed integer types where possible, and who also claimed that the unsignedness of size_t is merely a historical accident and that it would never be defined as such nowadays. I strongly disagree with that, so I decided to write this up, for once.
What I write here will only work with C, and can possibly be extended to C++ and other languages that implement unsigned integer types, e.g., good old Pascal had a
February 4, 2013
Somewhat hidden in Annex K, C11 introduces a new term into the C standard, namely runtime-constraint violations. They offer an important change of concept for the functions that are defined in that annex: if such a function is, e.g., called with invalid parameters, a specific function (called the runtime-constraint handler) is called, which could, e.g., abort the program or just issue an error message. This is in sharp contrast to the runtime error handling in the rest of the C standard, where the behavior under such errors is mostly undefined (anything may happen then) or sometimes relegated to implementation-defined behavior (and thus poorly portable and predictable).
Annex K, obscurely titled “Bounds checking interfaces”, introduces some typedefs and a series of replacement functions for many C library functions. The function names in this series are usually derived from the name of the function they replace by adding the suffix _s, e.g., the function qsort gets a “secure” twin interface called qsort_s, as we have seen in an earlier post.
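As a minimal sketch of how these pieces fit together (the names sort_ints, compare_ints_s and compare_ints are hypothetical), the following code installs the standard ignore_handler_s so that a runtime-constraint violation is reported through the return value of qsort_s. Annex K is optional and many C libraries do not provide it, so the sketch falls back to plain qsort when __STDC_LIB_EXT1__ is not defined.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* request Annex K, before any header */
#include <stddef.h>
#include <stdlib.h>

#ifdef __STDC_LIB_EXT1__
/* Comparison function for qsort_s, which passes a context pointer along. */
static int compare_ints_s(void const* a, void const* b, void* context) {
    int const* x = a;
    int const* y = b;
    (void)context;
    return (*x > *y) - (*x < *y);
}
#else
/* Comparison function for plain qsort. */
static int compare_ints(void const* a, void const* b) {
    int const* x = a;
    int const* y = b;
    return (*x > *y) - (*x < *y);
}
#endif

/* Returns 0 on success, nonzero if a runtime-constraint was violated. */
int sort_ints(int* values, size_t count) {
#ifdef __STDC_LIB_EXT1__
    /* Don't abort on a violation, just let qsort_s return nonzero. */
    set_constraint_handler_s(ignore_handler_s);
    return qsort_s(values, count, sizeof values[0], compare_ints_s, 0) != 0;
#else
    /* Without Annex K, invalid arguments simply have undefined behavior. */
    qsort(values, count, sizeof values[0], compare_ints);
    return 0;
#endif
}
```

With Annex K available, a call such as `sort_ints(0, 3)` would be caught as a runtime-constraint violation. With the fallback it is undefined behavior, which is exactly the contrast described above.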
December 4, 2012
I recently started to implement parts of the “Bounds checking interfaces” of C11 (Annex K) for P99 and observed a nice property of my implementation of qsort_s. Since for P99 basically all functions are inlined, my compilers (gcc and clang) are able to integrate the comparison functions completely into the sorting code, just as an equivalent implementation in C++ would achieve with
November 21, 2012
A while ago I wrote about Linux futexes as a really nice concept for a control data structure that goes beyond the ones that we learn or teach in school (mutex, semaphore, condition variable…). I have now gone one step further and integrated futexes into P99: on Linux this uses the corresponding Linux feature under the hood; on other platforms a C11 thread implementation using mutexes and condition variables can be used.
One of the real disadvantages of most of the control structures is that they have two very different kinds of events: user events (e.g., a call to cnd_signal) and system events, often called “spurious wakeups”. Unless we program system code, these spurious wakeups are just an annoyance. They are easily forgotten during development and lead to subtle bugs that only appear under heavy load or when changing the platform, and handling them often makes the user code overly complex.
p99_futex is designed to work around this type of problem, by still providing a close integration of the control structure into the system and by efficiently distinguishing a “fast path” for operations from a “slow path” where we handle congestion. It provides a counter, similar to a condition variable, that allows atomic increments and waiting on it, just as the Linux system call does. (Only that for ideological reasons the base type is an unsigned, instead of an int as in Linux.)
October 24, 2012
C11 has added an attempt to force compilers to initialize the padding of structures and unions under certain circumstances. Unfortunately the situation has become confusing now, since the standard still allows padding to be treated differently from other parts of structures that are not initialized explicitly.
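A minimal sketch of why this matters in practice (the type and function names are hypothetical): once a member has been stored to, the padding bytes of a structure take unspecified values, so a bytewise comparison with memcmp may differ for two objects whose members are all equal. A memberwise comparison never looks at the padding.

```c
#include <stdbool.h>
#include <string.h>

/* On most platforms there are padding bytes between tag and value. */
struct sample {
    char tag;
    long value;
};

/* Reliable: compares only the members, never the padding. */
bool sample_equal(struct sample const* a, struct sample const* b) {
    return a->tag == b->tag && a->value == b->value;
}

/* Unreliable: also compares the padding bytes, whose contents are
   unspecified after a store to the object or to one of its members. */
bool sample_same_bytes(struct sample const* a, struct sample const* b) {
    return memcmp(a, b, sizeof *a) == 0;
}
```

Two objects for which `sample_equal` holds may or may not satisfy `sample_same_bytes`, depending on how they were initialized and on what the compiler does with the padding.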
October 14, 2012
This post is largely identical to a defect report that hopefully will be discussed by the C standards committee at their next meeting. I found that the problem it raises needs to be better known before people start using this interface more widely, so I decided to also publish it here.
The thread interfaces as they are declared in the threads.h header are largely underspecified, such that interpreting them is often just guesswork and leaves room for a wide range of interpretations. This is particularly irritating since there already is an ISO standard about threads that is quite elaborate and mature, namely ISO/IEC 9945:2009, commonly known as POSIX 2008. C11 mentions ISO/IEC 9945:2009, but completely fails to relate to it technically on the thread interface. The semantic specification of C11 threads is in parts so loose that a rigorous implementation of C11 threads on top of POSIX doesn’t seem possible.
Other platforms that are less formalized than POSIX have their own technical restrictions that should additionally be taken into account. The “other platform” for threads that the committee had clearly targeted is Microsoft Windows. Most other widely used commodity operating systems are POSIX compatible (from mainframes down to Android phones). But we should not underestimate the potential of the C threads interface. Because it has a reduced interface it might be suitable for a larger range of platforms than we can foresee today. Because C threads don’t require the address space to be shared completely, such platforms could, e.g., be accelerators (providing a portable thread interface on GPU?) or networks on chips. The only memory that must be shared by C threads are objects with static storage duration and objects allocated through malloc and friends. Thus freestanding environments without malloc would only be required to share statically allocated objects.
In the following I give only an incomplete list of the defects as I noticed them; I suspect that there are many others.