I recently reviewed a document on security recommendations and was baffled to find its code examples sprinkled with casts all over the place. I had thought that people concerned with software security would adhere to one of the most important rules in C programming:
Casts are harmful and evil.
The “evil” here is to be read as a reference to black magic. Most casts are written merely in the spirit of “casting a spell” by people who try to quieten their compiler. It is the sorcerer’s apprentice approach: if I don’t see the evil, it isn’t there.
For me it is evident that every cast punches a hole in C’s type system. So if we are concerned with code security, we should avoid casts as much as possible. Since a cast in C is either trivially avoidable or a sign of really bad design, secure code just shouldn’t have them. Where it is trivial, just don’t do it; where there is bad design, you’d have to change the design in any case. But this evidence doesn’t yet seem to be shared (meaning that it is not so evident :) and so I decided to explain things here in more detail.
Casts (explicit conversions) in C come in three different flavors, depending on the cast-to and cast-from types:
- pointer to pointer
- pointer to integer or vice versa
- integer to integer
As I showed in this post, using > as a right angle bracket was not a particularly good idea, but trying to patch this misdesign makes it even worse. After a bit of experimenting I found an expression that is in fact valid for both C++98 and C++11, but that has a different interpretation in each language:
fon< fun< 1 >>::three >::two >::one
So if you have to maintain a large code base with templates that depend on integers that are perhaps produced automatically by some tools, be happy: you will not be out of work for a while. Changing your compiler to C++11 might change the semantics of your code.
It has been a long time since I last looked into C++, I have to admit. By coincidence I recently unearthed a hilarious example that I had once written that shows the difficulty of parsing some C++ code, for compilers as well as for us poor humans. It all starts with the >> operator that (supposedly until C++11) could cause problems as in the following:
toto< tutu< 3 >> A;
Here the >> is (was) interpreted as the “right shift” operator and thus this code would create a compile-time error. C++11 changed this by allowing the right-shift token to close the two template angle brackets in that case. The argument is that shift operators in template arguments are rare (which is probably true), and so this sacrifices some valid uses of that operator for the sake of causing less brain damage to C++ newbies.
Let’s take the occasion of the change back from DST here in Europe (not yet in the US) to look at how times are handled in C.
The C standard proposes a large variety of types for representing times: double and textual representations as char. It is a bit complicated to find out what the proper type for a particular purpose is, so let me try to explain this.
The first class of “times” can be classified as calendar times: times with a granularity and range as they would typically appear in a human calendar, as for appointments, birthdays and so on. Some of the functions that manipulate these in C99 are a bit dangerous because they operate on global state. Let us have a look at how these interact:
Sometimes in C it is useful to distinguish whether an expression is an “integer constant expression” or a “null pointer constant”. E.g., for an object that is allocated statically, only such expressions are valid initializers. Usually we are able to determine that directly when writing an initializer, but if we want to initialize a more complicated struct with a function-like initializer macro, with earlier versions of C we have the choice:
- Use a compiler extension such as gcc’s
- Write two different versions of such a macro, one for static allocation and one for automatic.
In the following I will explain how to achieve such a goal with C11’s _Generic feature. I am not aware of a C++ feature that provides the same possibilities. Also, this uses the ternary operator (notably different in C and C++), so readers who come purely from that community should read the following with caution.
Again I had a discussion with someone from a C++ background who claimed that one should use signed integer types where possible, and who also claimed that the unsignedness of size_t is merely a historical accident and would never be defined as such nowadays. I strongly disagree with that, so I decided to write this up, for once.
What I write here will only work with C, and can possibly be extended to C++ and other languages that implement unsigned integer types, e.g., good old Pascal had a
Somewhat hidden in Annex K, C11 introduces a new term into the C standard, namely runtime-constraint violations. They offer an important change of concept for the functions that are defined in that annex: if such a function is, e.g., called with invalid parameters, a specific function (called the runtime-constraint handler) is called, which could, e.g., abort the program or just issue an error message. This is in sharp contrast to the runtime error handling in the rest of the C standard, where the behavior under such errors is mostly undefined (anything may happen then) or sometimes delegated to implementation-defined behavior (and thus poorly portable and predictable).
Annex K, obscurely named “Bounds-checking interfaces”, introduces some typedefs and a series of replacement functions for many C library functions. The function names in this series are usually derived from the name of the function they replace by adding the suffix _s, e.g., the function qsort gets a “secure” twin interface called qsort_s, as we have seen in an earlier post.
I recently started to implement parts of the “Bounds-checking interfaces” of C11 (Annex K) for P99 and observed a nice property of my implementation of qsort_s. Since for P99 basically all functions are inlined, my compilers (gcc and clang) are able to integrate the comparison functions completely into the sorting code, just as an equivalent implementation in C++ would achieve with
A while ago I wrote about Linux futexes as a really nice concept for a control data structure that goes beyond the ones that we learn or teach in school (mutex, semaphore, condition variable…). I have now gone one step further and integrated futexes into P99; if used on Linux this will naturally use the corresponding Linux feature under the hood, and on other platforms a C11 thread implementation using mutexes and condition variables can be used.
One of the real disadvantages of most control structures is that they have two very different kinds of events: user events (e.g., a call to cnd_signal) and system events, often called “spurious wakeups”. Unless we program system code, these spurious wakeups are just an annoyance. They are easily forgotten during development and lead to subtle bugs that only appear under heavy load or when changing the platform, and handling them often makes the user code overly complex.
p99_futex is designed to work around this type of problem, while still providing a close integration of the control structure into the system and efficiently distinguishing a “fast path” for operations from a “slow path” where we handle congestion. It provides a counter, similar to a condition variable, that allows atomic increments and waiting on it, just as the Linux system call does. (Only that for ideological reasons the base type is an unsigned, instead of an int as in Linux.)
C11 has added an attempt to force compilers to initialize the padding of structures and unions under certain circumstances. Unfortunately the situation has become confusing, since the standard still allows padding to be treated differently from other parts of structures that are not initialized explicitly.