Jens Gustedt's Blog

April 2, 2014

Don’t use casts

Filed under: C11, C99, language — Jens Gustedt @ 01:50

I recently reviewed a document on security recommendations and was baffled by the fact that the code examples were sprinkled with casts all over the place. I had thought that people concerned with software security would adhere to one of the most important rules in C programming:

Casts are harmful and evil.

The “evil” here is to be read as a reference to black magic. Most uses of casts are merely done in the spirit of “casting a spell” by people who try to quieten their compiler. The sorcerer’s apprentice approach: if I don’t see the evil, it isn’t there.

For me it is evident that every cast punches a hole in C’s type system. So, concerned with code security, we should avoid them as much as possible. But this evidence doesn’t yet seem to be shared (meaning that it is not so evident 🙂), so I decided to explain things here in more detail.

Casts (explicit conversions) in C come in three different flavors, depending on the cast-to and cast-from types:

  1. pointer to pointer
  2. pointer to integer or vice versa
  3. integer to integer

Today I will talk about the first case, explicit pointer-to-pointer conversion, and I will try to go from the more to the less severe. Less severe are in particular casts that involve void* and pointers to character types, so I will discuss those later.

Casting away type

The most severe use of cast is what would be called a reinterpretation pointer cast in C++, followed by a dereference operator that then is used on the left side of an assignment:

double A = 0.0;               /* An extremely bad code example */
(*(unsigned long*)&A) = 0x55433333UL;     /* alignment problem */
printf("modified double value %A\n", A);  /* aliasing  problem */

Besides being barely readable, such code has two major problems. First, data of different types can have different alignment properties, and reading or modifying a value through an inappropriately typed pointer easily leads to a “bus error”, a runtime alignment error that aborts the program. Second, C’s aliasing rules state that in the above example the printf call can pretend that variable A wasn’t modified. The optimizer is only forced to honor modifications made through properly typed memory. In this case, A could simply still be assumed to be 0.0, and the whole exercise would be void.

A bit less severe is just reading such a value:

double a = 45.0;            /* A very bad code example */
printf("bit interpretation %#lX\n",
       *(unsigned long*)&a);      /* alignment problem */

Here, the alignment problem remains.

There are simple cures for this, depending on the use case. But the very first question that you should ask yourself when you are tempted to do such nasty things is:

Is my approach useful, secure, portable and maintainable?

The only acceptable reason that I came up with when thinking about this was bit manipulation or inspection.
Fiddling around with bit representations of values may, sometimes, make sense, but then it should be implemented with the greatest precaution. As we are then in the representation domain and not in the value domain, much of what is done here is platform dependent and not portable.

The first problem that the above bad code addresses is the manipulation of the representation of floating point values. Here the C library has a whole bunch of proper tools that can help:

  • The fpclassify, isfinite, … macros to assess whether a value is normal, subnormal, etc.
  • The signbit macro for the value of the sign bit, in cases where a simple comparison to 0.0 doesn’t do the trick.
  • The type-generic function or macro frexp to split a floating point value into its mantissa and exponent.
  • The type-generic function or macro ldexp to load a specific exponent into a floating point value.
  • The printf and scanf format specifier “%a” for lossless output or input of floating point values.

In any context, the possible small performance penalty of using these tools shouldn’t be relevant. (If you use such hackery in performance critical code, you have a severe design problem. Stop it, and think.)

If types other than floating point are involved (or you are just sure that you want to set an object to a certain bit pattern), a portable way to do such inspections is a union. The myth that using unions for this leads to undefined behavior is tenacious, probably because it is UB in C++. C is different.

union {                                      /* A good code example */
  double A;
  unsigned long B;
} A = {
  .B = 0x55433333UL,                        /* no alignment problem */
};
printf("modified double value %A\n", A.A);  /* no aliasing  problem */

The other general possibility foreseen by the C standard is to pass through memcpy:

double A;                                  /* A good code example */
unsigned long B = 0x55433333UL;
memcpy(&A, &B, sizeof B);                 /* no alignment problem */
printf("modified double value %A\n", A);  /* no aliasing  problem */

Both methods still suppose that you know what you are doing, namely that the bit pattern you impose is valid.

Casting away constness

This is a case that is a bit more difficult to handle, since in many of the cases where the need for such a cast seems to arise, there is effectively a conceptual problem. A function that receives a const qualified object should not change that object. Period.

Two special cases arise for that, though.

  1. The underlying type needs a certain field to be modifiable, even just to read the contents of the object. In C++ we could model such a situation by declaring a field mutable; in C we don’t have that concept.
  2. Pointers to multidimensional arrays with a const qualified base type.

For the first there is a workaround: model a “mutable” field through a pointer type

struct shnuck {
  atomic_flag* flg;
  double value;
};
#define TOTO_INITIALIZER(VAL) {            \
    .flg = &(atomic_flag)ATOMIC_FLAG_INIT, \
    .value = (VAL),                        \
  }

The field is then initialized by using a compound literal.

I don’t expect this method to be adopted widely. We’d have to handle objects differently according to their storage class, an approach that is not easy to design and to follow.

I have already discussed the second case in an earlier post. Unfortunately, it hits one of the design flaws of the C language itself, and there is not much that we can do about it. Basically, this design flaw has the effect that for functions that receive multi-dimensional arrays, the base types of these arrays can’t be const-qualified.

Reinterpreting a representation as raw bytes (unsigned char*)

There is one fallback type for the reinterpretation case above that is guaranteed to work: unsigned char. By definition, there are no alignment difficulties when we cast to an unsigned char*.

double A = 0.0;                            /* An ugly code example */
((unsigned char*)&A)[0] = 0x55;           /* no alignment problem */
((unsigned char*)&A)[1] = 0x43;
((unsigned char*)&A)[2] = 0x33;
printf("modified double value %A\n", A);   /* no aliasing  problem */

Here also the union approach leads to nicer code:

union {
  unsigned char B[sizeof(double)];
  double A;
} A = {
  .B = {
    [0] = 0x55,                             /* no alignment problem */
    [1] = 0x43,
    [2] = 0x33,
  },
};
printf("modified double value %A\n", A.A);   /* no aliasing  problem */

Again, for both you have to know what you are doing: here this concerns the values for the bytes and the endianness of the machine. Be careful.

In any case, here also the version with casts can be avoided.

Stripping an object of its type, or inventing a new one (void*)

I don’t really understand the crying need of some people to cast void* values back and forth.

  • Converting a pointer value from void* to some pointer to a complete type provides a (new) type to the underlying object.
  • Converting a pointer value to void* strips any type interpretation from the object.

There are different contexts in which void* appears “naturally”:

  1. Dealing with allocation functions, malloc and friends
  2. Dealing with type generic interfaces.


For the first, there is a general confusion among many who feel the need to cast the return of malloc to the target type.

  • This has no real advantage since the target type is known anyhow.
  • This can be source of subtle bugs.

The latter is a severe security issue, because a cast is not only a cast to some desired type, but also a cast away from the undesired one. So you’d better be pretty sure about the source type. If you forget to include the header declaration for malloc, the return value is interpreted by many compilers as int, the most-significant part of the pointer value can be dropped; in short, a catastrophe. All of this for an unspecific gain.

One source for this confusion may be the common idiom

toto* p;
p = malloc(sizeof(toto));

Since this places the first assignment far from the definition of the pointer, our stressed programmer is maybe not so sure anymore what type he had given to p, so he tries to be smart and forces the object to be interpreted as toto*.

p = (toto*)malloc(sizeof(toto)); /* bad, double redundancy isn't security */

So the first error lies in separating the definition of a variable from its initialization. This is merely a question of style and readability and not of security, but already

toto* p = malloc(sizeof(toto)); /* still violates one definition rule */

is much easier to capture. But, in the end, there is a far superior solution that doesn’t need any redundancy:

toto* p = malloc(sizeof *p); /* one definition of type and size only */

So the real problem is that people forget that sizeof is an operator and not a function, and they hide that foolery by casting a spell.

Type generic interfaces

There are several type generic interfaces in the C standard (and also in POSIX) that deal with application-provided functions with void* or void const* parameters. Examples are:

  • qsort and qsort_s
  • bsearch and bsearch_s
  • thrd_create
  • tss_create

Let us look at a fictive comparison function for qsort, where a sloppy programmer has forgotten about the const qualification:

int toto_cmp(void const* a, void const* b) {
  toto* A = (toto*)a;   /* may lead to undefined behavior, bad */
  toto* B = (toto*)b;   /* may lead to undefined behavior, bad */
  /* no protection against unintentional modification here */
  int ret = (*A = *B);  /* segfault in case of immutable storage */
  return ret;
}

Eliminating the cast uncovers the first bug:

int toto_cmp(void const* a, void const* b) {
  toto* A = a;      /* constraint violation, good */
  toto* B = b;      /* constraint violation, good */
  /* no protection against unintentional modification here */
  int ret = (*A = *B);
  return ret;
}

Repairing the first bug leads us to the second one:

int toto_cmp(void const* a, void const* b) {
  toto const* A = a;      /* implicit conversion, good */
  toto const* B = b;      /* implicit conversion, good */

  /* no protection against unintentional modification here */
  int ret = (*A = *B);    /* constraint violation, good */
  return ret;
}

And finally we arrive at the intended code:

int toto_cmp(void const* a, void const* b) {
  toto const* A = a;       /* implicit conversion, good */
  toto const* B = b;       /* implicit conversion, good */
  int ret = (*A == *B);    /* just a comparison, good */
  return ret;
}

So here again, we are able to reduce the region of the code where the conversion takes place to a minimum. And we are able to make the conversion visible by using an implicit conversion in a controlled context.



  1. every new article is a revelation of something I was using improperly, thanks Jens

    Comment by Ulterior — April 2, 2014 @ 02:58

  2. You could just declare cmp_toto as:

    int cmp_toto(toto const* a, toto const* b);

    Comment by tupster — April 19, 2014 @ 05:03

    • (first it is toto_cmp and not cmp_toto, I always use a prefix to indicate some sort of name spacing.)


      int toto_cmp(toto const*a, toto const*b);

      can’t work with qsort and friends. The type of the function would not be compatible with that interface; these kinds of interfaces in C don’t have any type information, so you’d have to tell them somehow.

      Comment by Jens Gustedt — April 19, 2014 @ 07:11

  3. What about casts between signed and unsigned types? How can we remove them? Like in:

    int h(int n, unsigned int max) {
      return (n >= 0 && (unsigned int) n < max);
    }

    Comment by asl — June 4, 2014 @ 11:59

    • These kind of casts should be subject of a second article in this series, on integer to integer casts. I just didn’t manage to write it, yet 😦
      [Or should I say, I just managed not to write it yet?]

      In fact, because you test for n >= 0, here in this special case, you just don’t cast. The implicit conversion rules are supposed to do the right thing.

      If your compiler complains, that is a different thing; it is clearly a bug of your compiler. It simply gets things wrong: a conversion of a positive signed integer to the corresponding unsigned type is always safe.

      Comment by Jens Gustedt — June 4, 2014 @ 13:48

      • So why can’t gcc or llvm/clang figure out that it’s always positive so the cast isn’t needed?

        Comment by ig — July 2, 2014 @ 13:52

        • The versions of gcc and clang that I have on my machine don’t need the cast. Do you observe something different?

          Comment by Jens Gustedt — July 2, 2014 @ 14:23

          • Oh I was talking about compiling the code above by invoking gcc and clang with the -Wsign-compare flag, which will give a warning. Isn’t that the case in your versions? I’d like them to not emit warnings in these cases even with the warning flag on.

            $ cat test.c
            int h(int n, unsigned int max) {
              return (n >= 0 && n < max);
            }
            $ gcc -Wsign-compare -c test.c
            test.c: In function ‘h’:
            test.c:2:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
              return (n >= 0 && n < max);
            $ clang -Wsign-compare test.c -c
            test.c:2:22: warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
                    return (n >= 0 && n < max);
                                      ~ ^ ~~~
            1 warning generated.

            I was asking why the compilers aren’t smart enough to omit generating these warnings where the code proves that it shouldn’t be a problem. I’m mostly using the integer-integer casts in these cases to suppress warnings.

            Comment by ig — July 3, 2014 @ 01:47

            • No, I never use the warnings that come with -Wextra, only -Wall. These extra things just add noise. And if they aren’t even capable of detecting that case here, that makes them pretty useless.

              Comment by Jens Gustedt — July 3, 2014 @ 19:34
