Jens Gustedt's Blog

August 17, 2016

Effective types and aliasing

Filed under: C11, C99, compiler optimization — Jens Gustedt @ 13:49

I already posted about the evilness of casts some time ago, but recently I have seen that there is still much confusion out there about the C rules that pointer casts, also called type punning, break. So I will try to give the topic a more systematic treatment here.

In fact, I think we should distinguish two different sets of rules: the effective type rules that are explicitly written up in the C standard (6.5 p6 and p7), and the strict aliasing rules that are a consequence of these, but that concern only a very special use case of type punning. They represent only one of the multiple ways a violation of the effective type rules can damage the program state.

Note: To simplify things, I will not deal with qualifications such as const or volatile in the following, nor with the fact that types only have to be compatible to be correct.

Variables. For objects that have a declaration, that is all your plain, normal variables, the effective type rules are in fact quite simple.

Variables must be accessed through their declared type or through a pointer to a character type.

So taking the address of a variable with & and then casting the result to a different pointer type is always wrong.

float f = 55;
uint32_t trans = *(uint32_t*)&f;   // evil

Observe that there is no aliasing involved: there is only one pointer at any moment to the object f. There are many things that can go wrong on a particular platform with the above code: the alignment of f could be wrong for the new type, it could be allocated in a special memory section that speeds up floating point arithmetic, whatever. The C standard didn’t want to impose any restriction on what compilers can do with different data types, and so such access is simply forbidden. Don’t do it, there is no excuse for such evil code.

The only exception that is explicitly allowed is access through character types. That is, this code is ok:

float f = 55;
unsigned char* bytes = (unsigned char*)&f;
printf("the first two bytes are %hhu and %hhu\n", bytes[0], bytes[1]);

That is, inspecting the bytes of any object type is always permitted. In fact, if you are careful, you may even change the object representation of an object by changing its bytes through such a character pointer.

float f = 55;
uint32_t trans;
memcpy(&trans, &f, sizeof trans);   // ok

This is ok, because

  1. under the hood this reads f and modifies trans through character pointers
  2. the type uint32_t is such that all bit representations lead to a valid value.

Whereas accessing an arbitrary object through the byte angle is ok, the converse is not true:

Character arrays must not be reinterpreted as objects of other types.

Dynamically allocated objects. Objects that are not declared but allocated dynamically (e.g. with malloc) fall under rules that are a bit weaker. This is so because they initially have no type, and so we can’t require anything of them. This is where the notion of effective type comes into play. Such an allocated object is associated with a type, its effective type, once we write data with a known type into it. Such a “write” can happen through an assignment or through functions like memcpy.

A simple such case is if we access the object through a pointer with the sought type, and this should always be your preferred way to initialize a dynamically allocated object:

unsigned a = 55;
float* gp = malloc(sizeof *gp);
*gp = a;                          // *gp now is a float

Another is if we use memcpy from an object of known type:

float f = 55;
void* vp = malloc(sizeof f);
memcpy(vp, &f, sizeof f);        // *vp now is a float
float* gp = malloc(sizeof *gp);
memcpy(gp, &f, sizeof *gp);      // *gp now is a float

But with such copies we can easily get it wrong:

float f = 55;
uint32_t* up = malloc(sizeof *up);
memcpy(up, &f, sizeof *up);                  // *up now is a float!
printf("the value is %" PRIu32 "\n", *up);   // bad: accessing float as uint32_t

So you should always be careful and ensure that an object actually has the correct type.

Change the type of an object. Whereas the type of a variable can never change, the effective type of a dynamically allocated object actually can.

float f = 55;
void* vp = malloc(sizeof f);
memcpy(vp, &f, sizeof f);                   // *vp now is a float
float* fp = vp;
// do things with *fp
printf("the value is %g\n", *fp);           // ok
uint32_t u = 77;
memcpy(vp, &u, sizeof u);                   // *vp now is a uint32_t
uint32_t* up = vp;
printf("the value is %" PRIu32 "\n", *up);  // ok

Here, the second memcpy not only changes the values of each byte of the object, but it also changes its effective type.

Unions. C provides a simple tool to look at an object with different type views, a union. All this pointer casting garbage is really unnecessary if we just use the appropriate tool.

union both {
  float f;
  uint32_t u;
} b = {
  .f = 55,
};
printf("the value is %" PRIu32 "\n", b.u);   // ok

Here, we make it explicit that we want to see b as both, sometimes a float, sometimes a uint32_t. Now the compiler can take all the precautions that it needs to deal with that situation. This will be as efficient as the compiler can get it.

If you really need to reinterpret representations very often (but you shouldn’t) you can use a macro to do so:

#define CONV32(X) ((const union { float _f; uint32_t _u;}){ ._f = (X), }._u)

float f = 55;
printf("the value is %" PRIu32 "\n", CONV32(f));   // ok

This uses a temporary object of union type. Any decent compiler should realize this without actually using a temporary object and will most probably just move a value from one hardware register to another. But that is the compiler’s business, not ours.

Aliasing. Only if we have taken care of all of the above do we actually come to aliasing problems. Aliasing is the property of two pointers that point to the same object, and problems may occur if we change the underlying object through one of them.

In view of the effective type rule, C makes a simple assumption:

Non-character pointers of different type cannot alias.

That is, if a function sees two pointers as here

void doit(float* fp, uint32_t* up) {
   *up = 3;
   *fp = 77;
   printf("the value is %" PRIu32 "\n", *up);   // may always print 3
}

it can always assume that the change of *fp doesn’t affect the object *up and thus that the value to be printed is 3, unconditionally. If you messed around with the types and passed pointers to the same object, you are on your own, even if, maybe, your type reinterpretation was correct through one of the exceptions that we discussed above.

So it is very likely that when you stumble into an aliasing problem with code as in this function, you already are in nowhere land of undefined behavior, because you had your types wrong from the beginning.

Wait, so why does my networking code actually work? Traditional networking code with its reinterpretation of socket address data is notorious for stretching the effective type rules to their very limits.

Consider the following typical network code snippet:

void fun(int fd) {
  socklen_t addrlen = sizeof(struct sockaddr_storage);
  void* p = malloc(addrlen);
  struct sockaddr_storage* sockp = p;     // *sockp has no type
  int ret = getsockname(fd, p, &addrlen); // *sockp now has type, but which?
  if (ret) { /* handle error */ }
  switch (sockp->ss_family) {             // valid access
  case AF_UNIX: {
    struct sockaddr_un* sockp = p;        // *sockp is a unix socket
    // do something
    break;
  }
  case AF_INET: {
    struct sockaddr_in* sockp = p;        // *sockp is an internet v4 socket
    // do something
    break;
  }
  }
  free(p);
}

This approach only works because of a very specific chain of events. The first important event is that the function getsockname provides a type for the object behind p (or maybe changes that type, if it had another one before). Now we have learned above that this is only possible because that object is allocated dynamically.

Then, the next critical access is to read the ss_family member of that object. This is possible because the definition of all sockaddr derived types is such that they all have a member of the same type (sa_family_t) at exactly the same position in the structure. Thus reading sockp->ss_family obeys the effective type rules.

Finally, when we are inside the different cases, we know which of the types is the effective type and we can correctly access and deal with data of a unix or inet socket, for example.

So, all of this only worked because of a very precise sequencing of accesses and because the object had been allocated dynamically.

I personally prefer to go for less error prone networking code and to use a union:

void fan(int fd) {
  union {
    struct sockaddr sa;                    // generic view, as expected by the socket API
    struct sockaddr_storage ss;
    struct sockaddr_un un;
    struct sockaddr_in in;
  } sock;
  socklen_t addrlen = sizeof sock;
  int ret = getsockname(fd, &sock.sa, &addrlen);
  if (ret) { /* handle error */ }
  switch (sock.ss.ss_family) {             // valid access
  case AF_UNIX: {
    // do something with sock.un
    return;
  }
  case AF_INET: {
    // do something with sock.in
    return;
  }
  }
}

Such code is more robust, because we clearly announce that the object may have different interpretations, because it doesn’t depend on the storage class, and because we can’t forget to free the object at the end.


10 Comments

  1. Are you guaranteed that fields will line up in a union? In particular, in the case of uint32_t and float , could anything cause the uint32_t to be aligned differently? All this assumes that they are actually the same size. What if you used a uint8_t[4] array instead?

    Comment by Kyle Hayes — November 8, 2016 @ 00:30

    • Yes, the first field of a struct or union is always aligned with zero padding at the start. A union that would have placed its members with different alignment would not serve much purpose, I think.

      Comment by Jens Gustedt — November 25, 2016 @ 13:12

  2. Hi. You’re saying here that in the doit function in your example, fp and up can’t alias the same object. This twitter thread suggests they can if the object is dynamically allocated: https://twitter.com/spun_off/status/802673731218776064 What do you think about it? Can you prove @spun_off wrong?

    Comment by Petr Skocik (@pskocik) — November 27, 2016 @ 09:06

    • Glad you asked, Petr. There is a subtle difference between the two pieces of code. For the one in doit, calling it with the same object has no defined behavior. This is because then the assignment through *fp changes the effective type to float. The evaluation for printf with *up then reads that float object with an lvalue that has integer type. This is undefined. Spun_off’s example is different. If you look at it carefully you’ll see that the float is only read byte-wise through memcpy, which is allowed. Still the effective type of l2 doesn’t change since it is an auto object. Provided that int has no trap representation (which we can assume for the sake of the example), the read of l2 is then valid and gives us whatever the bit pattern means when interpreted as an int.

      Comment by Jens Gustedt — November 27, 2016 @ 11:51

      • Originally I thought what you were saying was that it was the function signature alone that implied there could be no aliasing. This clarification makes it clearer because I can actually match it up with the standard (unlike the signature hypothesis). I believe I understand it now (better than gcc authors, I guess :D). Thanks for explaining.

        Comment by Petr Skocik (@pskocik) — November 27, 2016 @ 13:04

        • Yes, unfortunately you cannot deduce such things from the signature. Remember that even a restrict qualification is not part of the signature, either. As for the gcc developers, don’t pick on them too much, such subtle differences are difficult to describe and even more difficult to test. Basically, I think that anybody who puts such crap in real code doesn’t deserve better.

          Comment by Jens Gustedt — November 27, 2016 @ 13:18

  3. I find it implausible that the authors of the C89 Standard intended that the CIS rule, as exploited by the socket-address code, should only apply when accessing actual union objects.

    The authors of the C Standard openly acknowledge that it is not intended to describe everything a quality implementation must do to be suitable for any particular purpose. Indeed, it doesn’t even describe everything an implementation must do to be suitable for any purpose whatsoever. Instead, the authors figure that if an implementation can run a program which nominally tests resource limits in a manner consistent with the Standard, it will likely be able to process other programs in useful fashion as well. How many useful programs it can process would be a quality-of-implementation, rather than conformance, issue.

    Given that, and along with the facts that:

    (1) It would have been essentially impossible for compilers to honor the Common Initial Sequence guarantee for unions without also guaranteeing it for structure pointers, and

    (2) according to the 1974 C Reference Manual the Common Initial Sequence guarantee applied to CIS member accesses through structure pointers even before unions were added to the language

    (3) struct-pointer CIS access can be used to do useful things that would be impractical with unions

    I would suggest that the CIS rule was almost certainly intended to apply to accesses made via structure pointers.

    Optimizations which assume a programmer won’t do X will generally be counter-productive in cases where a programmer needs behavior equivalent to X. In most cases where code would need to write a CIS member via type S* and read it via type T*, it would not be difficult to ensure that a complete union type declaration containing S and T is visible at either the point of the write or the read, but in many cases code doing the reading will have no reason to know or care about the specifics of all the struct types it might be dealing with. Declaring an object

    struct {int size; short dat[7];} obj = {7,{1,2,3,4,5,6,7}};
    

    and then being able to pass it to a function that can handle arbitrary types of that style seems much more practical than requiring that the compilation unit containing that function define named structure types for every possible size of object, and requiring that callers must use those particular structure types to define them. It is also much more practical than requiring that a function use memcpy on the received pointer (which would, of course, require a compiler to allow for aliasing of all objects, and not just structures that share a Common Initial Sequence).

    Assuming the authors were honest about their intentions, the purpose of aliasing rules was to avoid forcing implementations to do genuinely-useless work. I see no evidence that they intended the rules to compel programmers to do extra useless work which would in turn compel compilers to do additional useless work. The fact that the Standard may allow implementations to impose such burdensome requirements does not imply that any quality implementations should.

    Comment by John Payson — May 25, 2017 @ 07:09

    • Thank you for your comment, but I am not sure that I understand where you want to go with this. As far as I remember, POSIX socket definitions don’t refer to a CIS rule, but just require that the field is located at the same offset. Then, it is not very productive to talk about initial intentions (’74, ’89, ’99, ’11 ?) of anybody here. What I was trying to explain is what guarantees (or not) the C standard gives, and what the current model is. What I described here is what I think would be consensus in the C committee as I know it. There are a lot of more complicated border cases, where there are perhaps as many opinions as there are people on the committee.

      I don’t share your negative view of these rules; these are not “useless” things that are imposed on you as a programmer. The interplay between different units that are perhaps compiled by different compilers (or with different options) is crucial nowadays, e.g. in the presence of LTO or other sophisticated optimizations. So you had better know the rules that you may assume.

      And then, designing e.g. network code that uses a union of different socket structs is not very difficult, you just have to get a bit used to it. It has the other great advantage that it helps you avoid C’s greatest evil, casts.

      Comment by Jens Gustedt — May 29, 2017 @ 18:41

      • It makes sense to have aliasing rules that don’t require a compiler to bend over backward in situations where a compiler would–even in the absence of such rules–have no particular reason to expect aliasing. Rules become counter-productive, however, when they force programmers to jump through hoops to do things that would otherwise be easy.

        Many behaviors could be cheaply guaranteed by some implementations, but would be expensive for others. Behaviors may also be practically essential in some application fields, but useless in others. The authors of the Standard sought to determine which behaviors will have sufficient value in all fields–even those where they would be the least useful–to justify mandating them on all implementations–even those where they would be most expensive. They did not attempt to define all possible behaviors which would be cheap to guarantee and would obviously be useful. The focus only on the minimally-justifiable cases would be just fine if the Standard were recognized as being merely a baseline that even the weakest implementations must meet, and not as defining a new set of “rules” which forbid programs from exploiting behaviors that are useful and cheap given their target field and platform, and which implementations for that field and platform had previously supported.

        If the authors of the Standard had been seeking to impose new requirements on programmers, they should have made it practical to adapt existing code to conform to such requirements. Most situations where programs would need to use aliasing match one of two patterns:

        (1) A region of storage needs to have its effective type changed to something else within a particular context, but revert to its earlier type when control leaves that context.

        (2) A region of storage will need to have its effective type changed indefinitely (until the next time it is explicitly changed) or erased.

        If the Standard had defined practical means for achieving those semantics, then code which attempts type punning via other means could reasonably be regarded as obsolescent. I would only regard an approach as “practical”, however, if it satisfies the following criteria:

        (1) Code using the approved means should be usable on any existing compilers that support the equivalent of “-fno-strict-aliasing”, via the use of suitable #define directives [existing compilers may not understand new directives, but when configured for the equivalent of “-fno-strict-aliasing” they could safely ignore them].

        (2) Code using the approved means should be essentially as legible as code which would rely upon “-fno-strict-aliasing”.

        (3) The quality of machine code generated by existing or future compilers from programs using the approved means should be at least as good as what would be generated for code written to exploit “-fno-strict-aliasing”.

        Do you think any of my requirements for something to be considered “practical” are unreasonable? If the Standard doesn’t define any practical means of doing something, but some implementations do support it, and code that relies upon such support follows recognizable patterns, would it be more helpful for implementations to recognize and support those patterns, or compel programmers to jump through hoops to work around the lack of practical support?

        Comment by John Payson — June 4, 2017 @ 20:50

        • I am still not sure where you want to go with this, and I have the impression that your perception of the relevance of aliasing in modern compilers does not match reality. Modern compilers are very good at optimizing access to individual fields of structs, for example, and changing the effective type of a struct can easily subvert this. Having a region of storage change effective type just for some scope/context and fall back to a previous type afterwards is not practical, I think. Basically as soon as you have taken the address of an object, and passed it into a function that is implemented in a different translation unit, you can’t make much of an assumption about the type consistency of the object, unless you have a set of very strict rules that forbid the reinterpretation with a different type.
          And the standard tool to introduce directives is not #define but #pragma or _Pragma. But also I don’t really understand how the directive you are talking about would work other than just implicitly placing restrict qualifications on each pointer argument. So I don’t really see much gain in it. In any case, if you are really convinced of your idea, you should write up a paper about this. Comments to a blog that is just meant to explain the existing rules are not the appropriate means to move such ideas further.

          Comment by Jens Gustedt — June 5, 2017 @ 06:43

