Jens Gustedt's Blog

February 21, 2012

try, throw and catch clauses with P99

Filed under: C11, C99, language, P99, syntax — Jens Gustedt @ 21:59

Before C11, implementing try/throw/catch clauses in C alone was difficult. C lacked three features for that. First, there was no clear concept of threads. A throw that wants to unwind the stack to a calling function has to find the nearest try clause on the stack by inspecting a global variable. If you are running in a threaded environment (as most people do these days), this global variable would have to hold the state of all current try clauses of all threads. With the new feature of _Thread_local variables (or their emulation through P99_DECLARE_THREAD_LOCAL in P99) this is now easy to solve. We just need a thread local variable that implements a stack of states.
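
A minimal sketch of such a per-thread stack of states, assuming that each try clause records a jmp_buf to jump back to. The names try_frame, try_push, try_pop and try_depth are made up for illustration; they are not P99's actual data structures:

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical names, for illustration only (not P99's internals). */
typedef struct try_frame {
    jmp_buf env;              /* where a throw would jump to */
    struct try_frame *outer;  /* enclosing try clause, if any */
} try_frame;

/* Each thread sees its own copy of this pointer, so the try clauses
   of different threads never get mixed up. */
static _Thread_local try_frame *try_top = NULL;

/* Enter a try clause: the new frame becomes the innermost one. */
void try_push(try_frame *f) {
    f->outer = try_top;
    try_top = f;
}

/* Leave the innermost try clause. */
void try_pop(void) {
    if (try_top != NULL) try_top = try_top->outer;
}

/* Number of try clauses currently active in the calling thread. */
size_t try_depth(void) {
    size_t n = 0;
    for (try_frame *f = try_top; f != NULL; f = f->outer) ++n;
    return n;
}
```

The frames themselves can live on the call stack of the function that opens the try clause, so no allocation is needed.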

The second feature that is useful for such an implementation is atomic operations. Though we can now implement a thread local variable for the stack of states, we still have to be careful that updates of that variable are not interrupted in the middle by signal handlers or the like. This can be handled with atomic operations; P99 also emulates the _Atomic feature on common platforms.

The third ingredient is _Noreturn. In C11 we can specify that a certain function will never return to the caller. This enables the compiler to emit more efficient code if a branch of execution ends with a call to such a noreturn function.
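
As a small illustration of _Noreturn (the functions die and checked_div are hypothetical examples, not part of P99): the compiler knows that the error branch never comes back, so it need not generate a path from there to the return statement.

```c
#include <stdio.h>
#include <stdlib.h>

/* Never returns to its caller: it terminates the program. */
_Noreturn void die(const char *msg) {
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

int checked_div(int a, int b) {
    if (b == 0) die("division by zero");  /* this branch ends here */
    return a / b;
}
```

A function that throws an exception is in the same situation: control leaves through a jump and never returns to the point of the call.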

These three interfaces, together with other features that were already present in P99, made it straightforward to implement P99_TRY, P99_THROW, P99_CATCH and P99_FINALLY.

But let’s just start with a warning. Such a thing in C should not implement the catching of typed exception variables. C has no dynamic type system and implementing such a thing would just blow it up. C error codes are int: library functions encode errors through int return values, by setting errno, also an int, or by passing an int to longjmp. The feature implemented here adds a fourth way to return errors, so again it should stick to the existing convention and propagate an int. In fact, the mechanism that is used under the hood is setjmp/longjmp, so there is also an implementation-imposed reason to use int just as everybody else.

The simplest use of this feature is together with P99_FINALLY:

 unsigned char*volatile buffer = 0;
 P99_TRY {
   buffer = malloc(bignumber);
   if (!buffer) P99_THROW(thrd_nomem);
   // do something complicated with buffer
   favorite_func(buffer);
 } P99_FINALLY {
   free(buffer);
 }

This will make sure that the memory allocated in buffer will always be freed, regardless of what error conditions the code meets. In particular this works even if an exception is thrown from below the call to favorite_func. If no exception occurs, the P99_FINALLY clause is executed anyhow, and execution then continues after the clause, just as for normal code. If an exception occurs, the clause is executed as well, so the call to free is issued in this case, too. But afterwards execution does not continue as normal; it jumps to the next P99_FINALLY or P99_CATCH block on the call stack.
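
To make this control flow concrete, here is a rough emulation of the "finally" behavior with bare setjmp/longjmp. It is only a sketch under assumptions: frame, top, pending, throw_int, use_buffer and run are hypothetical names, ENOMEM stands in for thrd_nomem, and none of this is claimed to be how the P99 macros are actually written.

```c
#include <errno.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical bookkeeping, not P99's implementation. */
typedef struct frame { jmp_buf env; struct frame *outer; } frame;

static _Thread_local frame *top = NULL;  /* innermost active try clause */
static _Thread_local int pending = 0;    /* code of the exception in flight */
int cleanups = 0;                        /* counts runs of the "finally" part */

_Noreturn void throw_int(int code) {
    frame *f = top;
    if (f == NULL) abort();              /* no try clause on the call stack */
    top = f->outer;                      /* we leave the clause we jump to */
    pending = code;
    longjmp(f->env, 1);
}

void favorite_func(unsigned char *buf, int fail) {
    buf[0] = 1;
    if (fail) throw_int(EINVAL);
}

void use_buffer(size_t n, int fail) {
    unsigned char *volatile buffer = NULL;  /* changed after setjmp */
    int code = 0;
    frame f;
    f.outer = top;
    top = &f;
    if (setjmp(f.env) == 0) {               /* "try" */
        buffer = malloc(n);
        if (buffer == NULL) throw_int(ENOMEM);
        favorite_func(buffer, fail);
        top = f.outer;                      /* normal exit from the clause */
    } else {
        code = pending;                     /* we arrived through a throw */
    }
    free(buffer);                           /* "finally": runs on both paths */
    ++cleanups;
    if (code != 0) throw_int(code);         /* continue unwinding */
}

/* Outermost handler: returns 0, or the code that escaped use_buffer. */
int run(int fail) {
    frame f;
    f.outer = top;
    top = &f;
    if (setjmp(f.env) != 0) return pending;
    use_buffer(16, fail);
    top = f.outer;
    return 0;
}
```

Calling run(0) frees the buffer and returns 0; calling run(1) frees the buffer as well and then hands EINVAL on to the outer handler. The code travels in a separate variable because the standard only allows the return value of setjmp to be used in a few restricted contexts, such as the comparison above.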

In the code snippet above the volatile qualification of the variable buffer is important. buffer changes inside the try-block. If we didn’t declare it volatile, we might only see the value it had when we entered the try-block. The jump to the finally-block is outside the normal control flow, so it might take effect before a pending store instruction has been executed, or the finally-block might just use a cached value of the variable. The volatile qualifier inhibits such optimizations and ensures that we always see the value of buffer as it was last stored.

In this example, we could avoid the volatile qualification if we don’t change the variable inside the P99_TRY block:

 unsigned char*const buffer = malloc(bignumber);
 P99_TRY {
   if (!buffer) P99_THROW(thrd_nomem);
   // do something complicated with buffer
   favorite_func(buffer);
 } P99_FINALLY {
   free(buffer);
 }

If buffer is used a lot, this helps the optimizer to generate better code.
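
The rule behind the volatile qualification can be seen with plain setjmp/longjmp, the mechanism used under the hood. In the following sketch (bail and steps_before_jump are hypothetical names) a local variable is modified after setjmp and read after longjmp; that read is only well defined because the variable is volatile qualified.

```c
#include <setjmp.h>

static jmp_buf env;

_Noreturn void bail(void) {
    longjmp(env, 1);
}

int steps_before_jump(void) {
    /* Modified between setjmp and longjmp, read afterwards: without
       volatile the value read after the jump would be indeterminate. */
    volatile int steps = 0;
    if (setjmp(env) == 0) {
        steps = 1;
        steps = 2;
        bail();          /* jumps back to the setjmp above */
    }
    return steps;        /* 2, the last value stored */
}
```

A variable that is only set before the setjmp and never changed afterwards, like the const-qualified buffer above, does not need the qualifier.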

An alternative to P99_FINALLY is to use P99_CATCH and to handle different exceptions explicitly:

 unsigned char*volatile buffer = 0;
 P99_TRY {
   buffer = malloc(bignumber);
   if (!buffer) P99_THROW(thrd_nomem);
   // do something complicated with buffer
 } P99_CATCH(int code) {
   switch(code) {
     case thrd_nomem: perror("we had an allocation error"); break;
     case thrd_timedout: perror("we were timed out"); break;
   }
   free(buffer);
   if (code) P99_RETHROW;
 }

The difference here is that we receive the error code through the variable code and can thus take different actions for different exceptional conditions. If it weren't for the P99_RETHROW, the unwinding of the call stack would stop at this point and execution would continue after the catch block. The exception would be considered to be caught.

Here, since there is a P99_RETHROW, execution will jump to the next P99_FINALLY or P99_CATCH block on the call stack. In fact a catch clause of

 P99_CATCH(int code) {
   // do something here and then
   if (code) P99_RETHROW;
 }

would be equivalent to

 P99_FINALLY {
   // do something here
 }

except that this wouldn't give access to code.

Note that the code depending on P99_TRY must always be an entire block with { } surrounding it. The code depending on P99_FINALLY or P99_CATCH doesn't have that restriction; it could just be a single statement.

The definition of the code variable can be omitted. This can be used to catch any exception and to continue execution after the catch clause in any case:

 P99_CATCH() {
  // do some cleanup
 }
 // continue here regardless of what happened

There is also a "catch all" dialect of all this:

 P99_TRY {
   // do something complicated that may fail
 } P99_CATCH();

The ; after the catch is just an empty statement. So this catch clause catches all exceptions, forgets the exception code and does nothing.

The raising of an exception is also simple enough. Just use P99_THROW(X) with some error code X. This stops execution at the current point and signals an exception of value X to the next P99_TRY clause that is located on the call stack, if any. If there is no such try clause on the call stack, abort is called.

X should be an integer value that by the usual conversions fits into an int. It must be non-zero; otherwise the arbitrary value 1 is transferred instead, as for setjmp/longjmp.
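
The replacement of 0 by 1 is the behavior of longjmp itself: setjmp returns 0 only for the direct call, so a longjmp with value 0 makes it return 1. A small sketch that shows this with the standard functions only (landed_as_one is a hypothetical name):

```c
#include <setjmp.h>

static jmp_buf env;

/* Jumps back to setjmp with the value `thrown` and reports whether
   setjmp then returned 1. */
int landed_as_one(int thrown) {
    switch (setjmp(env)) {
    case 0:  longjmp(env, thrown);  /* direct call returned 0: jump back */
    case 1:  return 1;              /* the jump arrived as 1 */
    default: return 0;              /* the jump arrived as some other value */
    }
}
```

Here landed_as_one(0) and landed_as_one(1) both return 1, so a handler cannot tell the two apart, whereas landed_as_one(42) returns 0 because the value 42 arrives unchanged.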

A good convention for the values to throw is to use system-wide error numbers such as thrd_nomem, ERANGE, EINVAL etc. But any other convention that fits the needs of an application can be used. E.g. if you are using the POSIX regular expression library function regcomp, a good idea would be to transfer the error codes that this call gives as return values.
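
A sketch of that last idea with plain setjmp/longjmp standing in for the P99 macros. The names compile_or_throw and pattern_error are hypothetical; the only library facts relied upon are that regcomp returns 0 on success and a non-zero REG_* code on failure.

```c
#include <regex.h>
#include <setjmp.h>

static jmp_buf on_error;
static int thrown;   /* the error code travels beside the jump */

/* Compiles the pattern, or "throws" the code that regcomp returned. */
void compile_or_throw(regex_t *re, const char *pattern) {
    int code = regcomp(re, pattern, REG_EXTENDED | REG_NOSUB);
    if (code != 0) {
        thrown = code;
        longjmp(on_error, 1);
    }
}

/* Returns 0 if the pattern compiles, else the regcomp error code. */
int pattern_error(const char *pattern) {
    regex_t re;
    if (setjmp(on_error) != 0) return thrown;  /* the "catch" part */
    compile_or_throw(&re, pattern);
    regfree(&re);                              /* only after success */
    return 0;
}
```

A handler that receives such a code can pass it to regerror to obtain a readable message.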


9 Comments

  1. unsigned char*volatile buffer = 0;
    P99_TRY {
      buffer = malloc(bignumber);
      if (!buffer) P99_THROW(thrd_nomem);
      // do something complicated with buffer
    } P99_CATCH(int code) {
      switch(code) {
        case thrd_nomem: perror("we had an allocation error"); break;
        case thrd_timedout: perror("we were timed out"); break;
      }
      free(buffer);
      if (code) P99_RETHROW;
    }
    

    Accessing buffer variable inside P99_CATCH block (after the longjmp call) invokes undefined behavior
    “After a longjmp, there is an attempt to access the value of an object of automatic
    storage duration that does not have volatile-qualified type, local to the function
    containing the invocation of the corresponding setjmp macro, that was changed
    between the setjmp invocation and longjmp call (7.13.2.1).”

    Comment by programmingkills — April 26, 2012 @ 17:38

    • I don’t really understand what you are trying to say.

      The variable of automatic storage duration here is buffer, and that variable is
      volatile qualified.

      Jens

      Comment by Jens Gustedt — April 26, 2012 @ 18:05

        • You’re right, I just didn’t see the declaration.
        Sorry for the inconvenience.

        Comment by programmingkills — April 26, 2012 @ 18:10

        • Never mind, you raised a good point. I should have mentioned that in the post. Will add a para on volatile somewhere.
          Jens

          Comment by Jens Gustedt — April 26, 2012 @ 18:19

  2. Jens, what’s the point in introducing try/throw/catch clauses to C ? Why not use C++ instead, if you want them?

    Comment by Ivan Pechorin — June 24, 2012 @ 15:13

    • Hello,
      frankly what is the point in your question ?)

      Whenever I want to use a feature X that is similar to something of language Y, you’d tell me to use language Y?
      E.g if I’d want to use lists, you’d tell me to use lisp instead of C?

      More seriously, this is not try and catch from C++:

      • There is no dynamic type system involved.
      • No destructors.
      • This throws int error codes as is the custom in C.

      You can use this to implement something similar to destructors, by doing e.g a free inside the P99_CATCH block. But this doesn’t impose performance penalties on code that doesn’t use this mechanism or isn’t even aware of it.

      Jens

      Comment by Jens Gustedt — June 24, 2012 @ 16:22

      • My question was just out of curiosity: sometimes I feel it is important to stop for a moment and think why I use this instrument, or that instrument…

        Using P99_FINALLY to call free() looks fine, but it requires volatile. It’s not clear if the performance penalty of volatile actually matters here (comparing to the cost of free()). Especially given that we use any decent allocator with thread-specific pools that doesn’t need any locks in malloc/free except for huge blocks of memory. So, it’s not that obvious that the argument “this doesn’t impose performance penalties” is valid.

        Error handling with P99_CATCH (and with C++ try/catch) is more complex and error-prone than standard C style of handling errors. See for instance recent post by author of ZeroMQ “Why should I have written ZeroMQ in C, not C++”: http://www.250bpm.com/blog:4

        Comment by Ivan Pechorin — June 24, 2012 @ 17:10

        • Interesting post, but I don’t think that the criticism applies to a simple setjmp/longjmp based error handling. Adding a new type of “exception” just adds another error code and usually another switch case.

          And for your objection on efficiency and volatile. If you happen to declare and allocate everything before the P99_TRY, you don’t even have to declare it volatile. Something like

          char *const buffer = malloc(bignumber);
          
          P99_TRY {
            ...
          } P99_FINALLY {
            free(buffer);
          }
          

          will always work. You only need the volatile for things that you change inside the P99_TRY block. I’ll add that to the post to make that clearer.

          Comment by Jens Gustedt — June 24, 2012 @ 17:30

  3. In conjunction with your wrappers on https://gustedt.wordpress.com/2012/07/15/capture-return-codes-from-library-functions/
    I want to refer you to this proposal for exceptions https://www.lyngvig.org/%2FTeknik%2FA-Proposal-for-Exception-Handling-in-C, which is that the carry flag be set on return to signify that an exception is returned (in AX) instead of the normal return value.

    Of course such a method requires compiler support but the advantage over setjmp/longjmp is that no new bugs are introduced by forgetting to set volatile, or performance hits through doing it.

    It is relevant to your work at https://gustedt.wordpress.com/2012/07/15/capture-return-codes-from-library-functions/ because you already wrap functions to transparently check the responses; so although the carry flag would not be checked, you have a mechanism to propagate exceptions by a normal flow return so that register variables will be restored and valid.

    Other ways to signify an exception are that after the function return errno is non-zero AND not what it was before the function was called; also easily checkable in the wrapper macros you produce.

    The remaining mechanism to be devised would be this: is there a way to implement in-function exceptions without using setjmp/longjmp? – using just hidden loops and goto?

    Comment by Sam Liddicott — October 30, 2013 @ 15:09

