C source-to-source compiler enhancement from within

A new research report describing shnell has now appeared on HAL:

https://hal.inria.fr/hal-02998412

We show how locally replaceable code snippets can be used to easily specify and prototype compiler and language enhancements for the C language that work by local source-to-source transformation. A toolbox implements the feature and provides many directives that can be used for compile time configuration and tuning, code unrolling, compile time expression evaluation and program modularization. The tool is also easily extensible by simple filters that can be programmed with any suitable text processing framework.

Advertisement

Modern C, Second Edition


A new edition of the book Modern C and much more are now available under a CC license via the following page

https://gustedt.gitlabpages.inria.fr/modern-c/

 This edition is the result of a collaboration with Manning and improves a lot over the previous edition: material has been rewritten and reordered, and a lot of graphics have been added. Manning has nicely formatted print and eBook versions of it; the central link above has links and information on how to purchase them, even with a limited 50% price reduction. Or a click on the cover may directly take you to Manning.

 

Spurious failures of thread functions

C11’s thread interface was not very clear on the failure conditions that some functions might encounter. It was not clear that the wait functions for condition variables (cnd_t) and tentative locking of mutexes (mtx_t) may fail spuriously, that is, with no apparent reason for the caller. For lack of such a specification, it was not clear how C11 threads could be realized on top of, e.g., POSIX threads.

Allowing spurious wakeups is particularly important for the wait functions, because it makes implementing the cnd_t type much easier, in particular for the special case that the caller of cnd_signal or cnd_broadcast does not hold the lock on the corresponding mutex. On the other hand, from an application point of view this does not change much. Even without spurious wakeups, a thread that called, e.g., `cnd_wait` must in any case check the real condition it is interested in.
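The usual way to cope with spurious wakeups is to re-test the real condition in a loop around `cnd_wait`. The following is only a minimal sketch of that pattern; the `gate` structure and the `gate_*` functions are hypothetical names invented for this illustration, and error handling for the lock and wait calls is omitted.

```c
#include <stdbool.h>
#include <threads.h>

/* Hypothetical shared state: a flag protected by a mutex. */
struct gate {
    mtx_t mtx;
    cnd_t cnd;
    bool open;
};

/* Initialize the gate in the closed state. */
int gate_init(struct gate *g) {
    g->open = false;
    if (mtx_init(&g->mtx, mtx_plain) != thrd_success) return thrd_error;
    if (cnd_init(&g->cnd) != thrd_success) {
        mtx_destroy(&g->mtx);
        return thrd_error;
    }
    return thrd_success;
}

/* Open the gate and wake up all waiters. */
void gate_open(struct gate *g) {
    mtx_lock(&g->mtx);
    g->open = true;
    mtx_unlock(&g->mtx);
    /* The mutex is deliberately not held while broadcasting. */
    cnd_broadcast(&g->cnd);
}

/* Block until the gate is open. */
void gate_wait(struct gate *g) {
    mtx_lock(&g->mtx);
    /* A return from cnd_wait proves nothing about the flag:
       the wakeup may be spurious, so test the condition again. */
    while (!g->open) {
        cnd_wait(&g->cnd, &g->mtx);
    }
    mtx_unlock(&g->mtx);
}
```

Because the loop tests `g->open` before the first wait, a thread that arrives after `gate_open` never blocks at all.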

Continue reading “Spurious failures of thread functions”

compare exchange compares memory and not value

C11, in 7.17.7.4, introduces the atomic_compare_exchange generic functions. These are precious tools when using atomics: they allow us to conditionally store new data in an atomic variable and, if need be, to retrieve its previous value. You can see that as a generalization of atomic_fetch_add, where we are also able to retrieve a counter value and change it at the same time.
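As an illustration of the conditional store, here is a small sketch of a compare-exchange loop on an integer. The helper name `atomic_max` is an assumption made for this example and is not part of the C library.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically raise *target to at least value.
   Returns the content of *target that was last observed. */
int atomic_max(_Atomic int *target, int value) {
    int seen = atomic_load(target);
    /* If the exchange fails, the current content of *target is
       written into seen, so the comparison is repeated with
       fresh data on each round. */
    while (seen < value
           && !atomic_compare_exchange_weak(target, &seen, value)) {
        /* empty: seen has been refreshed by the failed exchange */
    }
    return seen;
}
```

The weak variant may fail spuriously, which is harmless here because the loop simply retries.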

C11 stated that the value would be taken into account for the conditional part, that is that the existing value would be compared to a desired value. This works well for arithmetic types, where value and object representation are mostly the same. It works less well if the atomic type is a structure because struct types simply have no equality operator.

Continue reading “compare exchange compares memory and not value”

C17: reformulations for atomic_flag

In C11, there was a problem with the fact that atomic_flag is one of the rare types that is not considered to have a value, but it only has state (clear and set) which is changed via function calls. (mtx_t and cnd_t are other examples of such types.) There was no established relation between these states and the return value of the atomic_flag_test_and_set functions (which is bool).

C17 clears that up by prescribing the return values of atomic_flag_test_and_set.
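With the return value tied to the previous state, a flag can be used to detect who came first. The following sketch assumes a hypothetical function `claim_first` that reports whether the caller was the one to set the flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* The flag starts in the clear state. */
static atomic_flag claimed = ATOMIC_FLAG_INIT;

/* Hypothetical helper: returns true only for the first caller.
   atomic_flag_test_and_set returns true if the flag was already
   set before the call, so the result is negated. */
bool claim_first(void) {
    return !atomic_flag_test_and_set(&claimed);
}
```

The first call finds the flag clear and sets it; every later call finds it set.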

Continue reading “C17: reformulations for atomic_flag”

C17: rvalues and function types drop qualifiers

At least since C99 the C standard has:

The properties associated with qualified types are meaningful only for expressions that are lvalues.

This phrase expresses the intent that type qualifiers are to be ignored in “some sense” if they appear in other contexts, namely for the values of expressions, i.e. rvalues.

Until the invention of _Generic in C11, nobody really cared what happened to qualifiers of expressions: only the value and size of an expression, not its type itself, were observable from inside a C program.

This changed with C11: _Generic makes types observable. Unfortunately, the formulation of the _Generic feature itself did not make it clear whether it could also be used to detect qualifiers. C11 leaves the following questions unresolved:

    1. Are there contexts where qualified rvalues can pop up?
    2. Can _Generic distinguish a qualified argument from an unqualified one?

In particular, the lack of clarification of the second question led to divergence in the implementation of the _Generic feature in mainline compilers.

In our work for C17, we tackled these questions in two “defect reports”, DR 423 (for 1.) and DR 481 (for 2.).

DR 423

For the first, there are two contexts that seemed to allow values of expressions to have qualifiers: casts and function returns. For both it was decided that the intent of the standard had never been to have qualifications here, and that the standard should be fixed to clearly state that intent. Therefore DR 423 proposes two modifications. For casts, it changes the first phrase of 6.5.4 p5 (the change is the addition of “the unqualified version of”)

Preceding an expression by a parenthesized type name converts the value of the expression to the unqualified version of the named type.

For function returns, the situation is a bit more complicated, because the current syntax allows return types of functions to be qualified. E.g.

double const foo(void) {
    return 0;
}

is a valid definition of a function. Whenever it is called, the return value is an rvalue, so is this value a double or a double const?

The answer that was found is to treat the above function as if the const were not present. That is, the return type of the function is adjusted to the unqualified type. The proposed change for 6.7.6.3 p5 is

If, in the declaration “T D1”, D1 has the form
D ( parameter-type-list )
or
D ( identifier-list opt )
and the type specified for ident in the declaration “T D” is “derived-declarator-type-list T”, then the type specified for ident is “derived-declarator-type-list function returning the unqualified version of T”.

Beware that this has more effects than just changing the qualification of value expressions. Compilers that will claim to be compliant to C17 must accept the following:

double (*foop)(void) = foo;
double const (*foopc)(void) = foop;

That is, any function type or type derived from it behaves as if the qualifier were not present. Most current compilers will probably see these as constraint violations. Also, since types that may have been considered different are now clearly specified to be the same, the following expression is a constraint violation:

_Generic(X,
    double (*)(void): 1,
    double const (*)(void): 2
)

because a generic expression is not supposed to have two entries with compatible types. Most current compilers probably would accept that construct, so they’d have to be fixed.

DR 481

For _Generic it was decided that the intent of the construct was to implement some kind of function overloading for C and that it was meant to have a behavior that is the closest possible to that intended feature. So it was decided to require that the type of a controlling expression is seen as if it had undergone the same conversions as arguments to function calls. The proposed text for 6.5.1.1 reads

The type of the controlling expression is the type of the expression as if it had undergone an lvalue conversion (new), array to pointer conversion, or function to pointer conversion. That type shall be compatible with at most one of the types named in the generic association list.

(new) lvalue conversion drops type qualifiers.

As a consequence the following code is still valid

_Generic(X,
    double: 1,
    double const: 2
)

only that the second association can never trigger. Therefore it would be reasonable for compilers to give a warning diagnostic for such a generic expression.

This does not mean, though, that qualifiers are not observable at all. You just have to put a little bit more effort into it:

_Generic(&X,
    double*: 1,
    double const*: 2
)

This should still work, and either association may trigger, depending on whether or not the lvalue that you pass is const qualified.
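To make the difference between the two forms tangible, the following sketch wraps them in macros. The macro names are hypothetical, and the expected results assume a compiler that implements the resolution of DR 481.

```c
/* Hypothetical macro: 1 if the lvalue X has type double const,
   0 if it has type double. Taking the address keeps the qualifier
   visible in the pointed-to type. */
#define IS_CONST_DOUBLE(X) _Generic(&(X),  \
    double*: 0,                            \
    double const*: 1)

/* Hypothetical macro: 1 if the value of X has type double.
   The controlling expression undergoes lvalue conversion, so a
   const qualifier on X is dropped before the selection. */
#define IS_DOUBLE_VALUE(X) _Generic((X),   \
    double: 1,                             \
    default: 0)
```

For `double a` and `double const b`, `IS_CONST_DOUBLE` tells the two apart, whereas `IS_DOUBLE_VALUE` selects the `double` association for both.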

C17

C17 is a “bugfix release” of the C standard. Whereas the intention of the C working group (WG14) has been that this release does not introduce normative changes (but one), it brings a lot of clarifications all over the place. By adopting this version, some features as implemented by some compilers may change if their interpretation of C11 was different because of an unfortunate ambiguity.

C17 will be superseded by C2x, for which the process of inclusion began at the October 2018 meeting of WG14. In particular, a working draft of C2x is now available that is still pretty close to C17:

C2x working draft, post October 2018 meeting

The schedule for C17 was as follows:

  • Nov. 2017, adoption by WG14, subject to some minor, editorial changes
  • Dec. 2017, integration of these changes and approval by an editorial committee
  • Jan. to Mar. 2018, editorial back and forth with ISO, more editorial changes due to new requirements by ISO and their strict enforcement
  • Apr 2018, ISO sends out the FDIS (final draft international standard) to the national bodies.
  • Apr to June 2018, ballot
  • June or July 2018, publication, see C17

I identified the following list of changes in C17 compared to C11. The whole process of clarifications that have been integrated is transparently documented in what we called “defect reports”. So if you urgently need to know about some of these, you should look them up there.

My intention is to write a post on most of the items to explain my POV of what happened.  In particular, I will try to cite the new versions of the changed text for reference. Because of copyright issues, I will only be able to do that once C17 has been published by ISO. So please be patient and stay tuned.

NB: comments are switched off for this post. Please communicate errors or imprecisions that you spotted to me directly. If on the other hand you want to discuss the future (or not) of C, there are a lot of places out there. The best that I know of is WG14 itself. So if you really care, please sign in on your committee of your national standards body for programming languages or alike, and invest yourself in the process.

cross-language interfaces between C and C++

As you know, C and C++ are sister languages that have a lot in common, but that drifted quite apart over the years. In general, code of one language cannot be compiled as the other language; there are too many major and minor twists that make this impossible. Not only are there syntax differences between the two, some common syntax can actually have diverging semantics. So generally, it makes no sense to compile C code with a C++ compiler, and you should look with suspicion at any code or programmer that claims to do so.

Where C and C++ usually agree, though, is on the ABI, the application binary interface, so data structures and functions of one language can be used by the other to some extent. C and C++ also kept a sufficiently wide intersection in their respective specifications of interfaces, such that one header file can be used from both.
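A common idiom for such a shared header is to wrap the declarations in a linkage specification that only a C++ compiler sees. The sketch below shows that idiom under assumed names (`toolbox_vec`, `toolbox_dot`); it is not necessarily the coding style that the full post recommends.

```c
/* toolbox.h: hypothetical header usable from C and from C++ */
#ifndef TOOLBOX_H
#define TOOLBOX_H 1

#ifdef __cplusplus
extern "C" {   /* C++ only: request C linkage for these names */
#endif

/* A plain structure with the same layout in both languages. */
struct toolbox_vec {
    double x;
    double y;
};

/* Declaration in the common subset of C and C++. */
double toolbox_dot(struct toolbox_vec const *a,
                   struct toolbox_vec const *b);

#ifdef __cplusplus
}
#endif

#endif /* TOOLBOX_H */

/* toolbox.c: the implementation, compiled as C */
double toolbox_dot(struct toolbox_vec const *a,
                   struct toolbox_vec const *b) {
    return a->x * b->x + a->y * b->y;
}
```

The implementation stays in a C translation unit; a C++ caller only includes the header and links against the compiled object.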

In this post I try to collect those parts that are in that intersection, and I propose some coding style that should accommodate both worlds suitably well. But as my personal history goes, this will merely be the POV of a C programmer who wants to provide interfaces for C++.

Continue reading “cross-language interfaces between C and C++”

gcc doesn’t inline position independent code properly

When compiling position independent code, PIC, the gcc compiler fails to inline many functions that have an external name, that is that are not declared static. While this is conforming to the C standard, this is a complete blow to a lot of optimization efforts that I have put into my code in many places.

Continue reading “gcc doesn’t inline position independent code properly”

Unicode operators for C

C11 has added a certain level of Unicode support to C, but I think for C2x it will be time to go a step further and put C in line with general usage of special characters as they are normalized by Unicode. In particular, it is time to get rid of restrictions in operator naming that stem from the limited availability of special characters 30 years ago, when all of this was invented.

Continue reading “Unicode operators for C”