White space does matter in C23

Usually in C, identifiers are not directly followed by strings. But when U-prefixed literals were introduced in C11, there were still some rare clashes with existing code. This happened where a macro U that expanded to a string was used to add some sort of leading character sequence to a string. Previously, this usage was not sensitive to whether or not there was a space between the two. By introducing the prefix, the two usages (with and without space) became distinct, and code changed its meaning or became invalid. So for this situation, space is in fact already significant.
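To make the clash concrete, here is a minimal sketch. The macro U and the strings are hypothetical and only serve as illustration; they are not taken from any real code base.

```c
#include <assert.h>
#include <uchar.h>

/* Hypothetical legacy macro that prepends a path prefix to a string. */
#define U "usr/"

/* With a space: the macro U expands to "usr/", and the two adjacent
   string literals are then concatenated to "usr/bin". */
const char with_space[] = U "bin";

/* Without a space: the lexer sees one single token, a UTF-32 string
   literal. The macro U is never considered. */
const char32_t without_space[] = U"bin";

/* 7 characters plus the terminating null character. */
static_assert(sizeof with_space == 8,
              "macro expansion followed by concatenation");

/* 3 characters plus the terminating null character, as char32_t. */
static_assert(sizeof without_space == 4 * sizeof(char32_t),
              "one UTF-32 string literal");
```

Before U-prefixed literals existed, both lines would have produced the same char array. Now the second one does not even have the same element type.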

Generally, it is often assumed that in C spaces don’t contribute much to the interpretation of programming text, but for C23 this has become an over-simplification that does not reflect the situation anymore.

In addition to problems internal to C, there is a problem of interfacing with C++, where some of the rules are different.

syntax | meaning, C23 | different meaning, C++
--- | --- | ---
# define X(A) | function like macro, empty |
# define X (A) | object macro, expands to (A) |
0x4'7'a | hex number with digit separators |
0x4 '7'a | number, character literal, and identifier | number, character literal with suffix
0x4 '7' a | number, character literal, and identifier |
"%" PRIx64 | valid format string for printf |
"%"PRIx64 | valid format string for printf | string literal with suffix
R "(hör)" | identifier followed by multi-byte string |
R"(hör)" | identifier followed by multi-byte string | raw multi-byte string, contains just hör
R "hör" | identifier followed by multi-byte string |
R"hör" | identifier followed by multi-byte string | invalid raw string
U "hör" | identifier, followed by multi-byte string |
U"hör" | UTF-32 string |
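The first two rows of the table can be checked with a small stringification trick. This is only a sketch; the macro names STR, STR_, FUNC_LIKE and OBJ_LIKE are made up for the illustration.

```c
/* Two-level stringification: the argument is macro-expanded first,
   and the result of that expansion is then turned into a string. */
#define STR_(X) #X
#define STR(X) STR_(X)

/* No space before the parenthesis: function-like macro with one
   parameter A and an empty replacement list. */
#define FUNC_LIKE(A)

/* Space before the parenthesis: object-like macro whose replacement
   list is the three tokens ( A ). */
#define OBJ_LIKE (A)

/* FUNC_LIKE(A) expands to nothing, so this is the empty string. */
const char func_like_text[] = STR(FUNC_LIKE(A));

/* OBJ_LIKE expands to (A), so this is the string "(A)". */
const char obj_like_text[] = STR(OBJ_LIKE);
```

The only difference between the two definitions is one space character, but one macro takes an argument and vanishes, while the other leaves a parenthesized expression in the code.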

It would be good if there could be some coordination here between C and C++ about such simple questions concerning lexing and the preprocessor.

Any use of identifiers that are adjacent to character and string literals should be discouraged; I think that such constructs should just be excluded by the syntax. If we want this to be diagnosed, it should happen before translation phase 4, in particular before macro expansion. Best would be if this were diagnosed in phase 3, lexing.

Compilers and preprocessor implementations could start to warn about such possible collisions immediately, as of today. I don’t see much of a reason to still allow them. Even where there are valid uses, such as for the printf format string macros, the code becomes much easier to read if these are separated from the surrounding format string.
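For the format macros, the spaced-out form could look as follows. This is a small sketch; the helper name format_hex64 is an assumption made for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes value in hexadecimal into buf.
   PRIx64 is separated by spaces from the surrounding string
   literals. This reads the same in C and in C++, where "%"PRIx64
   without the space is taken as a string literal with a suffix. */
int format_hex64(char* buf, size_t size, uint64_t value) {
  return snprintf(buf, size, "0x%" PRIx64, value);
}
```

Calling format_hex64 with the value 255 and a sufficiently large buffer stores the string 0xff.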

The deprecated attribute in C23 does much more than marking obsolescence

You may already have heard that C23 proposes a new syntax feature called attributes and that one of the standard attributes in C23 is called deprecated. Actually, we got this whole attribute thing and also some of the standard attributes from C++, where they had been introduced in C++11 (I think). One of the uses of this particular attribute is to mark a given feature as obsolescent. For example in C23 we now have

[[deprecated]] char* asctime(const struct tm* timeptr);
[[deprecated]] char* ctime(const time_t* timer);

This simply says that user code should not use these functions anymore, and that compilers should issue a diagnostic if they do.

But this is only one of the possible uses of this new feature. First of all, it can be placed on very different kinds of features: functions, types, variables, enumeration members, structure members …

But most importantly it can be put on features that you somehow have to expose in a header, but which are really only part of the internal dealings of your code, something that users just shouldn’t touch because the feature needs careful maintenance or that simply might change in the future.

In the C23 edition of Modern C, see this post, I just modified a detailed example to use this new feature:

typedef struct circular circular;
struct circular {
  size_t  start [[deprecated("privat")]]; /* First element     */
  size_t  len   [[deprecated("privat")]]; /* Number of elements*/
  size_t  cap   [[deprecated("privat")]]; /* Maximum capacity  */
  double* tab   [[deprecated("privat")]]; /* Data array        */
};

What you see here is a structure type that can be used by application code: variables can be declared, the size is known. But none of the members of the structure can be accessed directly without being warned.
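The implementation itself still has to touch these members, of course. One possible way to do that without drowning in warnings is sketched below. The switch CIRCULAR_IMPLEMENTATION, the macro CIRCULAR_PRIVATE and the function names are assumptions made for this sketch, not necessarily what the book does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: only the implementation file defines this switch
   before it sees the structure definition. User code never does. */
#define CIRCULAR_IMPLEMENTATION

#ifdef CIRCULAR_IMPLEMENTATION
# define CIRCULAR_PRIVATE /* no diagnostic inside the implementation */
#else
# define CIRCULAR_PRIVATE [[deprecated("private")]]
#endif

typedef struct circular circular;
struct circular {
  size_t  start CIRCULAR_PRIVATE; /* First element      */
  size_t  len   CIRCULAR_PRIVATE; /* Number of elements */
  size_t  cap   CIRCULAR_PRIVATE; /* Maximum capacity   */
  double* tab   CIRCULAR_PRIVATE; /* Data array         */
};

/* Hypothetical interface: user code goes through these functions. */
circular* circular_init(circular* c, size_t cap) {
  if (c) {
    double* tab = cap ? malloc(cap * sizeof *tab) : NULL;
    /* If the allocation failed, the buffer has capacity 0. */
    *c = (circular){ .cap = tab ? cap : 0, .tab = tab, };
  }
  return c;
}

void circular_destroy(circular* c) {
  if (c) {
    free(c->tab);
    *c = (circular){ 0 };
  }
}

size_t circular_getlength(circular const* c) {
  return c->len;
}

/* Appends value at the end, returns false if the buffer is full. */
bool circular_append(circular* c, double value) {
  if (!c || c->len == c->cap) return false;
  c->tab[(c->start + c->len) % c->cap] = value;
  ++c->len;
  return true;
}
```

User code that includes the same declarations without the switch can still declare a circular variable and call the functions, but a direct access such as c.len would be diagnosed by a C23 compiler.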
