language – Jens Gustedt's Blog

The deprecated attribute in C23 does much more than marking obsolescence

You may already have heard that C23 proposes a new syntax feature called attributes and that one of the standard attributes in C23 is called deprecated. Actually, we got this whole attribute thing and also some of the standard attributes from C++, where they had been introduced in C++11 (I think). One of the uses of this particular attribute is to mark a given feature as obsolescent. For example in C23 we now have

[[deprecated]] char* asctime(const struct tm* timeptr);
[[deprecated]] char* ctime(const time_t* timer);

This simply says that user code should not use these functions anymore, and that compilers should issue a diagnostic if they do.

But this is only one of the possible uses of this new feature. First of all, it can be placed on very different kinds of features, functions, types, variables, enumeration members, structure members …

But most importantly it can be put on features that you somehow have to expose in a header, but which are really only part of the internal dealings of your code, something that users just shouldn’t touch because the feature needs careful maintenance or that simply might change in the future.

In the C23 edition of Modern C, see this post, I just modified a detailed example to use this new feature:

typedef struct circular circular;
struct circular {
  size_t  start [[deprecated("privat")]]; /* First element     */
  size_t  len   [[deprecated("privat")]]; /* Number of elements*/
  size_t  cap   [[deprecated("privat")]]; /* Maximum capacity  */
  double* tab   [[deprecated("privat")]]; /* Data array        */
};

What you see here, is a structure type that can be used by application code: variables can be declared, the size is known. But none of the members of the structure can be accessed directly without being warned.

Early access to the C23 edition of Modern C

Manning’s early access program (MEAP) for the new edition is now open
at

https://www.manning.com/books/modern-c-third-edition

There is a special code mlgustedt2 to get 45% off of the official price. This is currently limited until Jan 31.

The previous edition already has been largely successful and is considered by some as one of the reference books on C. This new edition has been the occasion to overhaul the presentation in many places, but its main purpose is the update to the new C standard, C23. The goal is to publish this new edition of Modern C at the same time as the new C standard goes through the procedure of ISO publication and as new releases of major compilers will implement all the new features that it brings.

Among the most noticeable changes and additions that we handle are those for integers: there are new bit-precise types coined _BitInt(N), new C library headers (for arithmetic with overflow check) and (for bit manipulation), possibilities for 128 bit types on modern architectures, and substantial improvements for enumeration types. Other new concepts in C23 include a nullptr constant and its underlying type, syntactic annotation with attributes, more tools for type generic programming such as type inference with auto and typeof, default initialization with {}, even for variable length arrays, and constexpr for name constants of any type. Furthermore, new material has been added, discussing compound expressions and lambdas, so-called “internationalization”, a comprehensive approach for program failure.

During MEAP we will also add an appendix and continue work on a temporary include header for an easy transition to C23 on existing platforms, that will allow you to start off with C23 right away.

We also encourage you to post any questions or comments you have about the content in the liveBook Discussion forum. We appreciate knowing where we can make improvements and increase your understanding of the material.

Note that the “about this book” section in MEAP is a bit hidden under the front page. I think that reading this even before going in details is important so I repeat some of this information here.

C17 obsoletes ATOMIC_VAR_INIT

C17 is only a “bugfix” release of the C standard, with one exception, the changes concerning ATOMIC_VAR_INIT are a normative.

Continue reading “C17 obsoletes ATOMIC_VAR_INIT”

Modern C is now feature complete

You find the latest version of my book at the usual place at

http://icube-icps.unistra.fr/index.php/File:ModernC.pdf

https://modernc.gforge.inria.fr

https://gustedt.gitlabpages.inria.fr/modern-c/

This now contains all the material that I plan to put into it. New are “experience” parts about performance, function-like macros and _Generic, control flow, threads and atomics. There is in particular a section about memory consistency, probably the most difficult write-up for this book.

These higher level parts have not yet have had the reading they hopefully deserve, so please give it a try. As before, constructive feedback is highly welcome. Thanks to everybody who already did send me comments and corrections!

Constructive comments to this blog are welcome.

This is not the right place for

feedback about details (missing commas, spelling errors). If you have such details, please collect them and send them to me directly.
rants against the C standard. This is not the place to let off steam nor to compare to whatever other programming language you prefer. If you have criticism about the C standard, please contribute. The C standards committee is a nice batch of people, don’t be afraid. And if you have concrete proposals, also it would be much better to contact me (or any other member of the committee) directly and see how you may invest yourself in the ISO process.

Also, comments that come from real people are highly preferred.

The controlling expression of _Generic

Recently it showed that the C standard seems to be ambiguous on how to interpret the controlling expression of _Generic, the one that determines the choice. Compiler implementors have given different answers to this question; we will see below that there is code that is interpreted quite differently by different existing compilers. None of them is “wrong” at a first view, so this tells us that we must be careful when we use _Generic. In this post I will try to explain the problem and to give you some work around for common cases.

Continue reading “The controlling expression of _Generic”

Demystify undefined behavior

In discussions about C or other standards (e.g POSIX) you may often have come across the term undefined behavior, UB for short. Unfortunately this term is sometimes mystified and even used to scare people unnecessarily, claiming that arbitrary things even outside the scope of your program can happen if a program “encounters undefined behavior”. (And I probably contributed to this at some point.)

First, this is just crappy language. How can something “encounter” behavior? It can’t. What is simple meant by such slang is that such a program has no defined behavior, or stated yet otherwise, the standard doesn’t define what to do if such a program reaches the state in question. The C standard defines undefined behavior

behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements

That’s it.

Simple examples of undefined behavior in C are out-of-bounds access to arrays or overflow of signed integer types. More complicated cases arise when violating aliasing rules or when access to data races between different threads.

Now we often hear statements like “UB may format your hard drive” or “UB may steal all your money from your bank account” implying somehow that a program that is completely unrelated to my disk administration or to my online banking, could by some mysterious force have these evil effects. This is (almost) complete nonsense.

If UB in a simple program of yours formats your hard drive, you shouldn’t blame your program. No simple application run by an unprivileged user should have such devastating consequences, this would be completely inappropriate. If such things happen, it is your system which is at fault, get you another one.

As an analogy from every-day life, take the idea of locking your house at night, which seems to be the rule in some societies. Sure, if you don’t do that, you make it easier for somebody to sneak into your house and shoot you. But still, that person would be a murderer, to be punished for that by the law, no sane legal system would acquit such a person or see this as mitigating circumstances.

Now things are in fact different when there is a direct relation between the object of your negligence and the ill-effect that it causes. If you leave your purse on the table in the pub when going to the bathroom, you should take a share in responsibility if it is stolen. Or to come back to the programming context, if you are programming a security sensible application (e.g handling passwords or bank credentials) then you should be extremely careful and stay on well-defined grounds.

When programming in C there are different kinds of constructs or data for which the program behavior then is undefined.

Behavior can be explicitly declared undefined or implicitly left undefined.
The error can be detectable or undetectable at compile time.
The error can be detectable or undetectable during execution.
Behavior of certain constructs can be undefined by a specific standard to allow extensions.

Unfortunately, people often use UB as an excuse to evacuate questions about certain behavior of their code (or compiler). This is just rude behavior by itself, and you usually shouldn’t accept this. An implementation should have reasons and policies how it deals with UB, otherwise it is just bad, and you should switch to something more friendly, if you may.

There is no reason not to be friendly, and there are many to be.

That is, in our context, once a (your!) code detects that it has a problem it must handle it. Possible strategies are

abort the compilation or the running program
return an error code
report the problem
define and document the behavior in your own terms

For the first you should always have in mind that the program should be debugable, so you really should use #error or static_assert for compile time errors and assert and abort for run time errors. Also, willingly aborting the program is not the same as having the program crash, see below.

Obviously, the second is only possible if the function in question has an established error return convention and if you can expect users of the function to check for that convention. POSIX has many such cases where the documentation says something like “may” return a certain error code. A C (or POSIX) library implementation that detects such a “may” case and doesn’t react accordingly, is of bad quality.

Reporting detected errors is an important alternative and some compiler implementors have chosen this as their default answer to problems. Perhaps you already have seen gcc’s famous “diagnostic” message

dereferencing pointer ‘bla’ does break strict-aliasing rules

This message supposes that you know what aliasing rules are, and that you also know why it came to that. Observe that it says “does” so the compiler claims to have proven that the aliasing rules are violated, and that the behavior is thus undefined. What the message doesn’t tell, and that is bad, is how it resolves the problem.

The last variant, defining your own stuff, should be handled with extreme care. Not only that what you define must be sensible, you also make a commitment in doing so. You promise your clients that you will follow these new rules in the future and that they may suppose that you will take care of the underlying problem. In most cases, you should leave the definition of extensions to the big players, platform designers or other standards. E.g POSIX defines a lot of cases that are UB for C as such.

As a last alternative, when

the error is undetectable
the detection of the error would be too expensive

you should simply let the program crash. The best strategy I know of is to always initialize all variables and always treat 0 as a special value. This may, under rare circumstance deal a tiny bit of performance against security, because on some rare occasions your compiler might not be able to optimize an unused initialization. If you do this, most errors that you will see are dereferences of null pointers.

Dereferencing a null pointer has UB. But modern architecture no how to handle this: they raise a “segmentation fault” error and terminate the program. This is the nicest failure path that you can offer to your clients, failing anytime, anywhere.

Modern C: Level 2 available

I am pleased to announce the feature completion of Level 2 of my book

Modern C

It deals with most principal concepts and features of the C
programming language, such as control structures, data types,
operators and functions. Its knowledge should be sufficient for an
introductory course to Algorithms, with the noticeable particularity
that pointers aren’t fully introduced here, yet.

As before, the current version of the book can be found at my homepage

Click to access File:ModernC.pdf

and also as before, constructive feedback is highly welcome. Many
thanks to those that already gave such valuable feedback for previous
version.

Don’t use casts

I recently reviewed some document on security recommendations where I was baffled by the fact that the code examples were sprinkled with casts all over the place. I had thought that people that are concerned with software security in mind would adhere to one of the most important rules in C programming:

Casts are harmful and evil.

The “evil” here is to be read as reference to black magic. Most uses of cast are merely done in the spirit of “casting a spell” by people that try to quieten their compiler. The sorcerer’s apprentice approach: if I don’t see the evil, it isn’t there.

For me it is evident that every cast punches a hole in C’s type system. So, concerned with code security, we should avoid them as much as possible. But this evidence doesn’t yet seem shared (meaning that it is not so evident 🙂 and I decided to explain things here in more detail.

Casts (explicit conversions) in C come with three different flavors, depending on the cast-to and cast-from type

pointer to pointer
pointer to integer or vice versa
integer to integer

Continue reading “Don’t use casts”

testing compile time constness and null pointers with C11’s _Generic

Sometimes in C it is useful to distinguish if an expression is an “integral constant expression” or a “null pointer constant”. E.g for an object that is allocated statically, only such expressions are valid initializers. Usually we are able to determine that directly when writing an initializer, but if we want to initialize a more complicated struct with a function like initializer macro, with earlier versions of C we have the choice:

Use a compiler extension such as gcc’s __builtin_constant_p
We’d have to write two different versions of such a macro, one for static allocation and one for automatic.

In the following I will explain how to achieve such a goal with C11’s _Generic feature. I am not aware of a C++ feature that provides the same possibilities. Also, this uses the ternary operator (notably different in C and C++), so readers that merely come from that community should read the following with precaution.

Continue reading “testing compile time constness and null pointers with C11’s _Generic”

a praise of size_t and other unsigned types

Again I had a discussion with someone from a C++ background who claimed that one should use signed integer types where possible, and who also claimed that the unsignedness of size_t is merely a historical accident and would never be defined as such nowadays. I strongly disagree with that, so I decided to write this up, for once.

What I write here will only work with C, and can possibly extended to C++ and other languages that implement unsigned integer types, e.g good old Pascal had a cardinal type.

Continue reading “a praise of size_t and other unsigned types”

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: