Jens Gustedt's Blog

December 18, 2013

right angle brackets: shifting semantics

Filed under: C++, rants — Jens Gustedt @ 23:29

As I showed in this post, using > as a right angle bracket was not a particularly good idea, but trying to patch this misdesign makes it even worse. After a bit of experimenting I found an expression that is valid in both C++98 and C++11, but that is interpreted differently by the two language versions:

fon< fun< 1 >>::three >::two >::one

So if you have to maintain a large code base with templates that depend on integers that are perhaps produced automatically by some tools, be happy, you will not be out of work for a while: changing your compiler to C++11 might change the semantics of your code.
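
A smaller illustration of the same effect, as a sketch: this is not the expression from the post but a commonly cited variation, and the names X, Y, c, with_space and without_space are made up for the example. The two functions differ only in the spacing of the closing brackets.

```cpp
// Primary template: c names a value.
template <int I> struct X { static int const c = 2; };
// Specialization for 0: c names a type instead.
template <> struct X<0> { typedef int c; };
template <typename T> struct Y { static int const c = 3; };
static int const c = 4;

// "> >" always closes two template argument lists:
// (Y<X<1> >::c > ::c) > ::c  is  (3 > 4) > 4, which is 0.
int with_space() { return (Y<X<1> >::c >::c>::c); }

// C++98/03: ">>" is a shift, so this reads Y< X<(1 >> ::c)>::c >::c,
//           that is Y<int>::c, which is 3.
// C++11 and later: ">>" closes both lists, so the result is 0 as above.
int without_space() { return (Y<X< 1>>::c >::c>::c); }
```

Compilers tend to warn about the chained comparisons here, which is a good thing: nobody should write such code on purpose.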


December 15, 2013

A disimprovement observed from the outside: right angle brackets

Filed under: C++, rants — Jens Gustedt @ 23:16

It has been a long time since I last looked into C++, I have to admit. By coincidence I recently unearthed a hilarious example that I had once written that shows the difficulty of parsing some C++ code, for compilers as well as for us poor humans. It all starts with the >> operator that (supposedly until C++11) could cause problems as in the following:

toto< tutu< 3 >> A;

Here the >> is (was) interpreted as the ‘right shift’ operator, and thus this code would cause a compile-time error. C++11 changed this by allowing the right-shift token in that position to close the two template angle brackets. The argument is that shift operators in template arguments are rare (which is probably true) and so this sacrifices some valid uses of that operator for the sake of causing less brain damage to C++ newbies.
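
To make this concrete, here is a minimal sketch for a C++11 (or later) compiler. The definitions of toto and tutu are assumptions made up for the example, as is the member function get.

```cpp
template <int N> struct tutu { static const int value = N; };

template <typename T> struct toto {
  int get() const { return T::value; }
};

// C++11: the ">>" token closes both template argument lists.
toto< tutu< 3 >> A;

// The sacrificed case: a genuine right shift inside a template
// argument now has to be put in parentheses.
toto< tutu< (8 >> 1) > > B;
```

With these definitions A.get() is 3 and B.get() is 4. A C++98 compiler rejects the declaration of A.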


April 20, 2012

struct tags in C++ are even weirder

Filed under: C++, C11, C99 — Jens Gustedt @ 22:14

I already discussed the fact that struct tags are not identifiers in C++, and in particular that a tag name can be used as the name of a type. Today I learned that the rules for that are even more complicated than I thought. In C, an identifier that has been used as a tag (for struct, union or enum) can freely be used as another identifier (variable, typedef, label). In C++ this is only almost true: there is one sort of identifier it can’t be used for, namely a typedef, unless that typedef refers to the same type as the tag.
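
A minimal sketch of that one exception, with hypothetical names point and point_sum. The rejected declaration is left as a comment so that the snippet still compiles as C++.

```cpp
struct point { int x, y; };

// Allowed in C and in C++: the typedef names the same type as the tag.
typedef struct point point;

// Allowed in C, but an error in C++: the typedef would name another type.
// typedef int point;

int point_sum(point p) { return p.x + p.y; }
```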


February 15, 2012

surprising occurrence of identifiers in header files

Filed under: C++, C11, C99, language, preprocessor — Jens Gustedt @ 17:35

I remember being stuck some time ago because a system header on the platform that I was using at the time defined the undocumented identifier barrier. IIRC this was even a macro, which made the bug really hard to track down: seemingly harmless code simply exploded.

Hopefully platform implementors are nowadays a bit more careful not to pollute the namespace, but avoiding naming conflicts is still not so easy. E.g., inline functions are a useful tool when you want to expose small functions to all compilation units of a program. There is one pitfall, though, when it comes to naming conventions for their parameter names and local variables. If you get the name wrong, as in this simple example

inline double my_sin(double PHI) { return sin(PHI); }

other users of your code might encounter random problems if they define a macro PHI.
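
The following sketch shows the clash and one possible way around it. The macro value and the parameter name my_sin_phi are assumptions for the illustration; prefixing is just one naming convention among several.

```cpp
#include <cmath>

// Stand-in for a macro that some unrelated user code happens to define.
#define PHI 1.6180339887

// With a parameter named PHI the declaration would be expanded to
//   inline double my_sin(double 1.6180339887) { ... }
// and fail to compile. A prefixed lowercase name is much less likely
// to collide with somebody's macro.
inline double my_sin(double my_sin_phi) { return std::sin(my_sin_phi); }
```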

September 12, 2010

struct tags are not identifiers in C++

Filed under: C++, C99 — Jens Gustedt @ 14:05

It seems to be a common mistake to think that a declaration like struct toto { ... }; in C++ implies the definition of the identifier toto as a type. In reality the rule for this is much more subtle than that: it only implies some sort of implicit typedef struct toto toto;. Only if there is no other identifier of the same name in the corresponding scope does toto refer to the struct.

This comes into effect, e.g., when you try to use the tools from “sys/stat.h” in C++. It defines a function stat and a struct stat that coexist in the same scope.
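
The same situation can be sketched without any system header. The names item and use_both are hypothetical and only mimic the stat case.

```cpp
struct item { int value; };

// A function of the same name may be declared in the same scope.
// From here on the plain name "item" refers to the function.
int item(int v) { return v + 1; }

int use_both() {
  struct item it = { 41 };  // the elaborated form still names the type
  return item(it.value);    // the plain name is the function
}
```

After the function declaration, writing item it; without the struct keyword would no longer compile.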

This kind of implicit definition is a pitfall when you think of code sharing between C and C++. In the following we will consider four code snippets that are slight variations of the same idea.

/* Compiles in C and C++, output will usually differ for both*/
#include <stdio.h>
static char T = 'a';
int main(int argc, char** argv) {
    struct T { char X[2]; };
    printf("size of T is %zu\n", sizeof(T));
}

Here the implicit typedef in C++ shows its full beauty: for C++ the sizeof operator refers to the type and not to the variable. Thus the output in C will be 1 (this is a char variable, not a character literal) and in C++ it will be at least 2.

/* Compiles in C and C++, output will be 1 for both*/
#include <stdio.h>
int main(int argc, char** argv) {
    static char T = 'a';
    struct T { char X[2]; };
    printf("size of T is %zu\n", sizeof(T));
}

In this example, the variable T in the function scope inhibits the lookup of T as a struct tag. So the sizeof operator will refer to the variable in both languages. Since sizeof(char) is always 1, this is what will be printed in both cases.

/* Compiles in C but not in C++ */
#include <stdio.h>
static char T = 'a';
int main(int argc, char** argv) {
    struct T { char X[2]; };
    printf("size of T is %zu\n", sizeof T);
}

Here T will be interpreted differently by C and C++, as in the first example. Since sizeof without parentheses is only valid as a prefix operator in front of an expression and not in front of a type name, this is invalid code in C++.

/* Compiles in C and C++, output will be 1 for both*/
#include <stdio.h>
int main(int argc, char** argv) {
    static char T = 'a';
    struct T { char X[2]; };
    printf("size of T is %zu\n", sizeof T);
}

This last example is equivalent to the second, except that omitting the parentheses in the sizeof expression ensures that T cannot be taken as a type here.

Things get even worse. If you define an object with the same name later in the code, the output changes:

/* Compiles in C and C++, output will usually differ for both*/
#include <stdio.h>
static char T = 'a';
int main(int argc, char** argv) {
  struct T { char X[2]; };
  printf("size of T is %zu\n", sizeof(T));
  static char T = 'a';
  printf("size of T is %zu\n", sizeof(T));
}

In C++ this prints two different values.

This answer on stackoverflow may give you further insight into this question.

July 18, 2010

Keyword overloading: the static keyword

Filed under: C++, C99, syntax — Jens Gustedt @ 07:46

The keyword static has seen wider and wider use in the different versions of C and C++. For the use that it has now, compared to the beginnings of C, it would have been better to use another token for it, something in the vein of variant or alternative_declaration… Here I try to list all the different usages of that keyword and how the introduction of static in a declaration chooses an alternative version of such a declaration.

Linkage specification

In C and C++, unless specified otherwise, an object that is defined in file scope, i.e., outside of any function, has static storage and external linkage. That is, space for the object is reserved at compile time and the object is visible to the linker. Code in other compilation units (.o files) may refer to such an object by its name.

The linkage changes when a declaration is prefixed with static: the object becomes internal to the compilation unit and is not accessible from other units. Different objects in different units with the same name that are declared static may co-exist even if they have different type and size.

C++03 deprecated this use of static and provides the concept of an unnamed (anonymous) namespace as a replacement; C++11 has since removed that deprecation.
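
The two spellings side by side, as a sketch with made-up names. A single file cannot show the isolation between compilation units itself, only how each form is written.

```cpp
// Internal linkage, the C way: not accessible from other units.
static int counter_static = 1;

// Internal linkage, the C++ way: an unnamed namespace.
namespace {
  int counter_unnamed = 2;
}

// Inside this unit both objects are used like any other variable.
int sum_of_internal_objects() {
  return counter_static + counter_unnamed;
}
```

Another compilation unit could define its own counter_static or counter_unnamed, even with a different type, without any conflict at link time.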

Storage class specification

In C and C++, unless specified otherwise, a variable that is defined in function scope has automatic storage: at run time, when the function is called, memory for the variable is reserved on the execution stack. Each call of the function, in particular if it is recursive, has its own new instance of the variable.

The storage class changes when a declaration is prefixed with static: the object becomes static. Space for the variable is reserved at compile time and all calls of the function see the same variable, read and write into the same storage location.
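
A small sketch of the difference; the function names are hypothetical.

```cpp
// One object for the whole program: initialized once, shared by all calls.
int count_calls() {
  static int calls = 0;
  return ++calls;
}

// A new automatic object on every call, so the count never gets past 1.
int fresh_each_time() {
  int calls = 0;
  return ++calls;
}
```

Successive calls of count_calls return 1, 2, 3 and so on, whereas fresh_each_time always returns 1.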

Class member declaration

In C and C++ each instance of a struct (and in C++ also of a class) has its own copy of each of its declared members.

struct toto { double a; };
struct toto A;
struct toto B;

Here A.a and B.a are guaranteed to be two different objects with different locations in storage.

In C++ this changes when the declaration of the member is prefixed with static:

struct toti { static double a; };
double toti::a;  /* the one definition of the shared member */
struct toti A;
struct toti B;

Here A.a and B.a are guaranteed to be the same object. Both refer to exactly the same storage location.
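
Both guarantees can be checked by comparing addresses. This is a sketch in which the two helper functions are made up; note that a static data member needs one definition outside of the struct.

```cpp
struct toto { double a; };
struct toti { static double a; };

// The single object that all instances of toti share.
double toti::a = 0.0;

bool members_are_distinct() {
  toto A, B;
  return &A.a != &B.a;   // each instance has its own member
}

bool members_are_shared() {
  toti A, B;
  A.a = 3.5;             // written through A ...
  return &A.a == &B.a && B.a == 3.5;   // ... and visible through B
}
```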

Arrays as function parameters

As one of the rough edges of the C language(s), two declarations that look exactly the same (but for the names), one as a function parameter and one as a local variable, result in variables of two different types.

void foo(double A[10]) {
   double B[10];    
}

Inside the scope of foo, A is a pointer to double and B is an array of ten elements of type double. Even their sizes computed with sizeof are different.
C++ inherited this rule.
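
In C++11 the two types can be inspected directly with decltype. This is a sketch with hypothetical function names; it compares types instead of sizes because compilers tend to warn about sizeof applied to an array parameter.

```cpp
#include <type_traits>

// The parameter is declared like an array, but its type is adjusted
// to "pointer to double".
bool parameter_is_pointer(double A[10]) {
  (void)A;   // only the declared type is of interest here
  return std::is_same<decltype(A), double*>::value;
}

// The local variable really is an array of ten doubles.
bool local_is_array() {
  double B[10] = { 0 };
  return std::is_same<decltype(B), double[10]>::value
      && sizeof(B) == 10 * sizeof(double)
      && B[0] == 0.0;
}
```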

C99 complicates this matter even further by using the keyword static to introduce yet another variant:

void foo(double A[static 10]) {
   double B[10];    
}

This doesn't change the rules on how A and B are seen from the inside, but it tells the caller how many array elements are expected. C99 specifies that it is the responsibility of the caller to provide at least as many elements as the array bound specifies.

For the moment gcc accepts this new syntax and simply ignores this information, so it is not C99 conforming with respect to that feature.

June 23, 2010

Obfuscation or inventing a new operator tends to operator -->

Filed under: C++, C99, syntax — Jens Gustedt @ 09:21

I found a really nice one in this discussion here. Basically the idea is that with a bit of creative spacing the two operators -- and > look like a single operator -->, as in this code, which is valid C99 and C++:

for (unsigned x = 10; x --> 0;)
     printf("%u\n", x);

Ain’t that cute?
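
For the record, the compiler reads x --> 0 as (x--) > 0: the test uses the old value and the loop body sees the decremented one. A sketch that records the values instead of printing them; count_down is a made-up helper.

```cpp
#include <vector>

std::vector<unsigned> count_down(unsigned start) {
  std::vector<unsigned> seen;
  for (unsigned x = start; x --> 0;)   // same as: (x--) > 0
    seen.push_back(x);
  return seen;
}
```

So count_down(3) yields 2, 1, 0, just as the loop above prints 9 down to 0.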

Now I was thinking that in C++ we could obfuscate that much better by inventing some helper class. I came up with the following class Heron that ‘converges’ (written as aHeron --> eps) towards the square root of the initial value.

Enjoy.

#include <iostream>
#include <math.h>

using std::cout;
using std::endl;

struct Heron {
  double const a;
  double x;
  Heron(double _a) : a(_a), x(_a) { }
  operator double(void){ return x; }
};

class Heron_tmp;

inline
Heron_tmp operator--(Heron& h, int);

class Heron_tmp {
  friend class Heron;
  friend Heron_tmp operator--(Heron& h, int);
private:
  Heron* here;
  Heron_tmp(Heron& h) : here(&h) { }
public:
  inline int operator>(double err) const;
};

inline
Heron_tmp operator--(Heron& here, int) {
  return here;
}

inline
int Heron_tmp::operator>(double err) const {
  double& x = here->x;
  double const& a = here->a;
  x = (x + a/x) * 0.5;
  return fabs((x*x - a)/a) > err;
}

int main(void) {
  Heron aHeron(2.0);
  while (aHeron --> 1E-15)
    cout << (double)aHeron << endl;
}
