Detecting unintended weak-linkage symbols

Until recently, our company did not use namespaces because some of our compilers did not support them well.

This has led to numerous occurrences of the following bug:

file_A.cpp

    class Node {
        Data *ptr;
    public:
        Node() { ptr = new Data; }
        ~Node() { delete ptr; }
    };

file_B.cpp

    class Node {
        vector<int> v;
        Point *pt;
    public:
        Node(int x, int y) { pt = new Point(x, y); v.push_back(0); }
        ~Node() { delete pt; }
    };

    void foo() {
        Node n(10, 10);
        ...
    } // may end up calling the ~Node() from file_A.cpp !!!

Neither author of Node knew that the other Node existed, but because each expected that such a class name might be reused, he refrained from putting the class in an .hpp file.

The linker silently discards one of the two destructors, since their signatures match, and the bug is hard to track down because it does not reproduce consistently on different machines.

Once the bug was identified, people gradually became aware of it, and they now try to wrap such definitions in unnamed namespaces or to avoid defining member functions inline in the class body [see below].

  • Question 1: Since we cannot rely on every programmer always remembering to protect his code this way, is there a tool that can detect these "unintended weak-linkage symbols"?

    By "unintended" I mean that the Node classes are not defined in .hpp files and at least one class member differs between the class definitions ...

  • Question 2: If we do not use namespaces but inline every function, can the automatically generated functions (copy-ctor, copy-assign, destructor) still cause the "weak-linkage bug" described above?


Method 1: enclose in unnamed namespaces

    namespace {
        class Node {
            Data *ptr;
        public:
            Node() { ptr = new Data; }
            ~Node() { delete ptr; }
        };
    }

Method 2: avoid inline member definitions

    class Node {
        Data *ptr;
    public:
        Node();
        ~Node();
    };

    Node::Node() { ptr = new Data; }
    Node::~Node() { delete ptr; }
2 answers

If your code base is large enough to justify the effort, you can extend your existing compiler to detect the problem:

  • The LLVM/Clang compiler is extensible (it is written in C++, and I don't know it very well).
  • The GCC compiler (recent versions, such as 4.6) is extensible, either through plugins written in C or through extensions written in MELT. MELT is a high-level domain-specific language for extending GCC (free software, licensed under GPLv3).

In both cases this is an effort of several days or weeks, and the hardest part is gaining at least a partial understanding of the compiler's internal representations (Gimple and Tree for GCC) and its organization (for example, its passes).

I am the main author of MELT, and I will be happy to help you with MELT, so feel free to contact me.

"C++ and Linker" is a very interesting read on this issue. See, in particular, the section "Rules Without Enforcement Mean Nothing".

One takeaway is that you can detect "weak" symbols by inspecting object files and looking for entries marked "W":

    $ nm -C foo.o | grep doSomething
    00000000 W doSomething()

You could therefore add a post-build step that automatically collects these symbols and lists the duplicates. You could then compare them against a master list of expected duplicates and raise a flag whenever new ones appear.

Another option might be the gcc -fno-weak option. It is not clear from the documentation what happens to the duplicates, but it may be interesting to find out.

The linked article also answers your second question (the "above phenomenon" refers to only one instance of a duplicated weak symbol being kept):

In some cases, the compiler must emit a symbol even though it inlines the function. This can happen, for example, when a function pointer refers to the function. So the above phenomenon does not always disappear when optimization is turned on.


Source: https://habr.com/ru/post/1383743/

