Easy struct inheritance and pseudo-polymorphism vs. strict aliasing

If you answer my question, please don't tell me to use C++.

So, I am writing a small C library that uses an object-oriented approach. I decided to use the less common of the two main approaches to inheritance in C: copying the members of the base type to the beginning of the derived type. Something like this:

    struct base {
        int a;
        int b;
        char c;
    };

    struct derived {
        int a;
        int b;
        char c;
        unsigned int d;
        void (*virtual_method)(int, char);
    };

This approach is less popular than the other one (an instance of the base type as the first member of the derived type), because

  • Technically, the standard does not guarantee that the common initial members of the base and derived structs will have the same offsets. However, unless one of the structs is packed and the other is not, they will have the same offsets on most, if not all, known compilers.
  • The most serious flaw of this approach: it violates strict aliasing. Casting a pointer to a derived struct to a pointer to its base type and then dereferencing that pointer is technically undefined behavior.
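The first point can at least be checked at build time. Here is a minimal sketch, assuming a C11 compiler (for static_assert from <assert.h>); the SAME_OFFSET macro and the prefix_layout_matches function are hypothetical names made up for this illustration. Note that it only verifies the layout and does nothing about the aliasing problem.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

struct base { int a; int b; char c; };

struct derived {
    int a; int b; char c;               /* members copied from struct base */
    unsigned int d;
    void (*virtual_method)(int, char);
};

/* Hypothetical macro: fail the build if MEMBER sits at different offsets. */
#define SAME_OFFSET(MEMBER) \
    static_assert(offsetof(struct base, MEMBER) == \
                  offsetof(struct derived, MEMBER), \
                  "offset mismatch for " #MEMBER)

SAME_OFFSET(a);
SAME_OFFSET(b);
SAME_OFFSET(c);

/* The same check as an ordinary function: returns 1 if the prefix lines up. */
int prefix_layout_matches(void)
{
    return offsetof(struct base, a) == offsetof(struct derived, a)
        && offsetof(struct base, b) == offsetof(struct derived, b)
        && offsetof(struct base, c) == offsetof(struct derived, c);
}
```

If one of the structs were packed and the other not, the build would stop at the first mismatching member instead of silently misbehaving.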

However, it also has advantages over the other approach:

  • Less verbosity: accessing an inherited member of a derived struct looks the same as accessing one that was not inherited, instead of first naming the base member and then accessing the desired member;
  • This is actually real inheritance, not composition;
  • It is as easy to implement as the other approach, although a slight abuse of the preprocessor may be required;
  • We can get a half-baked form of actual multiple inheritance, where we can inherit from several base types but can cast to only one of them.

I have been looking for ways to make my library compile and work correctly with compilers that enforce strict aliasing (e.g. gcc), without the user having to turn it off manually. Here are the possibilities I have looked into:

  • Unions. These, unfortunately, are a no-no for several reasons:

    • Verbosity returns! To follow the standard's rules for accessing the common initial members of 2 structs through a union, one must (as of C99) explicitly use the union to access those members. We would need special syntax to access members of each type in the union!
    • Space. Consider an inheritance hierarchy. We have a type that we want to be able to cast to from each of its derived types. And we want to do this for every type. The only feasible union-based solution I can see is a union of the entire hierarchy, which would have to be used to convert instances of a derived type to a base type. And it would have to be as large as the most derived type in the entire hierarchy...
  • Using memcpy instead of direct dereferencing (e.g. here). This looks like a nice solution. However, a function call carries overhead, and yes, again, verbosity. As I understand it, what memcpy does can also be done by hand, by casting the struct pointer to a char pointer and then dereferencing it, something like this: (member_type)(*((char*)(&struct_pointer->member))) = new_value; Gah, verbosity again. Well, this can be wrapped in a macro. But will it still work if we cast our pointer to a pointer to an incompatible type, then cast that to char* and dereference it? Like this: (member_type)(*((char*)(&((struct incompatible_type*)struct_pointer)->member))) = new_value;

  • Declaring all instances of the types we are going to cast as volatile. I wonder why this does not come up more often. volatile, as I understand it, is used to tell the compiler that the memory a pointer points to may change unexpectedly, which suppresses optimizations based on the assumption that a piece of pointed-to memory will not change, and that assumption is the cause of all the strict aliasing problems. This is, of course, still undefined behavior; but couldn't it be a feasible cross-platform way to hackishly disable strict aliasing optimizations for certain instances of certain types?
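To make the two objections to unions concrete, here is a rough sketch of what the union route might look like. The union any_object type and the sum_ab function are hypothetical names invented for this example, and struct derived is trimmed to its data members.

```c
#include <assert.h>   /* static_assert (C11) */

struct base    { int a; int b; char c; };
struct derived { int a; int b; char c; unsigned int d; };

/* Hypothetical union covering the whole (two-type) hierarchy. */
union any_object {
    struct base    as_base;
    struct derived as_derived;
};

/* Space: the union is at least as large as the most derived type. */
static_assert(sizeof(union any_object) >= sizeof(struct derived),
              "union must be able to hold the largest member");

/* Verbosity: base members are reached through the union member as_base.
 * Reading them is permitted even when a struct derived was stored,
 * because both structs share a common initial sequence and the access
 * goes through the union type. */
int sum_ab(const union any_object *obj)
{
    return obj->as_base.a + obj->as_base.b;
}
```

Every function that wants to accept "any object of the hierarchy" would have to take a union any_object pointer, which is exactly the special syntax and the wasted space described above.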

Apart from the questions above, here are two more:

  • Did I say something wrong above?
  • Am I missing something that might help in my case?
2 answers

I don't think your idea of casting via char* is valid. The rule is:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types

A subexpression of your expression is compatible, but the overall expression is not.

I think the only realistic approach is composition:

    struct base {
        int a;
        int b;
        char c;
        void (*virtual_method)(struct base * /*this*/, int, char);
    };

    struct derived {
        struct base base; /* the base sub-object must be the first member */
        unsigned int d;
    };

I realize this is an intellectually unappealing way to achieve inheritance.

PS: I did not put your virtual member function pointer in my derived class. It needs to be accessible from base, so it has to be declared there (assuming the polymorphic function exists for both base and derived). I also added a this parameter to flesh out the model a touch.
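For what it's worth, here is a small sketch of how the composition layout could be used polymorphically without any aliasing trouble. All function names (base_init, derived_init, call_virtual and the two _impl functions) are hypothetical, and the virtual method signature is simplified to return an int so the result is easy to check. The downcast in derived_impl relies on the guarantee that a pointer to a struct, suitably converted, points to its first member and vice versa.

```c
#include <assert.h>

struct base {
    int a;
    int b;
    char c;
    int (*virtual_method)(struct base *self, int x);
};

struct derived {
    struct base base;   /* must stay the first member */
    unsigned int d;
};

static int base_impl(struct base *self, int x)
{
    return self->a + x;
}

static int derived_impl(struct base *self, int x)
{
    /* Valid downcast: self is the first member of a struct derived. */
    struct derived *me = (struct derived *)self;
    return self->a + x + (int)me->d;
}

void base_init(struct base *b, int a)
{
    b->a = a;
    b->b = 0;
    b->c = 0;
    b->virtual_method = base_impl;
}

void derived_init(struct derived *d, int a, unsigned int extra)
{
    base_init(&d->base, a);
    d->base.virtual_method = derived_impl;  /* override */
    d->d = extra;
}

/* Dispatches through the function pointer stored in the base part. */
int call_virtual(struct base *obj, int x)
{
    assert(obj && obj->virtual_method);
    return obj->virtual_method(obj, x);
}
```

A derived object is passed as &d.base, so the caller never casts between incompatible struct types; the price is the extra .base when touching inherited members.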


memcpy should be the way to go. Don't worry about function call overhead. More often than not, there is none. memcpy is usually a compiler builtin, which means the compiler should emit the most efficient code it can for it, and it should know where memcpys can be optimized away.

Do not cast pointers to pointers to incompatible types and then dereference them. That is the path to undefined behavior.
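As a rough illustration of the memcpy route for a single member, something along these lines should do. The get_b and set_b helpers are hypothetical names, struct derived is trimmed to its data members, and the sketch assumes that b sits at the same offset in both structs (which the static_assert checks on a C11 compiler).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

struct base    { int a; int b; char c; };
struct derived { int a; int b; char c; unsigned int d; };

/* Assumption made explicit: the shared member has the same offset. */
static_assert(offsetof(struct base, b) == offsetof(struct derived, b),
              "b must sit at the same offset in base and derived");

/* Reads member b from any object that starts with the base layout.
 * Only bytes are copied, so no lvalue of an incompatible struct type
 * is ever dereferenced. */
int get_b(const void *obj)
{
    int value;
    memcpy(&value,
           (const unsigned char *)obj + offsetof(struct base, b),
           sizeof value);
    return value;
}

/* Writes member b the same way. */
void set_b(void *obj, int value)
{
    memcpy((unsigned char *)obj + offsetof(struct base, b),
           &value, sizeof value);
}
```

With optimizations on, compilers commonly turn such fixed-size memcpy calls into a plain load or store, though that is an optimization and not a guarantee.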

If you accept statement expressions and gcc's ##__VA_ARGS__, you can have an MC_base_method(BaseType,BaseMethod,Derived_ptr,...) macro that correctly calls BaseMethod with Derived_ptr and ..., as long as you can work with a copy of the struct as if it were the original (e.g. it holds no pointers to its own members).

Here is an example with some additional macro sugar for OOP support:

    //Helper macros for some C++-like OOP in plain C
    #define MC_t_alias(Alias, ...) typedef __VA_ARGS__ Alias //like C++ using
    #define Struct(Nm,...) MC_t_alias(Nm, struct Nm); struct Nm __VA_ARGS__ //auto-typedefed structs
    #define ro const //readonly -- I don't like the word const

    //Helper macros for method declarations following my
    //Type__method(Type* X, ...) naming convention
    #define MC_mro(Tp,Meth, ...) Tp##__##Meth(Tp ro*X, ##__VA_ARGS__)

    #include <stdio.h>
    #include <string.h>

    //I append _d to my data structs to know they're data structs
    Struct(base_d, {
        int a;
        int b;
        char c;
    });

    Struct(derived_d, {
        int a;
        int b;
        char c;
        unsigned int d;
        void (*virtual_method)(derived_d*, int, char);
    });

    //print method is unaware of derived_d
    //it takes a `base_d const *X` (the mro (method, readonly) macro hides that argument (X==`this` in common OOP speak))
    int MC_mro(base_d,print)
    {
        return printf("{ a=%d b=%d c=%d }", X->a, X->b, X->c);
    }

    /* Call a (nonvirtual) base method */
    #define MC_base_method(BaseType, Method, Derived_p, ...) \
    ({ \
        int _r; /*if you conventionally return ints*/ \
        /*otherwise you'll need __typeof__ to get the type*/ \
        BaseType _b; \
        memcpy(&_b, Derived_p, sizeof(_b)); \
        _r = BaseType##__##Method(&_b, ##__VA_ARGS__); \
        /*sync back -- for non-readonly methods */ \
        /*a smart compiler might be able to get rid of this for ro method calls*/ \
        memcpy(Derived_p, &_b, sizeof(_b)); \
        _r; \
    })

    int main()
    {
        derived_d d = {1,2,3,4};
        MC_base_method(base_d, print, &d);
    }

I consider it the compiler's job to optimize the memcpys away. However, if it doesn't, and your structs are huge, you are screwed. The same goes if your structs contain pointers to their own members (i.e. if you cannot work with a byte-for-byte copy as if it were the original).
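The self-pointer caveat is easy to trip over, so here is a tiny sketch of it. The struct buffer type and both functions are hypothetical and exist only to show why a byte-for-byte copy cannot stand in for the original in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct that holds a pointer into itself. */
struct buffer {
    char  storage[16];
    char *cursor;        /* points into storage of the SAME object */
};

void buffer_init(struct buffer *b)
{
    assert(b != NULL);
    memset(b->storage, 0, sizeof b->storage);
    b->cursor = b->storage;
}

/* Returns 1 if cursor points at the start of this object's own storage. */
int cursor_at_own_start(const struct buffer *b)
{
    return b->cursor == b->storage;
}
```

After memcpy(&copy, &orig, sizeof copy), copy.cursor still points into orig.storage, so a method run on the copy would read and write the original's bytes through the pointer while changing everything else only in the copy.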


Source: https://habr.com/ru/post/981099/

