Is it possible to cast a C structure to another with fewer elements?

I'm trying to do OOP in C (just for fun) and I came up with a way to do data abstraction: a small structure holding the public part, and a larger structure that begins with the same public part followed by the private part. The constructor creates the whole larger structure and returns it cast to the small structure. Is this correct, or could it fail?

Here is an example:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    // PUBLIC PART (header)
    typedef struct string_public {
        void (*print)( struct string_public * );
    } *string;

    string string_class_constructor( const char *s );
    void string_class_destructor( string s );

    struct {
        string (*new)( const char * );
        void (*delete)( string );
    } string_class = { string_class_constructor, string_class_destructor };

    // TEST PROGRAM ----------------------------------------------------------------
    int main() {
        string s = string_class.new( "Hello" );
        s->print( s );
        string_class.delete( s ); s = NULL;
        return 0;
    }
    //------------------------------------------------------------------------------

    // PRIVATE PART
    typedef struct string_private {
        // Public part
        void (*print)( string );
        // Private part
        char *stringData;
    } string_private;

    void print( string s ) {
        string_private *sp = (string_private *)( s );
        puts( sp->stringData );
    }

    string string_class_constructor( const char *s ) {
        string_private *obj = malloc( sizeof( string_private ) );
        obj->stringData = malloc( strlen( s ) + 1 );
        strcpy( obj->stringData, s );
        obj->print = print;
        return (string)( obj );
    }

    void string_class_destructor( string s ) {
        string_private *sp = (string_private *)( s );
        free( sp->stringData );
        free( sp );
    }
+6
7 answers

In theory, this may be unsafe. Two separately declared structures are allowed to have different internal layouts, since there is absolutely no positive requirement for them to be compatible. In practice, a compiler is unlikely to generate different layouts for two identical member lists (unless there is an implementation-specific annotation somewhere, at which point all bets are off, but you would know about that).

The traditional solution is to take advantage of the fact that a pointer to any given structure is always guaranteed to be the same as a pointer to that structure's first member (that is, structures have no leading padding: C11, 6.7.2.1, paragraph 15). This means you can force the leading members of two structures to be not only the same but strictly compatible, by placing a value of a shared structure type in the leading position of both:

    struct shared {
        int a, b, c;
    };

    struct Foo {
        struct shared base;   /* common part, always first */
        int d, e, f;
    };

    struct Bar {
        struct shared base;   /* common part, always first */
        int x, y, z;
    };

    void work_on_shared(struct shared * s) { /**/ }

    //...
    struct Foo * f = //...
    struct Bar * b = //...

    work_on_shared((struct shared *)f);
    work_on_shared((struct shared *)b);

This is perfectly conforming and guaranteed to work, because packing the common members into a single leading structure means that only the position of the leading member of Foo or Bar is ever explicitly relied upon.
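The shared-base layout described above can be sketched as a small compilable unit. The helper names `sum_shared`, `sum_foo` and `sum_bar` are made up for illustration and are not part of the answer; the static assertions only document the assumption that `base` is the first member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; only the layout idea comes from the answer. */
struct shared { int a, b, c; };

struct Foo { struct shared base; int d, e, f; };
struct Bar { struct shared base; int x, y, z; };

/* The base must be the first member for the pointer conversion to be valid. */
static_assert(offsetof(struct Foo, base) == 0, "base must come first in Foo");
static_assert(offsetof(struct Bar, base) == 0, "base must come first in Bar");

/* Works on the common part only; knows nothing about Foo or Bar. */
int sum_shared(const struct shared *s)
{
    return s->a + s->b + s->c;
}

int sum_foo(const struct Foo *f)
{
    /* No cast needed: take the address of the leading member. */
    return sum_shared(&f->base);
}

int sum_bar(const struct Bar *b)
{
    /* Equivalent: a struct pointer, suitably converted, points to its
       first member (C11 6.7.2.1, paragraph 15). */
    return sum_shared((const struct shared *)b);
}
```

Both call styles reach the same `struct shared` object; taking `&f->base` simply avoids the cast.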


In practice, alignment is unlikely to be the problem that bites you. A much more pressing issue is aliasing (that is, the compiler is allowed to assume that pointers to incompatible types do not alias). A pointer to a structure is always compatible with a pointer to one of its member types, so the shared-base strategy will give you no trouble; using types that the compiler is not required to treat as compatible may lead it to emit incorrectly optimized code in some circumstances, which can be a very hard Heisenbug to track down if you are not aware of it.

+7

Here is what I would do if you really intend to hide the definition of string_private.

First, you should declare the structure that holds the class definition as extern, or it will be duplicated in every translation unit that includes the header. Move its definition to the '.c' file. Otherwise, very little changes in the public interface.

string_class.h:

    #ifndef STRING_CLASS_H
    #define STRING_CLASS_H

    // PUBLIC PART (header)
    typedef struct string_public {
        void (*print)( struct string_public * );
    } *string;

    string string_class_constructor( const char *s );
    void string_class_destructor( string s );

    typedef struct {
        string (*new)( const char * );
        void (*delete)( string );
    } string_class_def;

    extern string_class_def string_class;

    #endif

In the string_class source file, declare the private structure type so that it is not visible outside the translation unit. Make the public type a member of that structure. The constructor allocates a private struct object but returns a pointer to the public object contained within it. Use offsetof magic to get back from the public pointer to the private one.

string_class.c:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stddef.h>
    #include "string_class.h"

    typedef struct string_private {
        void (*print)( string );
        char *string;
        struct string_public public;
    } string_private;

    string_class_def string_class = { string_class_constructor, string_class_destructor };

    void print( string s ) {
        /* This ugly cast is where the "magic" happens. Basically, it converts
           the string into a char pointer so that subtraction works on byte
           boundaries. Then it subtracts the offset of public from the start of
           string_private to back up to a pointer to the private object.
           "offsetof" should be in <stddef.h>. */
        string_private *sp = (string_private *)( (char*) s - offsetof(struct string_private, public));
        // Private part
        puts( sp->string );
    }

    string string_class_constructor( const char *s ) {
        string_private *obj = malloc( sizeof( string_private ) );
        obj->string = malloc( strlen( s ) + 1 );
        strcpy( obj->string, s );
        obj->public.print = print;
        return (string)( &obj->public );
    }

    void string_class_destructor( string s ) {
        string_private *sp = (string_private *)( (char*) s - offsetof(struct string_private, public));
        free( sp->string );
        free( sp );
    }

Usage does not change ...

main.c:

    #include <stdlib.h> // just for NULL
    #include "string_class.h"

    // TEST PROGRAM ----------------------------------------------------------------
    int main() {
        string s = string_class.new( "Hello" );
        s->print( s );
        string_class.delete( s ); s = NULL;
        return 0;
    }
    //------------------------------------------------------------------------------
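The offsetof step that appears twice in string_class.c could be factored into one helper. What follows is only a sketch: the `CONTAINER_OF` macro and the `to_private` function are hypothetical names modelled on the well-known container_of idiom, and the private member is called `text` here purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given a pointer to a member, recover a pointer
   to the enclosing struct by backing up offsetof(type, member) bytes. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct string_public {
    void (*print)(struct string_public *);
};

struct string_private {
    char *text;                     /* private data, placed first on purpose */
    struct string_public public;    /* public part, at a non-zero offset */
};

/* Recover the private object from the public pointer handed to callers. */
struct string_private *to_private(struct string_public *p)
{
    assert(p != NULL);
    return CONTAINER_OF(p, struct string_private, public);
}
```

With such a helper, `print` and the destructor would each call `to_private(s)` instead of repeating the pointer arithmetic. The pointer passed in must really point at the `public` member of a `string_private` object, otherwise the result is meaningless.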
+1

Well, that might work, but it's not a very safe way to do things. In fact, you are simply trying to "hide" access to the object's private data by casting the structure. The data still exists; it is just semantically inaccessible. The problem with this approach is that you need to know exactly how the compiler arranges the bytes in the structure, or you will get different results from the cast. As far as I remember, this is not defined in the C specification (someone may correct me on this).

A better way would be to simply prefix the private properties with private_ or something similar. If you really want to limit the scope, create a static array of data inside the class's .c file and append a private data structure to it each time you create a new object. Essentially, you then keep the private data inside the C module and rely on C's scoping rules to give you private access protection, although this is really a lot of work for nothing.
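A minimal sketch of the static-array idea, under several assumptions that are not in the answer: a fixed-size pool, an integer handle as the public "object", and the names `string_create`, `string_get` and `string_destroy`. In a real project the first part would live in a header and the rest in the .c file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* --- what a hypothetical header would expose --- */
typedef int string_handle;            /* opaque handle; -1 means failure */

string_handle string_create(const char *text);
const char   *string_get(string_handle h);
void          string_destroy(string_handle h);

/* --- what stays inside the .c file --- */
#define MAX_STRINGS 16
#define MAX_LENGTH  64
static_assert(MAX_STRINGS > 0 && MAX_LENGTH > 1, "pool must be usable");

struct string_private {
    int  in_use;
    char data[MAX_LENGTH];
};

/* File scope + static: invisible to every other translation unit. */
static struct string_private pool[MAX_STRINGS];

string_handle string_create(const char *text)
{
    if (text == NULL || strlen(text) >= MAX_LENGTH)
        return -1;
    for (int i = 0; i < MAX_STRINGS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            strcpy(pool[i].data, text);   /* length checked above */
            return i;
        }
    }
    return -1;                            /* pool exhausted */
}

/* A handle is valid only if it indexes a slot that is currently in use. */
static int valid(string_handle h)
{
    return h >= 0 && h < MAX_STRINGS && pool[h].in_use;
}

const char *string_get(string_handle h)
{
    return valid(h) ? pool[h].data : NULL;
}

void string_destroy(string_handle h)
{
    if (valid(h))
        pool[h].in_use = 0;
}
```

Callers only ever see the handle, so there is no structure to cast and no layout to guess; the price is the fixed pool size and the extra lookup on every access.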

Also, your OO design is a bit confusing. Your string class is really a string factory object that creates string objects, and it would be clearer if you separated these two things.

+1

C does not guarantee that it will work, but in general it does. In particular, C explicitly leaves most aspects of the representation of struct values unspecified (C99 6.2.6.1), including whether the layout of your smaller struct matches the layout of the corresponding initial members of the larger struct.

If you want an approach that C guarantees will work, give your subclass a member of its superclass type (rather than a pointer to one). For instance,

    typedef struct string_private {
        struct string_public parent;
        char *string;
    } string_private;

This requires a different syntax to access the "inherited" members, but you can be absolutely sure that ...

    string_private *my_string;
    /* ... initialize my_string ... */
    function_with_string_parameter((string) my_string);

... works (provided you have typedef'd "string" as struct string_public *). Moreover, you can even avoid the cast:

    function_with_string_parameter(&my_string->parent);
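Put together, the parent-member approach might look like the sketch below. It is an illustration only: the accessor `get` replaces the question's `print` so the result can be checked, and `string_new`, `string_delete` and the member name `data` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct string_public {
    const char *(*get)(struct string_public *);
} *string;

typedef struct string_private {
    struct string_public parent;   /* must be the first member */
    char *data;
} string_private;

static const char *string_get(string s)
{
    /* Valid because `parent` is the first member of string_private. */
    string_private *sp = (string_private *)s;
    return sp->data;
}

string string_new(const char *text)
{
    assert(text != NULL);
    string_private *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->data = malloc(strlen(text) + 1);
    if (obj->data == NULL) {
        free(obj);
        return NULL;
    }
    strcpy(obj->data, text);
    obj->parent.get = string_get;
    return &obj->parent;           /* no cast needed */
}

void string_delete(string s)
{
    if (s == NULL)
        return;
    string_private *sp = (string_private *)s;
    free(sp->data);
    free(sp);
}
```

A caller would write `string s = string_new("Hello");`, read the text with `s->get(s)`, and release it with `string_delete(s)`. The cast back to `string_private *` is only safe for pointers that `string_new` returned.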

How useful any of this is, however, is an altogether different question. Object-oriented programming is not an end in itself. OO is a tool for organizing your code that has some notable advantages, but you can write in OO style without mimicking the specific syntax of any particular OO language.

+1

In most cases this works with an initial sequence of any length, since all known compilers give the common members of two structs the same padding. If they did not, they would have a hell of a time meeting this requirement of the C standard:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence, and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them.

I really can't imagine how a compiler would handle this if the "initial sequence" were laid out differently in the two structs.
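The union guarantee quoted above can be exercised directly. In this sketch the union `any_struct` and both function names are hypothetical; the two struct types follow the shape this answer uses (two shared `int` members, plus one extra in the larger struct). All accesses go through the union type, which is what the guarantee covers.

```c
#include <assert.h>

struct smaller_struct { int memb1; int memb2; };
struct larger_struct  { int memb1; int memb2; int additional_memb; };

/* Hypothetical union; its complete declaration is visible wherever the
   common members are inspected. */
union any_struct {
    struct smaller_struct small;
    struct larger_struct  large;
};

/* Reads memb1 through the smaller view, whichever struct is stored. */
int read_memb1(const union any_struct *u)
{
    return u->small.memb1;
}

/* Stores a larger_struct, then inspects its common part via the other member. */
int demo_common_initial_sequence(void)
{
    union any_struct u;
    u.large.memb1 = 1;
    u.large.memb2 = 2;
    u.large.additional_memb = 3;
    assert(u.small.memb2 == 2);    /* common initial sequence: allowed */
    return read_memb1(&u);
}
```

Only `memb1` and `memb2` form the common initial sequence here; `additional_memb` has no counterpart in the smaller struct and must be read through `large`.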

But there is one serious "but": strict aliasing must be disabled for this setup to work.

Strict aliasing is a rule that basically says that two pointers of incompatible types cannot refer to the same memory location. Therefore, if you cast a pointer to your larger struct to a pointer to the smaller one (or vice versa), read the value of a member in their initial sequence through one of them, then change that value through the other, and then read it again through the first pointer, it may appear unchanged. That is:

    struct smaller_struct {
        int memb1;
        int memb2;
    };

    struct larger_struct {
        int memb1;
        int memb2;
        int additional_memb;
    };

    /* ... */

    struct larger_struct l_struct, *p_l_struct;
    struct smaller_struct *p_s_struct;

    p_l_struct = &l_struct;
    p_s_struct = (struct smaller_struct *)p_l_struct;

    p_l_struct->memb1 = 1;
    printf("%d", p_l_struct->memb1); /* Outputs 1 */

    p_s_struct->memb1 = 2;
    printf("%d", p_l_struct->memb1); /* May output 1 with strict aliasing enabled; outputs 2 without it */

You see, a compiler that optimizes based on strict aliasing (for example, GCC at -O3) wants to make life easier for itself: it reasons that two pointers of incompatible types simply cannot refer to the same memory location, so it does not consider that they might. Thus, when p_s_struct->memb1 is written, it will assume that nothing has changed the value of p_l_struct->memb1 (which it knows to be 1), so it will not reload the actual value of memb1 and will simply output 1.

One way around this might be to declare your pointers as pointing to volatile data (which tells the compiler that the data may be changed from elsewhere without its knowledge), but the standard does not guarantee that this will work.

Note that all of the above applies to structs that the compiler does not pack in any special way.

+1

Casting from one struct to another is unreliable because the types are incompatible. What you can rely on is that if the first members of the parent struct appear at the top of the child struct, in the same order, then a reinterpreting pointer cast will let you do what you want. For instance:

    struct parent {
        int data;
        char *more_data;
    };

    struct child {
        int data;
        char *more_data;
        double even_more_data;
    };

    int main() {
        struct child c = {0};
        /* struct parent p1 = (struct parent) c; */  /* bad: a cast to a struct type does not compile */
        struct parent p2 = *(struct parent *) &c;    /* good */
    }

This is the same way Python implements object-oriented programming at the C level.

0

If I remember correctly, this kind of cast is undefined by the standard. But GCC and MS C guarantee that it works the way you expect.

So for example:

    #include <stdint.h>

    struct small_header {
        char ident[5];
        uint32_t header_size;
    };

    struct bigger_header {
        char ident[5];
        uint32_t header_size;
        uint32_t important_number;
    };

You can cast them back and forth and safely access the first two members. Of course, if you have a small one and cast it to the big one, accessing the important_number member is UB.

Edit:

This guy wrote a good article about it:

Type punning isn't funny: Using pointers to recast in C is bad.

0

Source: https://habr.com/ru/post/983607/

