Dereferencing type-punned pointer will break strict-aliasing rules

I used the following code snippet to read data from files as part of a larger program.

 double data_read(FILE *stream, int code) {
     char data[8];
     switch (code) {
     case 0x08:
         return (unsigned char)fgetc(stream);
     case 0x09:
         return (signed char)fgetc(stream);
     case 0x0b:
         data[1] = fgetc(stream);
         data[0] = fgetc(stream);
         return *(short*)data;
     case 0x0c:
         for (int i = 3; i >= 0; i--) data[i] = fgetc(stream);
         return *(int*)data;
     case 0x0d:
         for (int i = 3; i >= 0; i--) data[i] = fgetc(stream);
         return *(float*)data;
     case 0x0e:
         for (int i = 7; i >= 0; i--) data[i] = fgetc(stream);
         return *(double*)data;
     }
     die("data read failed");
     return 1;
 }

Now I have been told to use -O2, and I get the following gcc warning: dereferencing type-punned pointer will break strict-aliasing rules

Googling, I found two contradictory answers:

vs.

  • So basically, if you have an int* and a float*, they are not allowed to point to the same memory location. If your code does not respect this, the compiler's optimizer will most likely break your code.

In the end, I do not want to ignore the warnings. What would you suggest?

[update] I replaced the toy example with a real function.

+45
optimization c gcc pointers strict-aliasing
Jul 14 '10 at 12:48
7 answers

It looks like you really want to use fread:

 int data; fread(&data, sizeof(data), 1, stream); 

However, if you want to go the route of reading chars and then reinterpreting them as an int, the safe way to do it in C (but not in C++) is to use a union:

 union { char theChars[4]; int theInt; } myunion; for(int i=0; i<4; i++) myunion.theChars[i] = fgetc(stream); return myunion.theInt; 

I'm not sure why the length of data in your source code is 3. I assume you need 4 bytes; at least I don't know any systems where int is 3 bytes.

Please note that both your code and mine are very non-portable.

Edit: If you want to read ints of various lengths from a file portably, try something like this:

 unsigned result=0; for(int i=0; i<4; i++) result = (result << 8) | fgetc(stream); 

(Note: in a real program, you would also want to check the return value of fgetc() for EOF.)

This reads a 4-byte unsigned value from a big-endian file (the first byte read ends up as the most significant one), regardless of the endianness of the system. It should work on just about any system where unsigned is at least 4 bytes.

If you want to be endian-neutral, don't use pointers or unions; use bit shifts instead.
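As a minimal sketch of the bit-shift approach with the EOF check added: the helper name `read_u32_be`, the use of `uint32_t`, and the 0 / -1 return convention are assumptions for illustration, not part of the original code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read a 32-bit unsigned value stored most
   significant byte first. Returns 0 on success, -1 on EOF or read error.
   Works the same on little-endian and big-endian hosts because it only
   uses shifts, never a pointer cast. */
int read_u32_be(FILE *stream, uint32_t *out)
{
    uint32_t result = 0;
    for (int i = 0; i < 4; i++) {
        int c = fgetc(stream);          /* int, so EOF is distinguishable */
        if (c == EOF)
            return -1;
        result = (result << 8) | (uint32_t)(unsigned char)c;
    }
    *out = result;
    return 0;
}
```

For a file that stores the least significant byte first, the loop would instead OR each byte in at position `8 * i`.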

+26
Jul 14 '10 at 13:01

The problem occurs because you access a char array through a double*:

 char data[8]; ... return *(double*)data; 

But gcc assumes that your program will never access a variable through a pointer of a different type. This assumption is called strict aliasing, and it allows the compiler to make some optimizations:

If the compiler knows that your *(double*) can in no way overlap with data[], it is allowed to do all sorts of things, such as reordering your code into:

 return *(double*)data; for(int i=7;i>=0;i--) data[i] = fgetc(stream); 

The loop is then most likely optimized away, and you end up with just:

 return *(double*)data; 

Which leaves your data[] uninitialized. In this particular case the compiler might be able to see that your pointers overlap, but if you had declared it as char* data, it could have produced bugs.

But the strict-aliasing rule also says that a char* and a void* can point at any type. So you can rewrite it as:

 double data; ... *(((char*)&data) + i) = fgetc(stream); ... return data; 

Strict-aliasing warnings are really important to understand or fix. They cause the kind of bugs that are impossible to reproduce in-house, because they occur only with one particular compiler on one particular operating system on one particular machine, and only on a full moon and once a year, etc.
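A fuller sketch of the char-pointer rewrite above, for illustration only. The helper name `read_double_via_char` and the return convention are hypothetical, and the byte order mirrors the question's loop: it assumes a little-endian host with an 8-byte IEEE 754 double and a file that stores the most significant byte first.

```c
#include <stdio.h>

/* Hypothetical helper. Fills the bytes of a double through an
   unsigned char pointer, which is allowed to alias any object.
   Assumption: little-endian host, file stores the most significant
   byte first, so bytes are written from the highest index down. */
int read_double_via_char(FILE *stream, double *out)
{
    double value = 0.0;
    unsigned char *bytes = (unsigned char *)&value;

    for (int i = (int)sizeof value - 1; i >= 0; i--) {
        int c = fgetc(stream);
        if (c == EOF)
            return -1;                  /* short read or I/O error */
        bytes[i] = (unsigned char)c;
    }
    *out = value;
    return 0;
}
```

The object being filled is a real double, so reading it back as a double is not a type-punned access; only the byte-wise writes go through the character pointer.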

+39
Oct 12

Using a union is not the right thing to do here. Reading from a union member other than the one last written is undefined, which means the compiler is free to perform optimizations that will break your code (such as optimizing away the write).

+7
Dec 22 2018-10-22

This document summarizes the situation: http://dbp-consulting.com/tutorials/StrictAliasing.html

There are several different solutions, but the most portable and safest one is to use memcpy(). (The function call may be optimized away, so it is not as inefficient as it looks.) For example, replace this:

 return *(short*)data; 

With this:

 short temp; memcpy(&temp, data, sizeof(temp)); return temp; 
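The same idea applied to the 8-byte case from the question might look like the sketch below. The helper name `read_f64_be` and the return convention are hypothetical, and it assumes that double is a 64-bit IEEE 754 type sharing the host byte order of `uint64_t`, and that the file stores the most significant byte first.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE 754 type on this platform. */
_Static_assert(sizeof(double) == sizeof(uint64_t),
               "double must be 64 bits for this sketch");

/* Hypothetical helper: assemble the bit pattern with shifts (no
   dependence on host endianness), then copy it into a double with
   memcpy instead of casting a pointer. Returns 0 on success, -1 on EOF. */
int read_f64_be(FILE *stream, double *out)
{
    uint64_t bits = 0;
    for (int i = 0; i < 8; i++) {
        int c = fgetc(stream);
        if (c == EOF)
            return -1;
        bits = (bits << 8) | (uint64_t)(unsigned char)c;
    }
    memcpy(out, &bits, sizeof *out);    /* well-defined reinterpretation */
    return 0;
}
```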
+6
Apr 28 '16 at 16:42

Basically, you can read gcc's message as: you are looking for trouble, so don't say I didn't warn you.

Casting a three-byte character array to an int is one of the worst things I have ever seen. Normally your int has at least 4 bytes, so for the fourth byte (and possibly more if int is wider) you get random data. And then you cast all of that to a double.

Just don't do it. The aliasing problem gcc warns about is harmless compared to what you are doing.

+2
Jul 14 '10 at 14:11

The authors of the C Standard wanted to let compiler writers generate efficient code in circumstances where it would be theoretically possible, but unlikely, that a global variable might have its value accessed through a seemingly unrelated pointer. The idea was not to forbid type punning by casting and dereferencing a pointer in a single expression, but rather to say that, given something like:

 int x; int foo(double *d) { x++; *d=1234; return x; } 

a compiler would be entitled to assume that the write to *d won't affect x. The authors of the Standard wanted to list the situations in which a function like the one above, receiving a pointer from an unknown source, would have to assume that it might alias a seemingly unrelated global, without requiring that the types match perfectly. Unfortunately, while the rationale strongly suggests that the authors intended to describe a standard of minimum conformance for cases where a compiler would otherwise have no reason to believe that things might alias, the rule fails to require that compilers recognize aliasing in cases where it is obvious. The authors of gcc have decided that they would rather generate the smallest program they can while conforming to the poorly written language of the Standard than generate code that is actually useful. Instead of recognizing aliasing in cases where it is obvious (while still being able to assume that things that don't look like they alias, won't), they would rather require that programmers use memcpy, which forces the compiler to allow for the possibility that pointers of unknown origin might alias just about anything, and thereby impedes optimization.

0
Apr 13 '16 at 21:04

Apparently, the standard allows sizeof(char*) to differ from sizeof(int*), so gcc complains when you try a direct cast. void* is a little special in that everything can be converted back and forth to and from void*. In practice I don't know of many architectures or compilers where a pointer is not the same size for all types, but gcc is right to emit a warning, even if it is annoying.

I think a safe way would be

 int i, *p = &i; char *q = (char*)&p[0]; 

or

 char *q = (char*)(void*)p; 

You can also try this and see what you get:

 char *q = reinterpret_cast<char*>(p); 
-4
Aug 16 '10 at 8:25


