The Standard imposes no requirements on what implementations must do when a program uses an out-of-bounds array index in one structure member to access another member. Out-of-bounds accesses are thus “illegal” in strictly conforming programs, and programs that rely on such accesses cannot be 100% portable and free of errors at the same time. On the other hand, many implementations do define the behavior of such code, and programs that target only those implementations may exploit that behavior.
There are three problems with such code:
1. While many implementations lay out structures in a predictable fashion, the Standard allows implementations to add arbitrary padding before any structure member other than the first. Code can use sizeof or offsetof to ensure that structure members are placed as expected, but the other two problems remain.
2. Given something like:
    if (structPtr->array1[x])
        structPtr->array2[y]++;
    return structPtr->array1[x];
   it would usually be useful for a compiler to assume that the use of structPtr->array1[x] will yield the same value as the preceding use in the if condition, even though that would change the behavior of code that relies upon aliasing between the two arrays.
3. If array1[] has, for example, 4 elements, a compiler given something like:
    if (x < 4)
        foo(x);
    structPtr->array1[x] = 1;
   might conclude that, since there would be no defined cases where x is not less than 4, it could call foo(x) unconditionally.
Unfortunately, while programs can use sizeof or offsetof to ensure that there are no surprises in structure layout, there is no way for them to test whether compilers promise to refrain from optimizations of types #2 or #3. Further, the Standard is a bit vague about what would be meant in a case like:
    struct foo { char array1[4], array2[4]; };
    int test(struct foo *p, int i, int x, int y, int z)
    {
        if (p->array2[x])
        {
            ((char*)p)[x]++;
            ((char*)(p->array1))[y]++;
            p->array1[z]++;
        }
        return p->array2[x];
    }
The Standard is pretty clear that the behavior would be defined only if z is in the range 0..3, but since the type of p->array1 in that expression is char* (due to decay), it is not clear that the cast in the access using y would have any effect. On the other hand, since converting a pointer to the first element of a struct to char* should yield the same result as converting a pointer to the struct to char*, and the converted struct pointer must be usable to access all the bytes of the struct, it would seem that the access using x should be defined for (at least) x = 0..7 [if the offset of array2 is greater than 4, that would affect the value of x needed to hit members of array2, but some value of x could do so with defined behavior].
IMHO, a good remedy would be to define the subscript operator on array types in a way that does not involve pointer decay. In that case, the expressions p->array1[x] and &(p->array1[x]) could invite a compiler to assume that x is in 0..3, but p->array1+x and *(p->array1+x) would require a compiler to allow for the possibility of other values. I don't know whether any compilers do that, but the Standard doesn't require it.
supercat Nov 04 '17 at 16:23