2D array indexing - undefined behavior?

Question


I recently came across some code snippets that perform dubious indexing operations on 2D arrays. As an example, consider the following code:

 int a[5][5];
 a[0][20] = 3;
 a[-2][15] = 4;
 a[5][-3] = 5;

Do the indexing operations above invoke undefined behavior?

+4

c arrays undefined-behavior multidimensional-array

dragosht Aug 05 '14 at 13:04


3 answers

Yes, this behavior is undefined.

-1

Morgan Wilde Aug 05 '14 at 13:05


Indexing an array with a negative index is undefined behavior, because the resulting pointer falls before the start of the array. Unfortunately, a[-3] is equivalent to *(a - 3), and most compilers accept it without a warning. C does allow a negative offset on a pointer that points into the middle of an array, as long as the result stays inside that array, but a negative index applied to the array itself never does. Nothing checks this at run time either.

In addition, there are some pitfalls to be aware of when declaring arrays as opposed to pointers. Only the first dimension may be left unspecified, and no other, as in:

 int a[][3][2]; /* array of unspecified size; as a function parameter this is an alias of int (*a)[3][2] */

(as a function parameter this really declares a pointer, not an array; just check sizeof a)

or

 int a[4][3][2]; /* an array of 24 integers, size 24 * sizeof(int) */

When you do this, the offset is computed differently for arrays than for pointers, so be careful. For an array declared as int a[I][J][K];

 &a[i][j][k]

corresponds to the byte address

 (char *)a + i*(sizeof(int)*J*K) + j*(sizeof(int)*K) + k*(sizeof(int))

but when you declare

 int ***a;

then a[i][j][k] is equivalent to:

*(*(*(a+i)+j)+k) , which means you must read the pointer a, add (sizeof(int **))*i to its value, load the pointer stored there, add (sizeof(int *))*j to that value, load again, and finally add (sizeof(int))*k to get the exact address of the data.
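As a rough illustration of the two addressing schemes described above, here is a minimal sketch in C. The names NI, NJ, NK (standing in for I, J, K), array_byte_offset, address_matches and load_via_pointers are hypothetical helpers made up for this example, and all indices are assumed to be in range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dimensions standing in for I, J, K in the text. */
enum { NI = 4, NJ = 3, NK = 2 };

/* Byte offset of a[i][j][k] from the start of a true array int a[NI][NJ][NK]. */
static size_t array_byte_offset(size_t i, size_t j, size_t k)
{
    return i * (sizeof(int) * NJ * NK) + j * (sizeof(int) * NK) + k * sizeof(int);
}

/* Compares the address the compiler computes with the formula above.
   Takes a pointer to the whole array so the byte arithmetic stays inside
   one object. Assumes i < NI, j < NJ, k < NK. */
static int address_matches(int (*a)[NI][NJ][NK], size_t i, size_t j, size_t k)
{
    return (char *)&(*a)[i][j][k] == (char *)a + array_byte_offset(i, j, k);
}

/* The int ***a case: each level is a separate load from memory. */
static int load_via_pointers(int ***a, size_t i, size_t j, size_t k)
{
    int **second = *(a + i);      /* first load: pointer to a row of int * */
    int *third = *(second + j);   /* second load: pointer to a row of int  */
    return *(third + k);          /* third load: the int itself            */
}
```

For the array, no memory is read to find the element, only arithmetic on the base address. For int ***a, two pointer values have to be fetched before the final int can be read.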

BR

-1

Luis Colorado Aug 6 '14 at 11:48


Drew McGowen · Accepted Answer · 2014-08-05 13:33

This behavior is undefined, and here's why.

An access to a multidimensional array can be broken down into a series of one-dimensional array accesses. In other words, the expression a[i][j] can be read as (a[i])[j] . Quoting C11 §6.5.2.1/2:

The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))) .

This means that the above is identical to *(*(a + i) + j) . C11 §6.5.6/8 says the following about adding an integer to a pointer:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

In other words, if a[i] is not a valid index, the behavior is undefined immediately, even if a[i][j] "intuitively" appears to land within the bounds of the array.
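To make the decomposition concrete, here is a small sketch in C. The helper names index_ok and get_expanded and the constants ROWS and COLS are made up for this illustration; the point is only that each subscript has to be valid on its own, not just the combined offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Dimensions of the array from the question: int a[5][5]. */
enum { ROWS = 5, COLS = 5 };

/* True only when each subscript is in range by itself.
   Checking i * COLS + j against ROWS * COLS would not be enough. */
static bool index_ok(long i, long j)
{
    return i >= 0 && i < ROWS && j >= 0 && j < COLS;
}

/* a[i][j] written out as the pointer arithmetic it is defined to be.
   Only call this with subscripts for which index_ok() is true. */
static int get_expanded(int (*a)[COLS], size_t i, size_t j)
{
    return *(*(a + i) + j);
}
```

With these helpers, index_ok rejects all three subscript pairs from the question, (0, 20), (-2, 15) and (5, -3), even though each of them would map to an offset inside the 25 ints if the array were treated as flat memory.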

So, in the first case, a[0] is valid, but the following [20] is not, because the type of a[0] is int[5] . The index 20 is therefore out of bounds.

In the second case, a[-2] is already out of bounds, so that is already UB.

In the last case, however, the pointer a + 5 that underlies a[5] points one past the last element of the array, which is valid according to §6.5.6/8:

... if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object ...

However, later in the same paragraph:

If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

So, although a + 5 is a valid pointer, dereferencing it is undefined behavior, and that dereference is exactly what the final subscript [-3] does (it is also out of bounds in its own right, and therefore UB).
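For contrast, here is a sketch of the one legitimate use of a one-past-the-end pointer: forming it and comparing against it, without ever dereferencing it. The function name sum_rows and the constant COLS are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Row length of the array from the question: int a[5][5]. */
enum { COLS = 5 };

/* Sums all elements of the first `rows` rows of a.
   `end` points one past the last row: it may be formed and compared,
   but it is never the operand of * or []. */
static int sum_rows(int (*a)[COLS], size_t rows)
{
    int (*end)[COLS] = a + rows;
    int total = 0;

    for (int (*row)[COLS] = a; row != end; ++row) {
        for (size_t j = 0; j < COLS; ++j) {
            total += (*row)[j];
        }
    }
    return total;
}
```

Calling sum_rows(a, 5) on the int a[5][5] from the question computes a + 5 as the loop sentinel, which is fine; writing a[5][-3] instead would dereference that same pointer.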
