Two things affect the time difference between nested and flattened arrays:
Caching and Indirect Access
- CPUs use a cache hierarchy to avoid accessing RAM too quickly. This takes advantage of the fact that most memory accesses either touch or have a certain temporal locality, that is, to what access was accessed recently, will be available again.
This means that if the innermost arrays of your nested array are quite large, you will notice slight differences in the flat array if you will iterate over the values in a contiguous manner. This means that when iterating over a range of values for flat arrays, your inner loop should iterate over sequential elements, for nested arrays, your innermost loop should iterate over the innermost array. - If, however, your access patterns are random, the most important timing difference is: For a flat array, you use something like
A[(Z * M + Y) * N + X] , so you do 4 arithmetic operations, and then access to memory.
For a nested array, you use A[Z][Y][X] , so there are actually three interdependent memory accesses: you need to know A[Z] before you can access A[Z][Y] and etc. Due to the superscalar architecture of modern processors, operations that can be performed in parallel are notably efficient, there are not so many interdependent operations. Thus, you have some arithmetic operations and memory loading on the one hand and three interdependent loads on the other hand, which is much slower. It is possible that for nested arrays, the contents of A , as well as A[Z] for some Z values can be found in the cache hierarchy, but if your nested array is large enough, it will never fit completely into the cache, resulting in several cache misses and memory loads (nested) instead of just cache miss and load (flat) for one random access to the array.
Also see his question , especially the shorter answers below for a more detailed discussion of caching (my answer) and indirectness (Peter's answer), which also gives an example where there are no noticeable differences between nested and flat arrays (after fixing the indexing error, of course;) )
So, if you want to know if there are significant differences in execution time between them, I would answer:
If you are doing random access, you will definitely notice a few hints, which will lead to longer execution time for nested arrays.
If you perform continuous access and use the correct order of loops (innermost loop = innermost array / innermost index for a planar array), and your innermost dimension of a multidimensional array is large enough, then the difference will be negligible, since the compiler will be able to move all indirect ones from the innermost cycle.
source share