The aligned clause in OpenMP 4?

I have a question about the new aligned clause in OpenMP 4, in the context of using it with #pragma omp simd aligned(a:n).

Say I have an array of integers that I allocated with posix_memalign, so I know the array starts on, let's say, a 32-byte boundary. Now say I want to square each value in this array. Can I write ...

    int* array = { some array of length len aligned to 32 bytes };
    #pragma omp simd aligned(array:32)
    for(int i = 0; i < len; i++)
        array[i] *= array[i];

Is this a safe assumption? Or does aligned also imply that the size of the data type I use in the array (int) is a multiple of 32 bytes? In the same way that __attribute__((aligned(32))) in gcc makes a type at least 32 bytes wide.

+5
2 answers

To make sure we understand each other, let's assume your array really is aligned on a 256-bit boundary (which is equivalent to 32-byte alignment).

Then yes, your #pragma omp simd aligned(array:32) is safe, regardless of the array's length or the size of its element type. The only thing that matters is the address held by the pointer used to refer to the array.
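To make that concrete, here is a minimal sketch in C. The function names and the alloc32 helper are made up for illustration, and the assert lines are only a debug-build check that the caller kept the promise; none of this comes from the question itself. The same 32-byte promise is used for int and for double, with lengths that are not multiples of the vector width.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* asks for the posix_memalign declaration */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `bytes` bytes starting on a 32-byte
   boundary, or return NULL on failure. Release with free(). */
void *alloc32(size_t bytes)
{
    void *p = NULL;
    if (posix_memalign(&p, 32, bytes) != 0)
        return NULL;
    return p;
}

/* Square each int in place. The clause only promises that `array`
   itself starts on a 32-byte boundary; `len` can be anything. */
void square_ints(int *array, int len)
{
    assert((uintptr_t)array % 32 == 0);   /* debug check of the promise */
    #pragma omp simd aligned(array:32)
    for (int i = 0; i < len; i++)
        array[i] *= array[i];
}

/* Same promise with an 8-byte element type: still just about the address. */
void square_doubles(double *array, int len)
{
    assert((uintptr_t)array % 32 == 0);   /* debug check of the promise */
    #pragma omp simd aligned(array:32)
    for (int i = 0; i < len; i++)
        array[i] *= array[i];
}
```

With gcc this would typically be built with -fopenmp or -fopenmp-simd; without either flag the pragma is ignored and the loops simply run as ordinary scalar code.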


EDIT: I realized that my answer, although correct, was a bit dry, since it was only me asserting it without any "official" support. So here are some excerpts from the standard to back it up:

From the OpenMP 4.0 standard, §2.8.1:

[C/C++: The aligned clause declares that the object to which each list item points is aligned to the number of bytes expressed in the optional parameter of the aligned clause.]

The optional parameter of the aligned clause, alignment, must be a constant positive integer expression. If no optional parameter is specified, implementation-defined default alignments for SIMD instructions on the target platforms are assumed.

[...]

[C: The type of list items appearing in the aligned clause must be array or pointer.]

[C++: The type of list items appearing in the aligned clause must be array, pointer, reference to array, or reference to pointer.]

As you can see, there is no assumption about the type of the data pointed to or referenced by the variable used in the aligned clause. The only assumption is that the address of the allocated memory segment is aligned to the number of bytes given by the optional parameter, or to some "implementation-defined default alignment" (which, BTW, strongly encourages me to always supply the optional parameter, since I have no idea what the implementation's default might be, and even less whether my array is actually aligned that way).
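Since the promise is about an address, it can be checked at run time. The following is only a sketch: is_aligned and sum_ints are hypothetical names, not anything defined by OpenMP. It shows that a pointer one int past an aligned base no longer satisfies a 32-byte promise, so the aligned clause is used only on the branch where the address really qualifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on an n-byte boundary. */
int is_aligned(const void *p, size_t n)
{
    assert(n > 0);                    /* avoid a modulo by zero */
    return (uintptr_t)p % n == 0;
}

/* Sum len ints, making the 32-byte promise only when it is true. */
long sum_ints(const int *a, int len)
{
    long s = 0;
    if (is_aligned(a, 32)) {
        /* Explicit alignment parameter: no reliance on the default. */
        #pragma omp simd aligned(a:32) reduction(+:s)
        for (int i = 0; i < len; i++)
            s += a[i];
    } else {
        /* No promise made; the compiler must cope with any address. */
        #pragma omp simd reduction(+:s)
        for (int i = 0; i < len; i++)
            s += a[i];
    }
    return s;
}
```

For example, if buf is 32-byte aligned, sum_ints(buf, n) takes the first branch, while sum_ints(buf + 1, n - 1) takes the second, because buf + 1 is only 4 bytes past the boundary.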

+3

aligned(ptr:n) tells the compiler that the array ptr starts at an address aligned to n bytes. This helps the compiler decide how to optimally vectorize the loop. Since many vector units require vector loads and stores to be aligned, if the compiler cannot deduce the alignment of the data at compile time, it must generate run-time code that checks the alignment and then executes the unaligned parts of the loop (such as those at the beginning and end of the iteration space) using scalar instructions. These checks are time consuming, especially with smaller array sizes. If the correct alignment is known at compile time, the compiler can directly emit the necessary scalar operations. On AVX-512 (Intel Xeon Phi), where these loads and stores are performed using masking, guaranteeing proper alignment allows the compiler to emit the masked instructions directly as needed rather than compute the masks at runtime.
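The run-time check described above can be imitated by hand. This is a rough sketch, not what any particular compiler emits: peel_count and square_any are hypothetical names, and the code assumes the pointer is at least aligned to its own element size. It processes the leading elements one by one until a 32-byte boundary is reached, then hands the rest to a loop that can make the aligned promise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of leading elements before p reaches an `align`-byte boundary.
   Assumes align is a multiple of elem_size and p is elem_size-aligned. */
size_t peel_count(const void *p, size_t elem_size, size_t align)
{
    assert(elem_size > 0 && align > 0 && align % elem_size == 0);
    size_t misalign = (uintptr_t)p % align;
    return misalign == 0 ? 0 : (align - misalign) / elem_size;
}

/* Square len ints starting at any int-aligned address. */
void square_any(int *array, size_t len)
{
    size_t head = peel_count(array, sizeof *array, 32);
    if (head > len)
        head = len;

    /* Scalar prologue: the unaligned start of the iteration space. */
    for (size_t i = 0; i < head; i++)
        array[i] *= array[i];
    if (head == len)
        return;                       /* nothing left for the vector loop */

    int *body = array + head;         /* 32-byte aligned from here on */
    size_t n = len - head;
    #pragma omp simd aligned(body:32)
    for (size_t i = 0; i < n; i++)
        body[i] *= body[i];
}
```

When the caller can state the alignment up front, as in the question, the peel_count computation and the prologue are exactly the work the compiler gets to skip.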

+3

Source: https://habr.com/ru/post/1233157/

