Segmentation fault using OpenMP and SSE

I'm just starting to experiment with adding OpenMP to some SSE code.

My first test program SOMETIMES crashes in _mm_set_ps, but it works when I change the clause to if (0).

It looks so simple that I must be missing something obvious. I am compiling with gcc -fopenmp -g -march=core2 -pthreads

    #include <stdio.h>
    #include <stdlib.h>
    #include <immintrin.h>

    int main()
    {
        #pragma omp parallel if (1)
        {
            #pragma omp sections
            {
                #pragma omp section
                {
                    __m128 x1 = _mm_set_ps ( 1.1f, 2.1f, 3.1f, 4.1f );
                }
                #pragma omp section
                {
                    __m128 x2 = _mm_set_ps ( 1.2f, 2.2f, 3.2f, 4.2f );
                }
            } // end omp sections
        } // end omp parallel
        return 0;
    }
2 answers

This is a bug in the OpenMP implementation. I had the same issue with gcc on Windows (MinGW). The -mstackrealign command-line option resolved it: it adds instructions to the prologue of each function that align the stack to a 16-byte boundary. I did not notice any performance penalty. You can also add __attribute__ ((force_align_arg_pointer)) to a function declaration, which should do the same, but only for that specific function. You may need to put the SSE code in a separate function, which you then call from the function containing the #pragma omp, so that the stack has a chance to be realigned.
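As a rough sketch of the "separate function" approach described above (not tested against the MinGW build in question): the helper name fill_vec, the wrapper run_sections, and the output arrays are hypothetical and only there so the result can be inspected. The noinline attribute is my own addition, on the assumption that inlining the helper into the OpenMP region would defeat the realignment.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_set_ps, _mm_storeu_ps */

/* Hypothetical helper: force_align_arg_pointer makes GCC realign the stack
   on entry, so any __m128 temporaries spilled to the stack are 16-byte
   aligned. noinline (an assumption, see above) keeps it a real call. */
__attribute__((force_align_arg_pointer, noinline))
void fill_vec(float out[4], float a, float b, float c, float d)
{
    __m128 x = _mm_set_ps(a, b, c, d);  /* a is the highest lane, d the lowest */
    _mm_storeu_ps(out, x);              /* unaligned store: out[0] == d, out[3] == a */
}

/* The two sections from the question, each calling the realigning helper
   instead of using SSE intrinsics directly inside the OpenMP region. */
void run_sections(float r1[4], float r2[4])
{
    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            { fill_vec(r1, 1.1f, 2.1f, 3.1f, 4.1f); }
            #pragma omp section
            { fill_vec(r2, 1.2f, 2.2f, 3.2f, 4.2f); }
        }
    }
}
```

Calling run_sections(r1, r2) from main with two float[4] arrays and compiling with gcc -fopenmp should exercise the same pattern as the question. If -mstackrealign is used instead, the attribute should not be needed.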

I had a problem when I switched to compilation for a 64-bit target (MinGW64, for example


I suspect an unaligned memory access. It's the only way code like this could blow up (assuming it is the only code there). XMM registers will not be used for this, but rather stack memory, which is only 4-byte aligned; my guess is that the OpenMP code is messing up the stack alignment.


Source: https://habr.com/ru/post/892890/

