Store __m256 vector sum without AVX-SSE transition penalty?

Does the following code cause a transition from AVX to SSE? If so, how can I store the sum of a __m256 vector without incurring this penalty?

    __m256 x_swap = _mm256_permute2f128_ps(x, x, 1);
    x = _mm256_add_ps(x, x_swap);
    x = _mm256_hadd_ps(x, x);
    x = _mm256_hadd_ps(x, x);
    // now all fields of x contain the sum
    float sum;
    _mm_store_ss(&sum, _mm256_castps256_ps128(x));

Thanks.

1 answer

As long as you compile your code with -mavx, you should not see any AVX-SSE transition penalties. With -mavx the compiler automatically uses the new VEX-encoded, non-destructive forms of the SSE instructions, and there is no penalty for mixing these with AVX instructions. The penalties only occur when legacy (non-VEX) SSE instructions are mixed with AVX, which normally happens only in assembly code or when linking modules that were compiled with different flags.
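
To illustrate, here is a minimal sketch of the question's reduction wrapped in a function. The name sum8 and the pointer parameter are hypothetical, chosen only for this example. It assumes GCC or Clang: the target("avx") attribute is a compiler extension that enables AVX code generation for this one function, in the same way that -mavx does for a whole file, so the 128-bit intrinsic at the end should also be compiled to its VEX-encoded form.

```c
#include <immintrin.h>

/* Hypothetical helper: returns the sum of the 8 floats at p.
   Assumption: GCC or Clang. The target attribute enables AVX for this
   function only; compiling the whole file with -mavx works as well. */
__attribute__((target("avx")))
float sum8(const float *p)
{
    __m256 x = _mm256_loadu_ps(p);                  /* unaligned load of 8 floats */
    __m256 x_swap = _mm256_permute2f128_ps(x, x, 1); /* swap the two 128-bit lanes */
    x = _mm256_add_ps(x, x_swap);                   /* each lane now holds lo + hi */
    x = _mm256_hadd_ps(x, x);                       /* pairwise sums within each lane */
    x = _mm256_hadd_ps(x, x);                       /* every element now holds the total */
    /* Take the lowest element. The cast itself generates no instruction. */
    return _mm_cvtss_f32(_mm256_castps256_ps128(x));
}
```

Calling sum8 on an array holding 1.0f through 8.0f should return 36.0f. To check which encoding was generated, look at the compiler's assembly output (for example with -S): VEX-encoded instructions carry a v prefix, such as vaddps and vhaddps.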


Source: https://habr.com/ru/post/1501453/

