Is it possible to vectorize multiplication in VC++ without SSE4?

I want to vectorize a multiplication operation. I tried using _mm_mul_epi32, but my processor only supports the MMX, SSE (1, 2, 3, SSSE3) and EM64T instruction sets.

Can someone tell me if I can try another function?

1 answer

It depends on the range of your operands. If they fit within 16 bits, there are several 16-bit × 16-bit multiply instructions that predate SSE4 (e.g. _mm_madd_epi16, _mm_mulhi_epi16, _mm_mullo_epi16, _mm_mulhrs_epi16; the first three are SSE2, and _mm_mulhrs_epi16 needs SSSE3).
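As a rough illustration of the 16-bit route, here is a minimal sketch that combines _mm_mullo_epi16 and _mm_mulhi_epi16 to get full 32-bit products from signed 16-bit inputs. The function name mul16_to_32 and the array-based interface are assumptions made for this example, not something from the question; only SSE2 intrinsics are used.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: multiplies eight signed 16-bit pairs and writes
// eight full 32-bit products. 'out' must have room for 8 values.
void mul16_to_32(const int16_t* a, const int16_t* b, int32_t* out) {
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));

    __m128i lo = _mm_mullo_epi16(va, vb);  // low 16 bits of each product
    __m128i hi = _mm_mulhi_epi16(va, vb);  // high 16 bits of each signed product

    // Interleave the low and high halves to rebuild the 32-bit products.
    __m128i p0123 = _mm_unpacklo_epi16(lo, hi);
    __m128i p4567 = _mm_unpackhi_epi16(lo, hi);

    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), p0123);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out + 4), p4567);
}
```

If only the low 16 bits of each product are needed, _mm_mullo_epi16 alone should be enough and the unpack step can be dropped.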

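If you need 32-bit operands and they are unsigned, you can use _mm_mul_epu32 (SSE2), which multiplies the two even-indexed 32-bit elements of each vector and returns two 64-bit products.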
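One common way to use _mm_mul_epu32 for a four-wide 32-bit multiply is to run it twice (once on the even elements, once on the odd ones) and keep the low 32 bits of each 64-bit product. The sketch below assumes that only the low 32 bits of each result are wanted; the name mul32_lo is hypothetical.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: element-wise product of four 32-bit values,
// keeping the low 32 bits of each result (SSE2 only).
void mul32_lo(const uint32_t* a, const uint32_t* b, uint32_t* out) {
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));

    // 64-bit products of elements 0 and 2.
    __m128i even = _mm_mul_epu32(va, vb);
    // Shift elements 1 and 3 into the even slots, then multiply them.
    __m128i odd = _mm_mul_epu32(_mm_srli_si128(va, 4), _mm_srli_si128(vb, 4));

    // Gather the low 32 bits of each product into the two lowest lanes.
    __m128i even_lo = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
    __m128i odd_lo  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));

    // Interleave back into the original element order: p0, p1, p2, p3.
    __m128i result = _mm_unpacklo_epi32(even_lo, odd_lo);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), result);
}
```

Because the low 32 bits of a product are the same for signed and unsigned inputs, this pattern is often used as a stand-in for the SSE4.1 _mm_mullo_epi32; the full 64-bit products, however, are only correct for unsigned operands.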

Alternatively, you can convert to float and use _mm_mul_ps (integer ↔ float conversion in SSE is quite efficient, and the cost can be justified if you get 4x SIMD throughput).
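A minimal sketch of the float route follows. It assumes the operands and their products are small enough to be represented exactly in single precision (magnitude below 2^24); larger values would be rounded. The name mul_via_float is hypothetical.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE float ops)
#include <cstdint>

// Hypothetical helper: multiplies four signed 32-bit integers via float.
// Exact only while inputs and products stay below 2^24 in magnitude.
void mul_via_float(const int32_t* a, const int32_t* b, int32_t* out) {
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));

    __m128 fa = _mm_cvtepi32_ps(va);   // int32 -> float
    __m128 fb = _mm_cvtepi32_ps(vb);
    __m128 prod = _mm_mul_ps(fa, fb);  // four float multiplies at once

    // float -> int32 with truncation toward zero.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_cvttps_epi32(prod));
}
```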


Source: https://habr.com/ru/post/1345463/

