Two related questions.
My code needs to do the following with a fairly large amount of data. It happens inside inner loops, so performance is important.
- Convert an __int32 array to double (or convert an __m128i to two __m128d).
- Convert an array of floats to double (or convert an __m128 to two __m128d).
Basically, I need functions with the following signatures:
void convert_int_to_double(__int32 const * input, double * output);
void convert_float_to_double(float const * input, double * output);
The input and output pointers are aligned, and the number of elements is a multiple of 4. The main problem is how to quickly unzip __m128 into two __m128d.
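For reference, here is a minimal sketch of one possible approach, assuming SSE2 is available. It is not a benchmarked or definitive solution. The conversion intrinsics `_mm_cvtepi32_pd` and `_mm_cvtps_pd` only read the low two lanes of their argument, so the sketch first moves the high two lanes down with `_mm_shuffle_epi32` (integers) or `_mm_movehl_ps` (floats) and then converts a second time. The `count` parameter is hypothetical: the signatures above do not carry a length, so it is added here only to make the example self-contained. `int32_t` stands in for `__int32`.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE header) */
#include <stddef.h>
#include <stdint.h>

/* Sketch only. Assumes input and output are 16-byte aligned and that
   count (a hypothetical parameter, not in the original signatures)
   is a multiple of 4. */
void convert_int_to_double(const int32_t *input, double *output, size_t count)
{
    for (size_t i = 0; i < count; i += 4) {
        __m128i v  = _mm_load_si128((const __m128i *)(input + i));
        /* Swap the 64-bit halves so lanes 2 and 3 land in lanes 0 and 1. */
        __m128i hi = _mm_shuffle_epi32(v, _MM_SHUFFLE(1, 0, 3, 2));
        _mm_store_pd(output + i,     _mm_cvtepi32_pd(v));   /* lanes 0, 1 */
        _mm_store_pd(output + i + 2, _mm_cvtepi32_pd(hi));  /* lanes 2, 3 */
    }
}

/* Same assumptions as above: aligned pointers, count a multiple of 4. */
void convert_float_to_double(const float *input, double *output, size_t count)
{
    for (size_t i = 0; i < count; i += 4) {
        __m128 v  = _mm_load_ps(input + i);
        /* Copy the high two floats of v into the low two lanes. */
        __m128 hi = _mm_movehl_ps(v, v);
        _mm_store_pd(output + i,     _mm_cvtps_pd(v));   /* lanes 0, 1 */
        _mm_store_pd(output + i + 2, _mm_cvtps_pd(hi));  /* lanes 2, 3 */
    }
}
```

Both conversions are exact, since every 32-bit integer and every float is representable as a double. Whether this is the fastest way to split the register on a given CPU is something I would still need to measure.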