It depends on what exactly you mean by typecast, but if you are looking for a narrowing operation, you can use _mm_packs_epi32 (PACKSSDW) to pack two int vectors into one short vector, with signed saturation:
    __m128i vint1, vint2; // 2 vectors of 4 x 32 bit ints
    __m128i vshort;       // 1 vector of 8 x 16 bit ints
    vshort = _mm_packs_epi32(vint1, vint2);
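The reverse, widening (unpacking) operation can be performed as follows. It recovers the original values exactly as long as they fit in 16 bits, since the pack saturates anything outside that range: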
    vint1 = _mm_srai_epi32(_mm_unpacklo_epi16(vshort, vshort), 16); // PUNPCKLWD+PSRAD
    vint2 = _mm_srai_epi32(_mm_unpackhi_epi16(vshort, vshort), 16); // PUNPCKHWD+PSRAD
Note that the SSE unpack instructions do not sign-extend, hence the need for an arithmetic shift when widening signed values.
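For what it's worth, here is a minimal sketch of the full round trip on plain arrays, assuming an x86 target with SSE2. The helper names narrow_i32_to_i16, widen_i16_to_i32 and roundtrip_demo are made up for illustration and are not part of any library. It also shows the saturation behaviour: values outside the 16-bit range do not survive the round trip.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: narrow 8 x int32 to 8 x int16 with signed saturation. */
static inline void narrow_i32_to_i16(const int32_t in[8], int16_t out[8])
{
    __m128i vint1 = _mm_loadu_si128((const __m128i *)&in[0]); /* elements 0..3 */
    __m128i vint2 = _mm_loadu_si128((const __m128i *)&in[4]); /* elements 4..7 */
    /* PACKSSDW: vint1 fills the low four shorts, vint2 the high four. */
    _mm_storeu_si128((__m128i *)out, _mm_packs_epi32(vint1, vint2));
}

/* Hypothetical helper: widen 8 x int16 to 8 x int32 with sign extension. */
static inline void widen_i16_to_i32(const int16_t in[8], int32_t out[8])
{
    __m128i vshort = _mm_loadu_si128((const __m128i *)in);
    /* Interleaving vshort with itself leaves each short in the upper half
       of a 32-bit lane; the arithmetic shift right by 16 then replicates
       its sign bit into the upper half. */
    __m128i vint1 = _mm_srai_epi32(_mm_unpacklo_epi16(vshort, vshort), 16);
    __m128i vint2 = _mm_srai_epi32(_mm_unpackhi_epi16(vshort, vshort), 16);
    _mm_storeu_si128((__m128i *)&out[0], vint1);
    _mm_storeu_si128((__m128i *)&out[4], vint2);
}

/* Small self-check: in-range values round-trip, out-of-range ones saturate. */
static inline void roundtrip_demo(void)
{
    const int32_t in[8] = {0, 1, -1, 32767, -32768, 100000, -100000, -12345};
    int16_t narrow[8];
    int32_t wide[8];

    narrow_i32_to_i16(in, narrow);
    widen_i16_to_i32(narrow, wide);

    assert(wide[2] == -1);      /* negative values are sign-extended */
    assert(wide[7] == -12345);
    assert(wide[5] == 32767);   /* 100000 saturated to INT16_MAX */
    assert(wide[6] == -32768);  /* -100000 saturated to INT16_MIN */
}
```

Calling roundtrip_demo() from main is enough to try it out. SSE2 is part of the x86-64 baseline, so no extra flags should be needed there; on 32-bit x86 with GCC or Clang you would typically add -msse2.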