I am currently working on video processing software in which the image data (8-bit, signed and unsigned) is stored in 16-byte-aligned int arrays, allocated as:
__declspec(align(16)) int *pData = (__declspec(align(16)) int *)_mm_malloc(width*height*sizeof(int),16);
In general, wouldn't reading and writing be faster if unsigned char arrays were used instead, like this?
__declspec(align(16)) unsigned char *pData = (__declspec(align(16)) unsigned char *)_mm_malloc(width*height*sizeof(unsigned char),16);
I don't know much about cache line sizes and data transfer optimization, but I do know that they matter here. In addition, SSE will be used in the future, and in that case char arrays, unlike int arrays, are already in a packed format. So which version would be faster?
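To put rough numbers on the comparison, here is a minimal sketch. It assumes 16-byte SSE registers and 64-byte cache lines (typical for x86, but an assumption here), and the helper names are hypothetical, not taken from the actual software:

```c
#include <stddef.h>

/* Assumed sizes: a 128-bit SSE register and a 64-byte cache line.
   Both are typical for x86 but are assumptions, not measured values. */
enum { SSE_REGISTER_BYTES = 16, CACHE_LINE_BYTES = 64 };

/* Bytes needed to store one frame when each pixel takes elem_size bytes. */
static size_t frame_bytes(size_t width, size_t height, size_t elem_size)
{
    return width * height * elem_size;
}

/* Number of pixels of elem_size bytes that fit into a block of block_bytes
   (for example one SSE register or one cache line). */
static size_t pixels_per_block(size_t block_bytes, size_t elem_size)
{
    return block_bytes / elem_size;
}
```

For example, `pixels_per_block(SSE_REGISTER_BYTES, sizeof(unsigned char))` gives 16 pixels per register, while a 4-byte int gives only 4. The same ratio applies to the frame size in memory and to the number of pixels per cache line.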
noira