Compare buffers as fast as possible

I need to compare two buffers for equality. I do not need information on the ratio of two buffers, only if each of the two pieces is equal or not. My Intel machine supports up to SSE4.2

Naive approach:

const size_t CHUNK_SIZE = 16; //128bit for SSE2 integer registers const int ARRAY_SIZE = 200000000; char* array_1 = (char*)_aligned_malloc(ARRAY_SIZE, 16); char* array_2 = (char*)_aligned_malloc(ARRAY_SIZE, 16); for (size_t i = 0; i < ARRAY_SIZE; ) { volatile bool result = memcmp(array_1+i, array_2+i, CHUNK_SIZE); i += CHUNK_SIZE; } 

Compared to my first attempt to use SSE:

 union U { __m128i m; volatile int i[4]; } res; for (size_t i = 0; i < ARRAY_SIZE; ) { __m128i* pa1 = (__m128i*)(array_1+i); __m128i* pa2 = (__m128i*)(array_2+i); res.m = _mm_cmpeq_epi32(*pa1, *pa2); volatile bool result = ( (res.i[0]==0) || (res.i[1]==0) || (res.i[2]==0) || (res.i[3]==0) ); i += CHUNK_SIZE; } 

The gain in speed is about 33%. Can i do better?

+6
source share
2 answers

You really shouldn't use scalar code and unions to test all the individual elements of a vector - do something like this:

 for (size_t i = 0; i < ARRAY_SIZE; i += CHUNK_SIZE) { const __m128i a1 = _mm_load_si128(array_1 + i); const __m128i a2 = _mm_load_si128(array_2 + i); const __m128i vcmp = _mm_cmpeq_epi32(a1, a2); const int vmask = _mm_movemask_epi8(vcmp); const bool result = (vmask == 0xffff); // you probably want to break here if you get a mismatch ??? } 
+4
source

Since you can use SSE 4.1, there is another alternative that could be faster:

 for (size_t i = 0; i < ARRAY_SIZE; i += CHUNK_SIZE;) { __m128i* pa1 = (__m128i*)(array_1+i); __m128i* pa2 = (__m128i*)(array_2+i); __m128i temp = _mm_xor_si128(*pa1, *pa2); bool result = (bool)_mm_testz_si128(temp, temp); } 

_mm_testz_si128(a, b) returns 0 if a & b != 0 and returns 1 if a & b == 0 . The advantage is that you can use this version with the new AVX instructions, where the block size is 32 bytes.

+2
source

Source: https://habr.com/ru/post/889093/


All Articles