SIMD and dynamic memory allocation

Possible duplicate:
SSE, intrinsics, and alignment

I am new to SIMD programming, so please excuse me if I ask an obvious question.

I experimented a bit and got to the point where I want to store a SIMD value in a dynamically allocated structure.

Here is the code:

    struct SimdTest
    {
        __m128 m_simdVal;

        void setZero()
        {
            __m128 tmp = _mm_setzero_ps();
            m_simdVal = tmp; // <<--- CRASH ---
        }
    };

    TEST( Plane, dynamicallyAllocatedPlane )
    {
        SimdTest* test = new SimdTest();
        test->setZero();
        delete test;
    }

When the line marked with the CRASH comment is executed, the code crashes with the following exception:

 Unhandled exception at 0x775315de in test-core.exe: 0xC0000005: Access violation reading location 0x00000000 

Can someone explain why the assignment crashes, and how objects containing SIMD values should be allocated dynamically so that they work properly?

I should add that if I create an instance of SimdTest statically and call the setZero method, everything works fine.

Thanks, Paksas

+4
2 answers

It crashes because the structure is misaligned. The CRT allocator only promises 8-byte alignment, while __m128 requires 16. You will need to use _aligned_malloc() on MSVC to get properly aligned memory from the heap.
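To confirm this diagnosis, one can check the address returned by new. The following is a minimal sketch; isAligned is a hypothetical helper name, not part of the CRT or of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when `ptr` is a multiple of
// `alignment` bytes. `alignment` must be a power of two
// (16 is what __m128 needs).
inline bool isAligned(const void* ptr, std::size_t alignment = 16) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

Calling isAligned(test) right after new SimdTest() can return false on a 32-bit MSVC build, since the allocator guarantees only 8 bytes there. It may also return true by chance, which would explain a crash that shows up only on some runs.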

There are two ways to do this. Since this is a POD struct, you can simply cast:

    #include <malloc.h>
    ...
    SimdTest* test = (SimdTest*)_aligned_malloc(sizeof(SimdTest), 16);
    test->setZero();
    _aligned_free(test);

Or you can overload the new and delete operators for the struct:

    struct SimdTest
    {
        void* operator new(size_t size) { return _aligned_malloc(size, 16); }
        void operator delete(void* mem) { _aligned_free(mem); }
        // etc..
    };
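For code that also has to build outside MSVC, the same idea can be sketched as follows. The names alignedAlloc16, alignedFree16 and AlignedVec4 are hypothetical, the float array is only a stand-in for the __m128 member, and posix_memalign is assumed to be available on non-Windows platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#if defined(_WIN32)
#include <malloc.h>  // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: allocate `size` bytes on a 16-byte boundary.
inline void* alignedAlloc16(std::size_t size) {
#if defined(_WIN32)
    return _aligned_malloc(size, 16);
#else
    void* mem = nullptr;
    return posix_memalign(&mem, 16, size) == 0 ? mem : nullptr;
#endif
}

// Hypothetical helper: release memory obtained from alignedAlloc16.
inline void alignedFree16(void* mem) {
#if defined(_WIN32)
    _aligned_free(mem);
#else
    std::free(mem);
#endif
}

struct AlignedVec4 {
    alignas(16) float lanes[4];  // stand-in for an __m128 member

    // Class-specific operator new: every `new AlignedVec4` is 16-byte aligned.
    static void* operator new(std::size_t size) {
        void* mem = alignedAlloc16(size);
        if (!mem) throw std::bad_alloc();
        assert(reinterpret_cast<std::uintptr_t>(mem) % 16 == 0);
        return mem;
    }

    // Must pair with the allocator used above.
    static void operator delete(void* mem) noexcept { alignedFree16(mem); }
};
```

Unlike the shorter version above, this sketch throws std::bad_alloc when allocation fails instead of returning a null pointer. It covers only single objects; operator new[] and operator delete[] would need the same treatment for arrays.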
+5

MSDN states that _m128 is automatically aligned to 16 bytes (_m128, not __m128). Either way, I think the other answer is right: as I recall, there are two kinds of move instructions, movaps for aligned data and movups for unaligned data. The first requires 16-byte alignment, while the second does not. I don't know whether the compiler can use both, but I would try the _m128 type.

In fact, there is a special type for this: _M128A.
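The aligned/unaligned distinction is also exposed through intrinsics: _mm_load_ps and _mm_store_ps expect 16-byte-aligned addresses, while _mm_loadu_ps and _mm_storeu_ps accept any address. The following is a minimal sketch for an x86 target with SSE; the function names are hypothetical.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_storeu_ps

// Zeroes four floats at `dst`, which may have any alignment.
inline void setZeroUnaligned(float* dst) {
    assert(dst != nullptr);
    _mm_storeu_ps(dst, _mm_setzero_ps());  // unaligned store
}

// Adds four floats from `a` and `b` into `out`.
// None of the pointers needs to be 16-byte aligned.
inline void add4Unaligned(const float* a, const float* b, float* out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    __m128 va = _mm_loadu_ps(a);  // unaligned load
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}
```

This only helps when the data is stored as plain floats. A struct member declared as __m128 is still assumed by the compiler to be 16-byte aligned, so the allocation itself has to be aligned as described in the other answer.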

-1

Source: https://habr.com/ru/post/1437635/