I have code written using MSVC SSE intrinsics.
    __m128 zero = _mm_setzero_ps();
    __m128 center = _mm_load_ps(&sphere.origin.x);
    __m128 boxmin = _mm_load_ps(&rhs.BottomLeftClosest.x);
    __m128 boxmax = _mm_load_ps(&rhs.TopRightFurthest.x);
    __m128 e = _mm_add_ps(_mm_max_ps(_mm_sub_ps(boxmin, center), zero),
                          _mm_max_ps(_mm_sub_ps(center, boxmax), zero));
    e = _mm_mul_ps(e, e);
    __declspec(align(16)) float arr[4];
    _mm_store_ps(arr, e);
    float r = sphere.radius;
    return (arr[0] + arr[1] + arr[2] <= r * r);
The type Math::Vector (which is the type of sphere.origin, rhs.BottomLeftClosest and rhs.TopRightFurthest) is actually an array of 3 floats. I have aligned them to 16 bytes, and this code works fine on x64. But on x86, I get an access violation reading a null pointer. Any tips on where this comes from?
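For reference, here is a minimal, self-contained sketch of the same test that could help narrow the problem down. It rests on two facts: _mm_load_ps requires a 16-byte-aligned address, and it always reads four floats, so a 3-float vector is read one float past its end. The types Vec3, Sphere and Box and the helpers is_aligned16 and intersects are hypothetical stand-ins, not the real Math::Vector code. The sketch pads the vector to four floats and uses the unaligned load _mm_loadu_ps, so it does not depend on where the objects were allocated. This is only a guess at the cause, not a confirmed diagnosis: the default heap alignment on 32-bit Windows is 8 bytes, versus 16 on x64, so heap-allocated objects may not get the requested alignment on x86.

```cpp
#include <xmmintrin.h>  // SSE intrinsics
#include <cstdint>      // std::uintptr_t

// Hypothetical stand-in for Math::Vector: three floats plus one padding
// float, so a 4-float load never reads past the end of the object.
struct alignas(16) Vec3 {
    float x, y, z, pad;
};
static_assert(sizeof(Vec3) == 16, "Vec3 should occupy exactly one SSE register");

// Hypothetical stand-ins for the sphere and box types in the question.
struct Sphere { Vec3 origin; float radius; };
struct Box    { Vec3 BottomLeftClosest; Vec3 TopRightFurthest; };

// Diagnostic helper: true when p lies on a 16-byte boundary,
// which is what _mm_load_ps requires.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Sphere vs. axis-aligned box test, same arithmetic as the original,
// but with unaligned loads and stores that accept any address.
inline bool intersects(const Sphere& sphere, const Box& rhs) {
    __m128 zero   = _mm_setzero_ps();
    __m128 center = _mm_loadu_ps(&sphere.origin.x);
    __m128 boxmin = _mm_loadu_ps(&rhs.BottomLeftClosest.x);
    __m128 boxmax = _mm_loadu_ps(&rhs.TopRightFurthest.x);

    // Per-axis distance from the sphere center to the box (0 when inside).
    __m128 e = _mm_add_ps(_mm_max_ps(_mm_sub_ps(boxmin, center), zero),
                          _mm_max_ps(_mm_sub_ps(center, boxmax), zero));
    e = _mm_mul_ps(e, e);

    float arr[4];
    _mm_storeu_ps(arr, e);

    // Only the x, y and z lanes are summed; the padding lane is ignored.
    float r = sphere.radius;
    return arr[0] + arr[1] + arr[2] <= r * r;
}
```

Calling is_aligned16(&sphere.origin.x) just before the original _mm_load_ps calls would show whether a misaligned address is actually involved on x86.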
Puppy