Cache coherent memory access in tight physics and collision loops

I am writing a physics engine and am having a hard time figuring out a good way to design my data storage.

Functionality I want:

  • A class representing a physics body
  • A class representing a collision volume (say, a box)
  • Each physics body can have a collision volume attached to it.
  • It should be possible to have a physics body without a collision volume.
  • Optionally: a CollisionVolume without a physics body (think trigger volumes).

Right now I basically have two loops. One updates the physics bodies in the simulation: their position / velocity / rotation. The second loop performs collision detection on all collision volumes. It is just a nested loop that checks every pair of collision volumes for a collision. (I know this can be done better, but that is a separate issue.)
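
For reference, a minimal sketch of what those two loops look like; the PhysicsBody / CollisionVolume members, the bounding-sphere data and the intersects() test are hypothetical placeholders, not my actual types:

 #include <cstddef>
 #include <vector>

 struct Vec3 { float x = 0, y = 0, z = 0; };

 struct PhysicsBody {
     Vec3 position, velocity;   // rotation omitted for brevity
 };

 struct CollisionVolume {
     Vec3  center;              // hypothetical bounding-sphere data
     float radius = 1.0f;
 };

 // Hypothetical pair test: bounding-sphere overlap.
 bool intersects(const CollisionVolume& a, const CollisionVolume& b) {
     const float dx = a.center.x - b.center.x;
     const float dy = a.center.y - b.center.y;
     const float dz = a.center.z - b.center.z;
     const float r  = a.radius + b.radius;
     return dx * dx + dy * dy + dz * dz <= r * r;
 }

 void step(std::vector<PhysicsBody>& bodies,
           std::vector<CollisionVolume>& colliders, float dt) {
     // Loop 1: integrate position from velocity.
     for (PhysicsBody& body : bodies) {
         body.position.x += body.velocity.x * dt;
         body.position.y += body.velocity.y * dt;
         body.position.z += body.velocity.z * dt;
     }
     // Loop 2: brute-force O(n^2) pair test over all collision volumes.
     for (std::size_t i = 0; i < colliders.size(); ++i)
         for (std::size_t j = i + 1; j < colliders.size(); ++j)
             if (intersects(colliders[i], colliders[j])) {
                 // resolve / report the contact here
             }
 }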

I know that the ideal layout is to store the objects in contiguous arrays:

 std::vector<PhysicsBody> m_bodies;
 std::vector<CollisionVolume> m_colliders;

Problems I found with this approach:

  • It is difficult to maintain the PhysicsBody → CollisionVolume relationship. For example, if I want to remove a CollisionVolume from its vector, I would swap it with the last element and pop_back. The data moves, so if the PhysicsBody stored the index of its CollisionVolume, that index is no longer valid (see the sketch after this list).
  • Whenever I destroy a PhysicsBody, its destructor checks whether a collision volume is attached to it and, if so, removes it from the physics system. The problem is that the vector makes internal copies and destroys them, and when that happens it wreaks havoc, removing collision volumes that should not have been deleted.
  • CollisionVolume is actually a base class, and other classes derive from it for boxes / spheres and whatnot. I could drop the inheritance and come up with some other, more involved design, but that has to be kept in mind.
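
To make the first problem concrete, here is a sketch of the swap-and-pop removal and the stale index it leaves behind. The colliderIndex / ownerBody members are hypothetical illustrations, and the patch-up at the end is one common way of repairing the link:

 #include <cstddef>
 #include <cstdint>
 #include <vector>

 struct CollisionVolume { std::int32_t ownerBody = -1; };            // hypothetical back-reference
 struct PhysicsBody     { std::size_t  colliderIndex = SIZE_MAX; };  // hypothetical index into m_colliders

 void removeCollider(std::vector<CollisionVolume>& colliders,
                     std::vector<PhysicsBody>& bodies, std::size_t index) {
     // Swap-and-pop: fast, and it keeps the array dense...
     colliders[index] = colliders.back();
     colliders.pop_back();

     // ...but the element that lived at the old back has moved to 'index',
     // so any PhysicsBody that stored the old index is now stale.
     // One common fix is to patch the owner of the moved element:
     if (index < colliders.size() && colliders[index].ownerBody >= 0)
         bodies[static_cast<std::size_t>(colliders[index].ownerBody)].colliderIndex = index;
 }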

I tried to find a way around this, but ended up falling back to pointers:

 std::vector<PhysicsBody*> m_bodies;
 std::vector<CollisionVolume*> m_colliders;

The best solution I have come across for minimizing cache misses is to overload new / delete and keep these objects in a memory pool dedicated to the physics system.
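
For illustration, a minimal sketch of that idea: one contiguous slab carved into fixed-size blocks, handed out through per-class operator new / operator delete overloads. PhysicsPool and its interface are made up for this sketch, not an existing library:

 #include <cstddef>
 #include <new>
 #include <vector>

 class PhysicsPool {
 public:
     PhysicsPool(std::size_t blockSize, std::size_t blockCount)
         : m_storage(blockSize * blockCount), m_blockSize(blockSize) {
         // Thread every block into a simple free list.
         // Assumes blockSize is a multiple of the required alignment.
         for (std::size_t i = 0; i < blockCount; ++i)
             m_free.push_back(m_storage.data() + i * blockSize);
     }

     void* allocate(std::size_t size) {
         if (size > m_blockSize || m_free.empty())
             return ::operator new(size);          // fall back to the global heap
         void* p = m_free.back();
         m_free.pop_back();
         return p;
     }

     void deallocate(void* p) {
         unsigned char* c = static_cast<unsigned char*>(p);
         if (c >= m_storage.data() && c < m_storage.data() + m_storage.size())
             m_free.push_back(c);                  // back into the pool
         else
             ::operator delete(p);                 // came from the fallback path
     }

 private:
     std::vector<unsigned char>  m_storage;        // one contiguous slab
     std::vector<unsigned char*> m_free;
     std::size_t                 m_blockSize;
 };

 PhysicsPool g_bodyPool(/*blockSize=*/256, /*blockCount=*/1024);

 struct PhysicsBody {
     // ...body data...
     void* operator new(std::size_t size) { return g_bodyPool.allocate(size); }
     void  operator delete(void* p)       { g_bodyPool.deallocate(p); }
 };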

Are there any better solutions? Obviously, performance is key.

+6
3 answers

The main question: in the absence of threads that read and modify the data from different cores (CPUs), where do you see the need to worry about the cost of cache coherence?

The cache coherence protocol only kicks in when a cache line becomes dirty on a core other than the reader's core, or vice versa.

It sounds like you actually meant cache locality. Is that right?

With coherence vs. locality out of the way, here is my take:

The moment you move to a vector, you lose direct control over locality. You can win some of it back with a memory pool, but you still have to deal with the element moves caused by resize operations.

Do you know the number of elements up front? If so, you can do this:

 std::vector<T> myVec;
 myVec.reserve(NUM_ELEMS);

followed by placement new of each object into that contiguous memory region. With a std::vector, that in-place construction is done via emplace_back, which will not reallocate thanks to the reserve:

 myVec.emplace_back(/* constructor args */);

The memory for the vector and its elements can come entirely from a single pool. This can be achieved by passing a custom allocator when instantiating the std::vector, as sketched below.
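
A sketch of how that could look: a simple bump pool plus a minimal C++11-style allocator that draws from it. The Pool / PoolAllocator types here are assumptions for illustration, not a particular library:

 #include <cstddef>
 #include <new>
 #include <vector>

 struct Pool {
     explicit Pool(std::size_t bytes) : buffer(new unsigned char[bytes]), size(bytes) {}
     ~Pool() { delete[] buffer; }

     void* allocate(std::size_t bytes, std::size_t alignment) {
         std::size_t aligned = (offset + alignment - 1) & ~(alignment - 1);
         if (aligned + bytes > size) throw std::bad_alloc();
         offset = aligned + bytes;
         return buffer + aligned;                    // bump-allocate from the slab
     }
     void deallocate(void*, std::size_t) {}          // bump pool: everything is freed at once

     unsigned char* buffer;
     std::size_t    size;
     std::size_t    offset = 0;
 };

 template <typename T>
 struct PoolAllocator {
     using value_type = T;

     explicit PoolAllocator(Pool* p) : pool(p) {}
     template <typename U>
     PoolAllocator(const PoolAllocator<U>& other) : pool(other.pool) {}

     T* allocate(std::size_t n) {
         return static_cast<T*>(pool->allocate(n * sizeof(T), alignof(T)));
     }
     void deallocate(T* p, std::size_t n) { pool->deallocate(p, n * sizeof(T)); }

     Pool* pool;
 };

 template <typename T, typename U>
 bool operator==(const PoolAllocator<T>& a, const PoolAllocator<U>& b) { return a.pool == b.pool; }
 template <typename T, typename U>
 bool operator!=(const PoolAllocator<T>& a, const PoolAllocator<U>& b) { return a.pool != b.pool; }

 // Usage: the vector's storage (and therefore all elements) comes from one pool.
 struct PhysicsBody { float x, y, z; };

 int main() {
     Pool physicsPool(1 << 20);                      // 1 MiB slab for the physics system
     std::vector<PhysicsBody, PoolAllocator<PhysicsBody>> bodies{
         PoolAllocator<PhysicsBody>(&physicsPool)};
     bodies.reserve(1024);                           // one contiguous allocation from the pool
     bodies.push_back({0.0f, 0.0f, 0.0f});
     return 0;
 }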

+2

An easy way to avoid invalidating indices while keeping the memory blocks packed together for better cache hits is to simply leave removed physics elements in the vector, but mark them as removed so that you can reclaim those free slots on subsequent insertions.

If you want to go all the way and build your own container here, it really helps to understand how STL containers are implemented using placement new and manual destructor calls; that matters especially if you still want destructors to run on those removed objects, and it avoids requiring things like an assignment operator for type T.
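
In case it helps, a small sketch of that placement-new / explicit-destructor pattern on raw storage; Slot<T> and its members are made up for illustration:

 #include <new>
 #include <utility>

 template <typename T>
 struct Slot {
     alignas(T) unsigned char storage[sizeof(T)];
     bool occupied = false;

     template <typename... Args>
     T* construct(Args&&... args) {
         // Placement new: construct T inside pre-allocated raw storage.
         T* p = ::new (static_cast<void*>(storage)) T(std::forward<Args>(args)...);
         occupied = true;
         return p;
     }

     void destroy() {
         if (occupied) {
             reinterpret_cast<T*>(storage)->~T();   // manual destructor call
             occupied = false;
         }
     }
 };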

You can additionally thread the free slots into a list of vacant indices that is consulted whenever you insert a new element, or, even faster, treat each element as a union with a free-list pointer, for example:

 union PhysicsNode {
     PhysicsBody body;
     PhysicsNode* next_free;
 };

 PhysicsNode* free_physics_nodes;

The same goes for the collision nodes. In this scheme a node is treated as a physics body while it is "occupied", and as a free-list entry while it is "removed" and free to be reclaimed.
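
A sketch of how insertion and removal could work with that union-based free list; the PhysicsSystem wrapper, its function names and the trivial PhysicsBody are assumptions for illustration:

 #include <cstddef>
 #include <new>
 #include <vector>

 struct PhysicsBody { float x = 0, y = 0, z = 0; };

 union PhysicsNode {
     PhysicsBody  body;       // the slot while it is occupied
     PhysicsNode* next_free;  // the slot while it sits on the free list
     PhysicsNode() {}         // members are constructed / destroyed manually below
     ~PhysicsNode() {}
 };

 struct PhysicsSystem {
     std::vector<PhysicsNode> nodes;       // sized up front so it never reallocates
     PhysicsNode*             free_list = nullptr;

     explicit PhysicsSystem(std::size_t capacity) : nodes(capacity) {
         // Thread every slot into the free list.
         for (std::size_t i = 0; i < capacity; ++i) {
             nodes[i].next_free = free_list;
             free_list = &nodes[i];
         }
     }

     PhysicsBody* create() {
         if (!free_list) return nullptr;                  // pool exhausted
         PhysicsNode* node = free_list;
         free_list = node->next_free;
         return ::new (static_cast<void*>(node)) PhysicsBody();  // placement new into the slot
     }

     void destroy(PhysicsBody* body) {
         PhysicsNode* node = reinterpret_cast<PhysicsNode*>(body);
         body->~PhysicsBody();                            // manual destructor call
         node->next_free = free_list;                     // the slot becomes a free-list entry
         free_list = node;
     }
 };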

Unfortunately, when you try to solve the problem at this level, you often run into the object-oriented machinery: copy constructors, constructors and destructors, and so on.

So if you want that kind of efficiency together with all the object-oriented conveniences, you may prefer to tackle what is basically the same problem one level down, in the memory allocator.

0

I would like to point out that the data model differs significantly depending on whether or not you plan to use a vector engine (GPU) for the computation. The typical data layout follows OOP and is called Array-of-Structures. To use any vector engine, the data has to be reorganized into Structure-of-Arrays. More from Intel:

https://software.intel.com/en-us/articles/creating-a-particle-system-with-streaming-simd-extensions
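
A tiny sketch of the difference, with made-up particle fields:

 #include <vector>

 // Array-of-Structures (AoS): the typical OOP layout, one struct per particle.
 struct ParticleAoS {
     float x, y, z;      // position
     float vx, vy, vz;   // velocity
 };
 using ParticlesAoS = std::vector<ParticleAoS>;

 // Structure-of-Arrays (SoA): each field in its own contiguous array, so a
 // SIMD unit or a GPU can load, say, 8 consecutive x values in one vector load.
 struct ParticlesSoA {
     std::vector<float> x, y, z;
     std::vector<float> vx, vy, vz;
 };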

0

Source: https://habr.com/ru/post/986270/

