Memory access and memory copy

I am writing a C++ application in which many threads need to read the same memory many times. From a performance point of view, would it be better to copy the memory for each thread, or to give all threads the same pointer and have them all access the same memory?

thanks

2 answers

No definitive answer is possible from so little information about your target system, but on a regular PC, not copying will most likely be the fastest.

One reason copying can be slow is that it can cause cache misses if the data area is large. A typical PC caches read-only accesses to the same data area very efficiently across threads, even when those threads run on different cores.

One of the advantages Intel explicitly lists for their caching approach is "Allows more data sharing capabilities for threads running on separate cores that share a common cache." That is, they encourage a practice where you do not need to program the threads to cache data explicitly; the CPU will do it for you.
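The shared-pointer approach described above could look roughly like the sketch below. The function name `sum_shared` and the summing workload are hypothetical, chosen only for illustration; the point is that every thread reads the same buffer through a const reference and nothing is copied.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: each thread reads the same buffer through a const
// reference. No copy is made and no locking is needed, because no thread
// writes to `data` while the readers are running.
std::vector<std::uint64_t> sum_shared(const std::vector<int>& data,
                                      std::size_t num_threads) {
    std::vector<std::uint64_t> results(num_threads, 0);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &results, t] {
            // Accumulate in a local variable and store once at the end, so
            // threads do not keep writing to adjacent slots of `results`.
            std::uint64_t local = 0;
            for (int v : data) local += static_cast<std::uint64_t>(v);
            results[t] = local;
        });
    }
    for (auto& w : workers) w.join();
    return results;
}
```

Calling `sum_shared(data, 4)` starts four readers over the one buffer. This is only safe as long as the data really is read-only for the lifetime of the threads.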


Since you specifically mention many threads, I assume that you have at least a multi-socket system. Typically, memory banks are associated with processor sockets. That is, each processor is "closest" to its own memory banks and must go through the other processors' memory controllers to access data in other banks. (A processor here means the physical thing in the socket.)

When allocating data, a first-write policy (often called first-touch) is usually used to determine which memory banks your data will be placed in. The processor that first writes the data gets it in its own banks, which means that it can access it faster than the other processors.

So, at least with several processors (and not just several cores), there should be a performance improvement from allocating a copy for at least each processor. The data must be allocated and copied by each processor/thread itself, not by the main thread, in order to benefit from the first-write policy. You also need to make sure that threads do not migrate between processors, because then you are likely to lose the close coupling to your memory.
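A minimal sketch of the per-thread copy, assuming the operating system places pages under a first-write policy. The name `sum_private_copies` and the summing workload are hypothetical. Thread pinning is left out because it is platform-specific (on Linux, for example, `pthread_setaffinity_np` can be used for it).

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: each worker makes its own copy of `source`.
std::vector<std::uint64_t> sum_private_copies(const std::vector<int>& source,
                                              std::size_t num_threads) {
    std::vector<std::uint64_t> results(num_threads, 0);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&source, &results, t] {
            // The copy is allocated and first written inside the worker, not
            // in the main thread. Under a first-write policy (an assumption
            // about the OS), its pages should land in memory close to the
            // processor that runs this thread.
            std::vector<int> local_copy(source);
            std::uint64_t local = 0;
            for (int v : local_copy) local += static_cast<std::uint64_t>(v);
            results[t] = local;
        });
    }
    for (auto& w : workers) w.join();
    return results;
}
```

Without pinning, the scheduler may still move a thread to another socket after the copy is made, which would undo the benefit.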

I'm not sure how copying the data for each thread on a single processor would affect performance, but I suspect that not copying improves the chances of sharing the contents of the higher-level caches that are common to the cores.

In any case, benchmark both approaches and decide based on actual measurements.
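One way to take such measurements is a small timing helper like the sketch below. The name `best_time_ms` is hypothetical, and taking the best of several runs is just one common way to reduce noise, not the only one.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: runs `work` `repeats` times and returns the fastest
// run in milliseconds. With repeats <= 0 nothing runs and the result is
// +infinity.
template <typename F>
double best_time_ms(F&& work, int repeats) {
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const std::chrono::duration<double, std::milli> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

Pass one lambda that runs the shared-pointer variant and another that runs the per-thread-copy variant, with the real data size and thread count of the target machine, and compare the two numbers.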


Source: https://habr.com/ru/post/916535/

