Cache-related performance optimization methods?

There are many performance issues related to the cache. I have a few questions about them:

  • Probably the most common problems are poor cache locality and false sharing. Are there others?
  • Is there a good survey of them?
  • Are there any proven methods for dealing with them?
  • What are the general characteristics of applications where they are real problems? Computationally intensive domains (math, image processing, etc.)? Highly parallel applications?
1 answer

One of the most interesting is avoiding cache conflicts. If you know the memory access pattern, you can lay out the accessed items in memory so as to minimize cache-line conflicts among them. You can do this for both data and code.
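As a rough illustration of what a conflict looks like for data, here is a minimal sketch that models which cache set an address maps to. The line size, the number of sets, and the names `cache_set` and `sets_touched` are assumptions made for this example, not properties of any particular CPU. It compares a column walk over rows spaced exactly 4096 bytes apart with the same walk after padding each row by one cache line.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 64    # assumed, e.g. a 32 KiB, 8-way cache: 32768 / (64 * 8)


def cache_set(addr: int) -> int:
    """Set index that a byte address maps to in the modeled cache."""
    return (addr // LINE_SIZE) % NUM_SETS


def sets_touched(base: int, row_stride: int, rows: int) -> set:
    """Sets hit when reading the first element of each row (a column walk)."""
    return {cache_set(base + r * row_stride) for r in range(rows)}


# Rows exactly 4096 bytes apart: 4096 == LINE_SIZE * NUM_SETS,
# so every row start lands in the same set and the rows compete for it.
conflicting = sets_touched(base=0, row_stride=4096, rows=16)

# Padding each row by one cache line moves successive rows to successive sets.
padded = sets_touched(base=0, row_stride=4096 + LINE_SIZE, rows=16)

print(len(conflicting), len(padded))
```

In this model the unpadded walk keeps hitting a single set, while the padded layout spreads the same 16 accesses over 16 different sets. Real caches add details such as physical addressing and replacement policy, so treat this only as a way to reason about layout.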

Determining data access patterns is relatively hard, but code access patterns are comparatively easy to determine. Given the call graph, the set of blocks that make up the function bodies, and some estimates of the transition frequencies between blocks, you can assign code blocks to cache lines so as to maximize the likelihood that the next block you need sits in a cache line that does not conflict with the current one. One interesting idea was that you only have to place the blocks that are "hot" (high probability of execution); it doesn't matter where you put the cold ones. IIRC, this means you can sort the blocks by estimated execution frequency and then assign them in that order.
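The "sort by frequency, then assign in that order" idea can be sketched as a simple greedy layout. This is only an illustration under assumed cache parameters, not the linker-based optimizer described here; the block names, sizes, execution counts, and the helpers `layout_hot_first` and `sets_used` are all hypothetical.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 64    # number of sets in the modeled instruction cache (assumed)


def layout_hot_first(blocks):
    """blocks: list of (name, size_bytes, exec_count) -> {name: start_address}.

    Places blocks contiguously in order of decreasing execution count,
    starting each block on a cache-line boundary.
    """
    ordered = sorted(blocks, key=lambda b: b[2], reverse=True)
    placement = {}
    addr = 0
    for name, size, _count in ordered:
        placement[name] = addr
        addr += -(-size // LINE_SIZE) * LINE_SIZE  # round size up to whole lines
    return placement


def sets_used(start, size):
    """Cache sets occupied by a block of `size` bytes placed at `start`."""
    first = start // LINE_SIZE
    last = (start + size - 1) // LINE_SIZE
    return {line % NUM_SETS for line in range(first, last + 1)}


# Hypothetical profile data: (name, size in bytes, estimated execution count).
blocks = [
    ("init", 2048, 1),
    ("loop_body", 192, 1_000_000),
    ("error_path", 4096, 3),
    ("helper", 128, 900_000),
]

placement = layout_hot_first(blocks)
hot_sets = sets_used(placement["loop_body"], 192)
helper_sets = sets_used(placement["helper"], 128)
print(placement)
print(hot_sets & helper_sets)  # empty when the two hot blocks do not share a set
```

Because the hot blocks are packed together at the start and are much smaller than the modeled cache, they land in distinct sets; the cold blocks simply follow wherever space remains.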

You just need a global analysis :-} In the first work I read about this, the optimizer was implemented as part of the linker, which is one way to get access to the entire program.

I don't recall a good survey or a compiled set of techniques. However, the PLDI conferences have research papers on this topic.


Source: https://habr.com/ru/post/1339591/

