High-frequency concurrency for a cache

I am studying caching and have a question about concurrent caches.

As far as I know, an LRU cache is implemented with a doubly linked list + a hash table. So how does an LRU cache handle highly frequent concurrent access? Please note that both reading data from the cache and putting data into the cache update the linked list and the hash table, so the cache is constantly being modified.

If we use a mutex lock for thread safety, won't speed suffer when the cache is accessed by a large number of threads? If we do not use locking, what techniques are used instead? Thanks in advance.
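For concreteness, here is roughly the single-lock design I have in mind, as a minimal Java sketch (the class name is mine, and I use LinkedHashMap's access order in place of a hand-written list + hash table):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Baseline: one lock guards both the hash table and the recency ordering.
// Every get() and put() mutates the LRU ordering, so all threads serialize here.
public class SynchronizedLruCache<K, V> {
    private final Map<K, V> map;

    public SynchronizedLruCache(int capacity) {
        // accessOrder = true makes LinkedHashMap reorder entries on get(),
        // giving LRU iteration order; removeEldestEntry handles eviction.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public synchronized V get(K key) {
        return map.get(key);   // even a read mutates the recency order
    }

    public synchronized void put(K key, V value) {
        map.put(key, value);
    }
}
```

Here even a plain read reorders the list, so every access contends on the same lock. That is the bottleneck I am asking about.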

1 answer

Traditional LRU caches are not designed for high concurrency because the hardware was limited and because the penalty of a hit is far smaller than the penalty of a miss (e.g. a database lookup). For most applications, locking the cache is acceptable if the lock is only used to update the underlying structure (not to compute the value on a miss). Simple techniques like segmenting the LRU policy were usually good enough once the lock became contended.
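As an illustration, lock striping might look something like this (a sketch with invented names, not any particular library's code): hash the key to one of several independent LRU segments, so threads touching different keys take different locks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Lock striping: split one contended LRU into independent segments, each
// with its own lock. Threads hitting different keys rarely contend.
public class SegmentedLruCache<K, V> {
    // One segment = a small LRU (LinkedHashMap in access order), locked separately.
    private static final class Segment<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;
        Segment(int capacity) { super(16, 0.75f, true); this.capacity = capacity; }
        @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    private final Segment<K, V>[] segments;

    @SuppressWarnings("unchecked")
    public SegmentedLruCache(int segmentCount, int totalCapacity) {
        segments = new Segment[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new Segment<>(totalCapacity / segmentCount);
        }
    }

    private Segment<K, V> segmentFor(K key) {
        int h = key.hashCode();
        h ^= (h >>> 16);                                // spread the hash bits
        return segments[(h & 0x7fffffff) % segments.length];
    }

    public V get(K key) {
        Segment<K, V> s = segmentFor(key);
        synchronized (s) { return s.get(key); }         // lock only this segment
    }

    public void put(K key, V value) {
        Segment<K, V> s = segmentFor(key);
        synchronized (s) { s.put(key, value); }
    }
}
```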

The way to make an LRU cache scale is to avoid updating the policy on every access. The critical observation is that the user of the cache does not care what the current LRU ordering is. The caller's only concern is that the cache maintains a threshold size and a high hit rate. This opens the door to optimizations that avoid mutating the LRU policy on every read.

The approach taken by memcached is to discard subsequent reads within a time window, e.g. 1 second. The cache is expected to be very large, so there is a very low chance of evicting a poor candidate with this simpler LRU.
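A rough sketch of that idea (not memcached's actual implementation, which is in C; the one-second window and all names here are illustrative): each entry remembers when it was last promoted, and a read only touches the shared recency list when that promotion is older than the window.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.concurrent.ConcurrentHashMap;

// Time-window promotion: reads go through a lock-free ConcurrentHashMap.
// The shared recency list is only touched when an entry has not been
// promoted within the last second, so hot keys are read without locking.
public class WindowedLruCache<K, V> {
    private static final long WINDOW_NANOS = 1_000_000_000L; // 1-second window

    private static final class Node<V> {
        final V value;
        volatile long lastPromoted;
        Node(V value, long now) { this.value = value; this.lastPromoted = now; }
    }

    private final int capacity;
    private final ConcurrentHashMap<K, Node<V>> map = new ConcurrentHashMap<>();
    private final LinkedHashSet<K> recency = new LinkedHashSet<>(); // oldest first

    public WindowedLruCache(int capacity) { this.capacity = capacity; }

    public V get(K key) {
        Node<V> node = map.get(key);              // lock-free fast path
        if (node == null) return null;
        long now = System.nanoTime();
        if (now - node.lastPromoted > WINDOW_NANOS) {
            node.lastPromoted = now;              // racy, but an extra or missed
            synchronized (recency) {              // promotion is harmless here
                recency.remove(key);
                recency.add(key);                 // move key to most-recent end
            }
        }
        return node.value;
    }

    public void put(K key, V value) {
        map.put(key, new Node<>(value, System.nanoTime()));
        synchronized (recency) {
            recency.remove(key);
            recency.add(key);
            if (recency.size() > capacity) {      // evict the least-recent key
                Iterator<K> it = recency.iterator();
                K victim = it.next();
                it.remove();
                map.remove(victim);
            }
        }
    }
}
```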

The approach taken by ConcurrentLinkedHashMap (CLHM), and subsequently Guava's Cache, is to record each access in a buffer. This buffer is drained under the LRU lock, and by using a try-lock no other operation has to block. CLHM uses multiple ring buffers that are lossy if the cache cannot keep up, since losing events is preferable to degraded performance.
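A simplified sketch of the recording-and-draining mechanism only (CLHM actually uses striped ring buffers; a single bounded ConcurrentLinkedQueue is substituted here to keep the idea visible):

```java
import java.util.LinkedHashSet;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Read buffering: reads append the key to a bounded lossy buffer; whoever
// wins the try-lock replays the buffer against the LRU ordering. No read
// ever blocks on the LRU lock.
public class BufferedReadRecorder<K> {
    private static final int BUFFER_LIMIT = 128;

    private final ConcurrentLinkedQueue<K> readBuffer = new ConcurrentLinkedQueue<>();
    private final AtomicInteger pending = new AtomicInteger();
    private final ReentrantLock lruLock = new ReentrantLock();
    private final LinkedHashSet<K> recency = new LinkedHashSet<>(); // guarded by lruLock

    // Called on every cache hit instead of taking the LRU lock directly.
    public void recordRead(K key) {
        if (pending.get() < BUFFER_LIMIT) {     // lossy: when full, drop the
            readBuffer.add(key);                // event rather than block
            pending.incrementAndGet();
        }
        if (lruLock.tryLock()) {                // one thread drains the buffer;
            try {                               // everyone else returns at once
                K k;
                while ((k = readBuffer.poll()) != null) {
                    pending.decrementAndGet();
                    recency.remove(k);          // replay the deferred promotion
                    recency.add(k);
                }
            } finally {
                lruLock.unlock();
            }
        }
    }
}
```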

The approach taken by Ehcache and Redis is a probabilistic LRU policy. A read updates the entry's timestamp, and a write iterates the cache to obtain a random sample. The oldest entry in that sample is evicted. If the sample is fast to construct and the cache is large, the evicted entry was probably a good candidate.
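A sketch of sampled eviction (not Redis's actual C implementation; the sample size of 5 matches Redis's default maxmemory-samples, everything else here is illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Sampled ("probabilistic") LRU: no linked list at all. A read just stamps
// the entry; eviction examines a few random entries and removes the one
// with the oldest timestamp.
public class SampledLruCache<K, V> {
    private static final int SAMPLE_SIZE = 5;  // Redis's default is also 5

    private static final class Node<V> {
        final V value;
        volatile long lastAccess;
        Node(V value, long now) { this.value = value; this.lastAccess = now; }
    }

    private final int capacity;
    private final ConcurrentHashMap<K, Node<V>> map = new ConcurrentHashMap<>();

    public SampledLruCache(int capacity) { this.capacity = capacity; }

    public V get(K key) {
        Node<V> node = map.get(key);
        if (node == null) return null;
        node.lastAccess = System.nanoTime();   // a read is just a timestamp write
        return node.value;
    }

    public void put(K key, V value) {
        map.put(key, new Node<>(value, System.nanoTime()));
        while (map.size() > capacity) evictOneFromSample();
    }

    // Keep the oldest of a small pseudo-random sample. Redis probes random
    // hash-table slots; randomly skipping entries during iteration is the
    // closest cheap equivalent for a Java map.
    private void evictOneFromSample() {
        K victim = null;
        long oldest = Long.MAX_VALUE;
        int sampled = 0;
        for (Map.Entry<K, Node<V>> e : map.entrySet()) {
            if (ThreadLocalRandom.current().nextInt(4) != 0) continue;
            if (e.getValue().lastAccess < oldest) {
                oldest = e.getValue().lastAccess;
                victim = e.getKey();
            }
            if (++sampled == SAMPLE_SIZE) break;
        }
        if (victim == null) victim = map.keySet().iterator().next(); // fallback
        map.remove(victim);
    }
}
```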

There are probably other techniques as well and, of course, pseudo-LRU policies (e.g. CLOCK) that offer better concurrency at lower hit rates.
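For completeness, a minimal CLOCK sketch (invented names; a hit sets a per-entry reference bit without any lock, and eviction sweeps a clock hand that clears bits until it finds an unreferenced victim):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// CLOCK pseudo-LRU: entries sit in a fixed circular array of slots, each
// with a reference bit. A hit is a single flag write; referenced slots get
// a "second chance" as the hand sweeps past them.
public class ClockCache<K, V> {
    private static final class Slot<K, V> {
        volatile K key;
        volatile V value;
        final AtomicBoolean referenced = new AtomicBoolean();
    }

    private final Slot<K, V>[] slots;
    private final ConcurrentHashMap<K, Integer> index = new ConcurrentHashMap<>();
    private int hand;                            // guarded by synchronized put

    @SuppressWarnings("unchecked")
    public ClockCache(int capacity) {
        slots = new Slot[capacity];
        for (int i = 0; i < capacity; i++) slots[i] = new Slot<>();
    }

    public V get(K key) {
        Integer i = index.get(key);
        if (i == null) return null;
        Slot<K, V> slot = slots[i];
        if (!key.equals(slot.key)) return null;  // slot was reused under us
        slot.referenced.set(true);               // the whole cost of a hit
        return slot.value;
    }

    public synchronized void put(K key, V value) {
        // Sweep the hand, clearing reference bits, until an unreferenced
        // slot turns up to hold the new entry.
        while (slots[hand].referenced.getAndSet(false)) {
            hand = (hand + 1) % slots.length;
        }
        Slot<K, V> slot = slots[hand];
        if (slot.key != null) index.remove(slot.key, hand); // drop stale mapping
        slot.key = key;
        slot.value = value;
        slot.referenced.set(true);
        index.put(key, hand);
        hand = (hand + 1) % slots.length;
    }
}
```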

