Cache invalidation strategy

In my current application we are dealing with some information that rarely changes.
To optimize performance, we want to store it in a cache.
The problem is that these cached objects become stale every time the underlying data is updated.
We have not finalized the caching product yet.
Since we are building this application on Azure, we will most likely use Azure Redis Cache.
One strategy might be to add code to the Update API that invalidates the cached object.
I'm not sure whether that is a clean approach, though.
We do not want to use time-based cache expiration (TTL).
Could you suggest some other strategies for invalidating the cache?

3 answers

Invalidating the cache during the update is a viable approach, and one that has been used extensively in the past.

You have two options when an UPDATE occurs:

  1. You can set the new value as part of the update operation, or
  2. Simply delete the old one and repopulate the cache during the next read operation.

If you need an LRU cache, then the UPDATE can simply delete the old value, and the object will be created again on the first read, by fetching it from the real database. However, if you know that your cache is small enough to hold the whole data set and your main database is used for other tasks besides serving it, you can update the cache directly during the UPDATE.
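The delete-on-update, populate-on-read pattern can be sketched as follows. A plain dict stands in for Redis here so the example is self-contained; with a real client you would replace `cache` with e.g. a `redis.Redis` connection (connection details are an assumption, not from the question).

```python
# Cache-aside with delete-on-update: the UPDATE path writes the database
# and deletes the cached copy; the next read repopulates the cache.

db = {}      # stand-in for the primary database
cache = {}   # stand-in for the Redis cache

def get_item(key):
    """Read through the cache, falling back to the database."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value   # populate the cache on read
    return value

def update_item(key, value):
    """Write the database first, then invalidate the cached copy."""
    db[key] = value
    cache.pop(key, None)     # delete-on-update: the stale entry is removed
```

Writing the database before deleting the cache entry keeps the window for serving stale data as small as possible.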

However, all this is still not enough to be completely consistent.
When you write to your database, the Redis cache may be unavailable for, say, a few seconds, so the two fall out of sync.
What do you do in that case?
There are several options, which you can use at the same time.

  1. Set a TTL anyway, so that corrupted data is eventually refreshed.
  2. Use lazy read repair: when you read from the cache, occasionally check against the main database whether the value still matches. If not, update the cached item (or delete it).
  3. Use epochs (versions) or similar handles to access your data. This is not always possible, but where it is, you can change the object's identifier/handle every time you modify it, so that stale data in the cache can never be accessed: each key name refers to a specific version of your object.
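The versioned-key idea from option 3 can be sketched like this. The names (`version_of`, `key_for`, the `item:{id}:v{n}` key scheme) are illustrative, not from the question:

```python
# Versioned ("epoch") keys: every update bumps a version counter, and the
# cache key embeds that version, so stale entries can never be read again.

db = {}          # id -> value (stand-in for the primary database)
version_of = {}  # id -> current version number
cache = {}       # "item:{id}:v{n}" -> value (stand-in for Redis)

def key_for(item_id):
    return f"item:{item_id}:v{version_of.get(item_id, 0)}"

def read(item_id):
    k = key_for(item_id)
    if k not in cache:
        cache[k] = db[item_id]   # repopulate under the current version
    return cache[k]

def update(item_id, value):
    db[item_id] = value
    version_of[item_id] = version_of.get(item_id, 0) + 1
    # no cache delete needed: the old key is simply never read again
```

Old versions linger in the cache until evicted (e.g. by LRU), but they are unreachable, which is what matters for consistency.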

So delete-cache-on-update plus populate-cache-on-read is the basic strategy, and you can layer additional mechanisms on top of it to resolve inconsistencies.

In fact, there is one more option besides those above: a background process that uses Redis SCAN to verify, key by key, whether there are inconsistencies. This process can run slowly and work against a replica of your database.
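A minimal sketch of such a repair pass, with the cursor-style batching that SCAN provides imitated by slicing a key snapshot; a real job would iterate `scan_iter()` on a Redis client instead:

```python
# Background repair: walk the cache key space in small batches and fix
# any entry that disagrees with the database (or a replica of it).

def repair_pass(cache, db, batch_size=2):
    """Compare every cached entry with the database; return the number fixed."""
    fixed = 0
    keys = list(cache.keys())                       # snapshot of the key space
    for start in range(0, len(keys), batch_size):   # cursor-style batches
        for key in keys[start:start + batch_size]:
            if key not in db:
                del cache[key]          # entry no longer exists upstream
                fixed += 1
            elif cache[key] != db[key]:
                cache[key] = db[key]    # refresh the inconsistent value
                fixed += 1
    return fixed
```

Running this against a replica, in small batches, keeps the load on both the cache and the primary database low.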

As you can see, the underlying idea is always the same: if a cache update fails, do not let that become a permanent problem that lingers forever; give the system a chance to fix itself later.


I think a lambda(-ish) architecture is suitable for your use case:

  1. Real-time updates for immediate business use.
  2. Batch data loads to fix any erroneous entries.
  3. Batch data loads to delete any invalid/archived records.

For the real-time updates, you will have to change the application code base to write data to both the database and the cache.
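A minimal sketch of that dual write, with dicts standing in for the database and the cache. The failure handling is the important part: a cache write that fails while the cache is down is exactly the inconsistency the batch passes below are meant to repair.

```python
# Dual write for the real-time path: the application writes the database
# and the cache in the same operation.

def write_through(key, value, db, cache):
    db[key] = value             # the database write is authoritative
    try:
        cache[key] = value      # with Redis this would be cache.set(key, value)
    except Exception:
        pass  # cache temporarily unavailable; a batch pass reconciles later
```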

For the batch loads, you can use data-ingestion tools such as Logstash/Fluentd to "pull" the latest data from your database. This can be driven by a column that is always incrementing (a sequence ID or a timestamp).
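The checkpoint logic behind that incremental pull can be sketched as follows. The row shape and key scheme are made up for the sketch; Logstash's JDBC input tracks the same kind of `last_value` checkpoint internally.

```python
# Incremental batch pull: fetch only rows whose always-incrementing id is
# past the last checkpoint, push them into the cache, advance the checkpoint.

def pull_new_rows(rows, cache, checkpoint):
    """rows: iterable of (id, value) ordered by id; returns the new checkpoint."""
    for row_id, value in rows:
        if row_id > checkpoint:
            cache[f"row:{row_id}"] = value   # load the fresh row into the cache
            checkpoint = row_id              # remember how far we have read
    return checkpoint
```

Each run only touches rows added or re-stamped since the previous run, which is what makes the batch pass cheap enough to schedule frequently.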

I have an Oracle database on my end. The Logstash JDBC plugin does a great job of pulling the latest entries. The Logstash output can be formatted and written to a file that Redis can load. I wrote a small bash script to orchestrate this. Tested on 3 million records, and it works well.


Besides TTL (Time To Live, i.e. deleting keys after a specified period of time), the usual strategy is LRU (Least Recently Used): when you need to evict some entries (for example, when you reach the maximum memory limit), you delete the least recently used ones.

However, knowing exactly which entry is the least recently used is expensive, so Redis uses a statistical approach: it samples a few entries (3 by default) and evicts the least recently used among them.
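That sampled eviction can be simulated in a few lines; the explicit timestamp map here is a stand-in for the per-key access clock Redis maintains internally:

```python
# Approximate LRU: instead of tracking exact recency order for every key,
# sample a few entries and evict the one with the oldest last-access time.
import random

def evict_one(last_access, sample_size=3, rng=random):
    """last_access: key -> last-access timestamp; evicts and returns one key."""
    sample = rng.sample(list(last_access), min(sample_size, len(last_access)))
    victim = min(sample, key=lambda k: last_access[k])  # oldest in the sample
    del last_access[victim]
    return victim
```

With a small sample the choice is only approximately LRU, but the cost per eviction stays constant regardless of how many keys the cache holds.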

see http://redis.io/topics/lru-cache


Source: https://habr.com/ru/post/986966/

