Avoiding Duplicate Database Queries When Using a Caching Scheme

We use a PostgreSQL database and AppFabric caching on a moderately loaded ASP.NET MVC e-commerce site.

Following the cache-aside pattern, we request data from the cache first and, if it is not there, query the database.
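For concreteness, here is a minimal sketch of that read path. `DataCache` is AppFabric's cache client; `Product` and `LoadProductFromDb` are hypothetical stand-ins for the real entity and data access:

```csharp
// Cache-aside read path (minimal sketch).
using System;
using Microsoft.ApplicationServer.Caching;

public class Product { /* domain entity (placeholder) */ }

public class ProductReader
{
    private readonly DataCache _cache;

    public ProductReader(DataCache cache)
    {
        _cache = cache;
    }

    public Product GetProduct(int id)
    {
        string key = "product:" + id;

        // 1) Try the cache first.
        var cached = _cache.Get(key) as Product;
        if (cached != null)
            return cached;

        // 2) On a miss, query the database and repopulate the cache.
        //    Under load, many requests can reach this point at once --
        //    this is exactly where the query storm comes from.
        Product product = LoadProductFromDb(id);
        _cache.Put(key, product);
        return product;
    }

    private Product LoadProductFromDb(int id)
    {
        throw new NotImplementedException(); // real PostgreSQL query goes here
    }
}
```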

This approach leads to “query storms”: the database receives several queries for the same data in a short window, before the object lands back in the cache. The problem is exacerbated by long-running queries, and of course multiple concurrent queries for the same data make each query run longer, creating an unpleasant feedback loop.

One solution to this problem is to take a read lock around cache access. However, this can cause performance problems in a web farm (or even on a single busy web server), because web servers end up blocked on reads for no good reason whenever any database query is in flight.
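A rough illustration of that approach (a method added to the `ProductReader` sketch above — again hypothetical names): one cache-wide lock means every reader serializes, and the lock is held for the full duration of the database query, so a single slow query stalls all cache reads on the server, even for unrelated keys:

```csharp
// Read-lock variant (sketch of the approach described above -- not recommended).
private static readonly object CacheLock = new object();

public Product GetProductWithLock(int id)
{
    lock (CacheLock) // every request, hit or miss, serializes here
    {
        var cached = _cache.Get("product:" + id) as Product;
        if (cached != null)
            return cached;

        Product product = LoadProductFromDb(id); // lock held during the query
        _cache.Put("product:" + id, product);
        return product;
    }
}
```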

Another solution is to abandon the cache-aside pattern and seed the cache independently. This is the approach we have taken to mitigate the immediate problem, but it is not feasible for all of our data.

Am I missing something? What other approaches do people take to avoid this behavior?

+4
2 answers

Depending on how many servers you have and your current cache architecture, it may be worth evaluating the addition of a server-level (or in-process) cache. In effect you use it as a backstop cache, and it is especially useful when hitting the primary store (the database) is either resource-intensive or slow.

When I have used this, I used the cache-aside pattern for the primary cache and a read-through design for the secondary, in which a lock around the secondary lookup also ensures that the database is not hammered with duplicate queries. In this architecture, a primary cache miss results in at most one database request per entity per server (or process).

So the main workflow (sketched in code after the list) is:

1) Try to fetch the value from the primary/shared cache.

   * If successful, return it.
   * If unsuccessful, continue.

2) Check the in-process cache for the value.

   * If successful, return it (optionally seeding the primary cache).
   * If unsuccessful, continue.

3) Acquire a lock on the cache key (and double-check the in-process cache, in case another thread has already added the value).

4) Fetch the object from primary persistence (the database).

5) Store it in the in-process cache and return it.
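A minimal sketch of that workflow, with `Product` and the key scheme as in the earlier snippets. `ISharedCache` is a hypothetical wrapper over the primary/shared cache (e.g. AppFabric), `MemoryCache` serves as the in-process level, and a per-key lock dictionary ensures at most one thread per process runs the database query for a given entity (simplified — the lock dictionary grows without bound here):

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.Caching;

public interface ISharedCache
{
    T Get<T>(string key) where T : class;
    void Put(string key, object value);
}

public class TwoLevelProductRepository
{
    private readonly ISharedCache _shared;
    private readonly MemoryCache _local = MemoryCache.Default;
    private readonly ConcurrentDictionary<string, object> _locks =
        new ConcurrentDictionary<string, object>();

    public TwoLevelProductRepository(ISharedCache shared)
    {
        _shared = shared;
    }

    public Product GetProduct(int id)
    {
        string key = "product:" + id;

        // 1) Primary/shared cache.
        var product = _shared.Get<Product>(key);
        if (product != null)
            return product;

        // 2) In-process cache.
        product = _local.Get(key) as Product;
        if (product != null)
            return product;

        // 3) Per-key lock, then double-check the in-process cache in case
        //    another thread populated it while we waited.
        lock (_locks.GetOrAdd(key, _ => new object()))
        {
            product = _local.Get(key) as Product;
            if (product != null)
                return product;

            // 4) At most one thread per process reaches the database.
            product = LoadProductFromDb(id);

            // 5) Populate both cache levels and return.
            _local.Set(key, product, DateTimeOffset.Now.AddMinutes(5));
            _shared.Put(key, product);
            return product;
        }
    }

    private Product LoadProductFromDb(int id)
    {
        throw new NotImplementedException(); // real PostgreSQL query goes here
    }
}
```

The double-check inside the lock matters: a thread that waited on the lock may find that the winner has already populated the cache, and must not repeat the query.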

I did this with injectable wrappers: my cache layers implement the relevant IRepository interface, and StructureMap injects the correct cache stack. This keeps the actual cache behavior flexible, focused, and easy to maintain, even though it is fairly complex.
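A sketch of what that wiring might look like. All type names here are hypothetical, `Product` is as in the earlier sketches, and `DecorateAllWith` is available in recent StructureMap versions; each decorator wraps the next layer down:

```csharp
using System;
using StructureMap;

public interface IProductRepository
{
    Product GetProduct(int id);
}

public class SqlProductRepository : IProductRepository
{
    public Product GetProduct(int id)
    {
        throw new NotImplementedException(); // real PostgreSQL access
    }
}

// In-process cache layer wrapping the database repository.
public class LocalCacheRepository : IProductRepository
{
    private readonly IProductRepository _inner;
    public LocalCacheRepository(IProductRepository inner) { _inner = inner; }

    public Product GetProduct(int id)
    {
        // check the in-process cache, fall through to _inner on a miss ...
        return _inner.GetProduct(id);
    }
}

// Shared (e.g. AppFabric) cache layer wrapping the in-process layer.
public class SharedCacheRepository : IProductRepository
{
    private readonly IProductRepository _inner;
    public SharedCacheRepository(IProductRepository inner) { _inner = inner; }

    public Product GetProduct(int id)
    {
        // check the shared cache, fall through to _inner on a miss ...
        return _inner.GetProduct(id);
    }
}

public class CacheRegistry : Registry
{
    public CacheRegistry()
    {
        For<IProductRepository>().Use<SqlProductRepository>();
        // Decorators apply innermost-first: the local cache wraps SQL,
        // and the shared cache wraps the local cache.
        For<IProductRepository>().DecorateAllWith<LocalCacheRepository>();
        For<IProductRepository>().DecorateAllWith<SharedCacheRepository>();
    }
}
```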

+1

We have used AppFabric successfully alongside the seeding strategy you describe. We actually use both solutions:

  • Seed known data up front where possible (we have a limited data set, so it is easy for us to identify it)
  • Within each cache-access method, fall back to a lookup when needed, and populate the cache when the value is retrieved from the data store

The fallback lookup is needed because items can be evicted under memory pressure, or simply because they were missed during seeding. We have a “warm-up” service that fires at an interval (hourly) and populates the cache with the required data. We also keep analyzing cache misses, and use that to adjust our warming strategy if we see frequent misses within the warm-up interval.
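A sketch of what such a warm-up service might look like. The hourly interval and `KnownProductIds` are assumptions, and `IProductRepository` is the cached repository interface from the previous answer's sketch — reading through the cached stack re-populates any evicted entries:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class CacheWarmer : IDisposable
{
    private readonly IProductRepository _repository;
    private readonly Timer _timer;

    public CacheWarmer(IProductRepository repository)
    {
        _repository = repository;
        // Fire immediately, then once per hour.
        _timer = new Timer(_ => WarmCache(), null,
                           TimeSpan.Zero, TimeSpan.FromHours(1));
    }

    private void WarmCache()
    {
        foreach (int id in KnownProductIds())
            _repository.GetProduct(id); // miss -> load from db -> cache seeded
    }

    private static IEnumerable<int> KnownProductIds()
    {
        // We have a limited, well-known set; source it from config or the db.
        yield return 1;
        yield return 2;
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```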

0

Source: https://habr.com/ru/post/1343162/
