Key Groups with APC Cache

APC allows you to store data inside keys, but you cannot group these keys.

So, if I want to have a group called "articles" and inside this group I will have keys that take the form of an article identifier, I cannot do this easily.

 articles -> 5  -> cached data
          -> 10 -> cached data
          -> 17 -> cached data
          ...

I could prefix each key with the group name, for example:

 article_5  -> cached data
 article_10 -> cached data
 article_17 -> cached data
 ...

But this makes it impossible to delete the entire group if I want to :(

A working solution would be to store multidimensional arrays (this is what I am doing now), but I do not think it is good, because when I want to get or delete cached data, I need to fetch the whole group first. So if there are one million articles in a group, you can imagine the size of the array I would be iterating over and searching.

Do you have any better ideas on how I can achieve grouping?


Edit: I found another solution, though I am not sure it is much better, because I don't know how reliable it is. I add a special key called __paths, which is basically a multidimensional array containing the full prefixed key paths of all other entries in the cache. When I fetch or delete cached data, I use this array as a reference to quickly find the key (or group of keys) I need, so I do not have to store whole arrays and iterate over all the keys.
+6
4 answers

Based on your observations, I looked at the underlying C implementation of APC's caching model (apc_cache.c) to see what I could find.

The source confirms your observation that there is no grouping structure in the backing data store, so any loosely grouped set of objects must be built on some namespace constraint or on a change to the cache layer itself. I was hoping to find some kind of backdoor that relies on key chaining through a linked list, but unfortunately it seems that collisions are resolved by directly reallocating the colliding slot instead of chaining.

To further complicate the problem, APC appears to use an explicit cache model for user entries, which prevents them from aging out. So the solution Emil Vikström provided, which relies on memcached's LRU model, will unfortunately not work.

Without changing the source code of APC itself, here is what I would do:

  • Define a namespace constraint that matches your entries. As you have described above, this would be something like article_ prepended to each of your keys.

  • Define a separate list of the items in this set. Effectively this would be the 5, 10 and 17 scheme that you described above, but in this case you could use a numeric type to make it more efficient than storing a large number of string values.

  • Define an interface for updating this set of pointers and reconciling them with the backing cache, including (at a minimum) the insert, delete and clear methods. When clear is called, walk each of your pointers, reconstruct the key that you used in the backing data store, and delete each one from the cache.
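The three steps above can be sketched as follows. This is only an illustration under stated assumptions, not APC's API: it is written in Python so it is self-contained, a plain dict stands in for the APC user cache (where PHP code would call apc_store, apc_fetch and apc_delete), and the class name GroupedCache and the index key name "__index_<group>" are hypothetical.

```python
class GroupedCache:
    """Keeps a per-group index of numeric ids next to the cached entries."""

    def __init__(self, store, group):
        self.store = store                    # stand-in for the APC user cache
        self.group = group                    # namespace prefix, e.g. "article"
        self.index_key = f"__index_{group}"   # hypothetical name for the id list

    def _key(self, item_id):
        # Reconstruct the full cache key from the group prefix and numeric id.
        return f"{self.group}_{item_id}"

    def insert(self, item_id, value):
        ids = self.store.get(self.index_key, set())
        ids.add(item_id)
        self.store[self.index_key] = ids
        self.store[self._key(item_id)] = value

    def fetch(self, item_id):
        # A single lookup; the index is not touched on reads.
        return self.store.get(self._key(item_id))

    def delete(self, item_id):
        ids = self.store.get(self.index_key, set())
        ids.discard(item_id)
        self.store[self.index_key] = ids
        self.store.pop(self._key(item_id), None)

    def clear(self):
        # Walk the index, rebuild each key, and remove it from the cache.
        for item_id in self.store.get(self.index_key, set()):
            self.store.pop(self._key(item_id), None)
        self.store.pop(self.index_key, None)


store = {"user_1": "unrelated entry"}
articles = GroupedCache(store, "article")
for article_id in (5, 10, 17):
    articles.insert(article_id, f"cached data {article_id}")
articles.clear()   # only the article_* entries and the index are removed
```

Reads stay a single key lookup, and clear costs one delete per member of the group rather than a pass over the whole cache. In a real shared cache the index update would also need to be protected against concurrent writers, which this sketch ignores.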

What I'm advocating here is a well-defined object that performs the operations you are looking for efficiently. It scales linearly with the number of entries in your cache, but since you use a numeric type for each item, you would need roughly 100 million entries or more before you begin to feel real memory pressure under a limit of, for example, a few hundred megabytes.
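The arithmetic behind that estimate can be checked quickly. The sketch below assumes each id is packed into 4 bytes (a 32-bit integer); that width is an assumption made for illustration, and the helper name index_size_bytes is hypothetical.

```python
def index_size_bytes(entries, bytes_per_id=4):
    """Size of a tightly packed id list; 4 bytes per id is an assumption."""
    return entries * bytes_per_id


size = index_size_bytes(100_000_000)
print(f"{size / 10**6:.0f} MB")  # prints "400 MB"
```

This holds only for a packed representation. A native PHP array of integers costs considerably more per element, so the practical ceiling would be reached with fewer entries.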


Tamas Imrei beat me to suggesting an alternative strategy that I was already in the process of documenting, but it has some serious flaws that I would like to discuss.

As defined in the backing C code, APCIterator is a linear-time operation over the complete data set when performing a search (using its constructor, public __construct ( string $cache [, mixed $search = null ...]] ) ).

This is plainly undesirable when the backing elements you are looking for represent a small percentage of your total data, since it will walk every single element of your cache to find the ones you need. Quoting apc_cache.c:

 /* {{{ apc_cache_user_find */
 apc_cache_entry_t* apc_cache_user_find(apc_cache_t* cache, char *strkey, \
                                        int keylen, time_t t TSRMLS_DC)
 {
     slot_t** slot;
     ...
     slot = &cache->slots[h % cache->num_slots];

     while (*slot) {
         ...
         slot = &(*slot)->next;
     }
 }

Therefore, I would strongly recommend using an efficient, pointer-based virtual grouping solution for your problem, as I sketched above. That said, if you are severely memory-constrained, the iterator approach may be the right choice, since it saves as much memory as possible at the expense of computation.
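The difference in work between the two strategies can be illustrated with a small count of keys visited. This is a sketch only: a Python dict stands in for the cache, clear_by_scan imitates an iterator-style regex search over every key (it is not APCIterator itself), and both function names are hypothetical.

```python
import re


def clear_by_scan(store, pattern):
    """Iterator-style clear: tests every key in the cache against a regex."""
    regex = re.compile(pattern)
    visited = 0
    for key in list(store):        # copy the keys so deleting is safe
        visited += 1
        if regex.match(key):
            del store[key]
    return visited


def clear_by_index(store, group, ids):
    """Index-style clear: touches only the keys listed for the group."""
    visited = 0
    for item_id in ids:
        visited += 1
        store.pop(f"{group}_{item_id}", None)
    return visited


def build_store():
    # 1000 unrelated entries plus a 3-entry "article" group.
    data = {f"user_{n}": "u" for n in range(1000)}
    data.update({f"article_{n}": "a" for n in (5, 10, 17)})
    return data


scan_cost = clear_by_scan(build_store(), r"^article_")               # 1003 keys visited
index_cost = clear_by_index(build_store(), "article", [5, 10, 17])   # 3 keys visited
```

The scan needs no extra memory but its cost grows with the whole cache, while the index costs memory proportional to the group and its clear grows only with the group.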

Best of luck with your application.

+18

I once had this problem with memcached, and I solved it by using a version number in my keys, for example:

 version      -> 5
 article_5_5  -> cached data
 article_10_5 -> cached data
 article_17_5 -> cached data

Just change the version number and the group will be effectively “gone”!

memcached uses a least-recently-used (LRU) policy to evict old data, so entries from the old version will be removed from the cache when the space is needed. I do not know whether APC has the same feature.


According to MrGomez, this does NOT work for APC. Please read his post, and keep mine in mind only for other caching systems that use an LRU policy (not APC).
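A minimal sketch of the version-number idea, assuming a dict as a stand-in for the cache. The class name VersionedGroup and the "<group>_version" key are hypothetical. Note that invalidating does not delete anything: the old entries remain until the store evicts them, which is why the technique depends on an LRU-style backend.

```python
class VersionedGroup:
    """Group invalidation by bumping a version number embedded in each key."""

    def __init__(self, store, group):
        self.store = store
        self.group = group
        self.version_key = f"{group}_version"   # hypothetical key name

    def _version(self):
        # Start at version 1 if the group has never been used.
        return self.store.setdefault(self.version_key, 1)

    def _key(self, item_id):
        return f"{self.group}_{item_id}_{self._version()}"

    def set(self, item_id, value):
        self.store[self._key(item_id)] = value

    def get(self, item_id):
        return self.store.get(self._key(item_id))

    def invalidate(self):
        # Old keys become unreachable; they are NOT removed here.
        self.store[self.version_key] = self._version() + 1


cache = {}
articles = VersionedGroup(cache, "article")
articles.set(5, "cached data")
articles.invalidate()
stale = articles.get(5)              # None: lookups now use version 2
leftover = "article_5_1" in cache    # True: the old entry still occupies space
```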

+4

You can use the APCIterator class, which seems to exist specifically for such tasks:

The APCIterator class simplifies iteration over large APC caches. This is useful because it allows you to iterate over large caches in stages ...

+3

Unfortunately, APC cannot do this. I have often wished it could, so I went looking for alternatives.

Zend_Cache has an interesting way of doing this, but it simply uses the cache itself to store the tag information. It is a component that can, in turn, use backends (e.g. APC).

If you want to go one step further, you can install Redis. It has all of this built in, along with some other really interesting features. This is probably the cleanest solution. If you are able to use APC, you should also be able to use Redis.

+1

Source: https://habr.com/ru/post/907695/

