Based on your observations, I looked at the underlying C implementation of APC's caching model (apc_cache.c) to see what I could find.
The source confirms your observation that there is no grouping structure in the backing data store, so any loosely grouped collection of objects has to be built on either a namespace constraint or a modification to the cache layer itself. I was hoping to find some kind of backdoor that relies on key chaining through a linked list, but unfortunately it seems that collisions are reconciled by directly reallocating the colliding slot instead of chaining.
To further complicate the problem, APC seems to use an explicit cache model for user entries, which prevents them from aging out. So the solution Emil Vikström provided, which relies on memcached's LRU model, will unfortunately not work.
Without changing the source code of APC itself, here is what I would do:
Define a namespace constraint that your entries conform to. As you have defined above, this would be something like article_ prepended to each of your entry keys.
Define a separate list of the elements in this set. Effectively this would be the 5, 10 and 17 scheme that you described above, but in this case you could use a numeric type to make it more efficient than storing a large number of string values.
Define an interface for updating this set of pointers and reconciling them with the backing cache, including (at a minimum) the insert, delete and clear methods. When clear is called, walk each of your pointers, reconstruct the key that you used in the backing data store, and flush each one from your cache.
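The three steps above can be sketched roughly as follows. This is only a minimal illustration in C, not APC code: `group_index`, `group_insert`, `group_delete`, `group_clear`, `cache_delete_fn` and `counting_delete` are all hypothetical names, and the callback stands in for whatever actually removes a key from the backing cache (in PHP that role would presumably fall to `apc_delete`). It assumes keys are formed as prefix plus a 32-bit numeric ID.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback that removes one key from the backing cache. */
typedef void (*cache_delete_fn)(const char *key, void *ctx);

/* Group index: a growable array of numeric IDs sharing one key prefix. */
typedef struct {
    const char *prefix;  /* namespace, e.g. "article_" */
    uint32_t   *ids;     /* IDs currently tracked for this group */
    size_t      len;
    size_t      cap;
} group_index;

void group_init(group_index *g, const char *prefix) {
    assert(g != NULL && prefix != NULL);
    g->prefix = prefix;
    g->ids = NULL;
    g->len = 0;
    g->cap = 0;
}

/* Track an ID. Returns 0 on success, -1 if memory could not be allocated.
 * The duplicate check is a linear scan; a hash set or sorted array would
 * be a better fit for very large groups. */
int group_insert(group_index *g, uint32_t id) {
    for (size_t i = 0; i < g->len; i++)
        if (g->ids[i] == id) return 0;  /* already tracked */
    if (g->len == g->cap) {
        size_t cap = g->cap ? g->cap * 2 : 16;
        uint32_t *p = realloc(g->ids, cap * sizeof *p);
        if (p == NULL) return -1;
        g->ids = p;
        g->cap = cap;
    }
    g->ids[g->len++] = id;
    return 0;
}

/* Stop tracking an ID (swap-remove, order not preserved). Returns 1 if found. */
int group_delete(group_index *g, uint32_t id) {
    for (size_t i = 0; i < g->len; i++) {
        if (g->ids[i] == id) {
            g->ids[i] = g->ids[--g->len];
            return 1;
        }
    }
    return 0;
}

/* Rebuild each key as prefix + id, hand it to the delete callback,
 * then empty the index. Returns the number of keys flushed. */
size_t group_clear(group_index *g, cache_delete_fn del, void *ctx) {
    char key[64];
    size_t n = g->len;
    for (size_t i = 0; i < n; i++) {
        snprintf(key, sizeof key, "%s%" PRIu32, g->prefix, g->ids[i]);
        del(key, ctx);
    }
    g->len = 0;
    return n;
}

void group_free(group_index *g) {
    free(g->ids);
    g->ids = NULL;
    g->len = g->cap = 0;
}

/* Demonstration callback: counts deletions instead of touching a real cache. */
void counting_delete(const char *key, void *ctx) {
    (void)key;
    ++*(size_t *)ctx;
}
```

Clearing a group touches only the IDs recorded in the index, never the rest of the cache. The index has to be kept in step with the cache on every insert and delete, which is the price of that shortcut.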
What I'm advocating here is a well-defined object that performs the operations you are looking for efficiently. Its memory use scales linearly with the number of entries you track, but since you store a numeric type for each item, you would need on the order of 100 million entries before you begin to feel real memory pain against a limit of, say, several hundred megabytes (assuming 4 bytes per ID, 100 million IDs come to roughly 400 MB).
Tamas Imrei beat me to suggesting an alternative strategy that I was already in the process of documenting, but it has some major flaws that I would like to discuss.
As defined in the backing C code, APCIterator is a linear-time operation over the complete data set when performing a search (using its constructor, public __construct ( string $cache [, mixed $search = null ...]] )).
This is clearly undesirable if the backing elements you are searching for represent a small percentage of your total data, since it will walk every single element in your cache to find the ones you want. Quoting apc_cache.c:
    apc_cache_entry_t* apc_cache_user_find(apc_cache_t* cache, char *strkey,
                                           int keylen, time_t t TSRMLS_DC)
    {
        slot_t** slot;
        ...
        slot = &cache->slots[h % cache->num_slots];
        while (*slot) {
            ...
            slot = &(*slot)->next;
        }
    }
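To make the cost of a search over the whole cache concrete, here is a toy model of a prefix scan over a chained hash table. It is an illustrative sketch only: `toy_slot` and `toy_prefix_scan` are hypothetical names and the layout is a simplification, not APC's real `slot_t` or iterator code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy chained hash table entry; a simplification, not APC's slot_t. */
typedef struct toy_slot {
    const char      *key;
    struct toy_slot *next;
} toy_slot;

/* Count keys that start with prefix by walking every chain in every bucket.
 * *visited receives the number of entries examined, which is always the
 * total entry count, no matter how few keys match. */
size_t toy_prefix_scan(toy_slot *const *buckets, size_t num_buckets,
                       const char *prefix, size_t *visited) {
    assert(buckets != NULL && prefix != NULL && visited != NULL);
    size_t plen = strlen(prefix);
    size_t matches = 0;
    *visited = 0;
    for (size_t b = 0; b < num_buckets; b++) {
        for (const toy_slot *s = buckets[b]; s != NULL; s = s->next) {
            ++*visited;
            if (strncmp(s->key, prefix, plen) == 0)
                matches++;
        }
    }
    return matches;
}
```

With a thousand article keys in a cache of a million entries, a scan like this still examines all million, whereas a separate index of IDs would touch only the thousand.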
Therefore, I would highly recommend using an efficient, pointer-based virtual grouping solution for your problem, as I sketched above. That said, if you are severely memory-constrained, the iterator approach may be the better choice, since it saves as much memory as possible at the expense of computation.
Best of luck with your application.