How to analyze cache key hit rates with Graphite?

I have a Rails application that makes extensive use of caching, and I want to know how effective caching is in different places in the application. Obviously, places with a low hit rate need attention. But first: measure it!

To get real data, I have set up a Graphite + statsd stack and instrumented the Dalli cache client with the statsd-instrument gem. All cache keys in the application have the form ['place', details...] , so I get metrics like the following in Graphite:

  • stats.cache.place1.hits
  • stats.cache.place1.misses
  • stats.cache.place2.hits
  • stats.cache.place2.misses
  • and so on.

Now I want to graph all the hit rates. I was able to come up with the following formula for a single place:

 divideSeries(stats.cache.place1.hits, sumSeries(stats.cache.place1.*)) 

This works well, but there are dozens of places, and I don't want to duplicate the formula for each one, not to mention that new places may appear over time.

So here is my question for the Graphite experts: is there a way to graph the hit rate for all places at once? I have seen the group* functions in the docs, but they confuse me.

Ideally, I want to segment my places into 4 categories:

  • High hit rate, many requests. Caching is doing its job.
  • Low hit rate, many requests. Needs attention.
  • High hit rate, few requests. Is caching needed here at all?
  • Low hit rate, few requests. Definitely remove the caching.

I would be very grateful for any ideas on using Graphite for this kind of analysis (I can fetch the JSON data and do the math myself, but I suspect there is an easier way).
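For the fallback the question mentions (fetching the JSON and doing the math yourself), the bucketing is straightforward to sketch. Below is a minimal Python example that parses a response in the shape returned by Graphite's render API (`?format=json`) and classifies each place into the 4 categories; the sample numbers, the `classify` helper, and its thresholds are all invented for illustration:

```python
import json
from collections import defaultdict

# Example payload in the shape returned by Graphite's render API.
# In practice you would fetch something like:
#   http://<graphite-host>/render?target=stats.cache.*.*&format=json
# The datapoint values below are made up for illustration.
sample = json.loads("""
[
  {"target": "stats.cache.place1.hits",   "datapoints": [[90, 0], [110, 60]]},
  {"target": "stats.cache.place1.misses", "datapoints": [[10, 0], [null, 60]]},
  {"target": "stats.cache.place2.hits",   "datapoints": [[5, 0],  [5, 60]]},
  {"target": "stats.cache.place2.misses", "datapoints": [[45, 0], [45, 60]]}
]
""")

# Sum hits and misses per place; targets look like stats.cache.<place>.<hits|misses>.
totals = defaultdict(lambda: {"hits": 0.0, "misses": 0.0})
for series in sample:
    *_, place, kind = series["target"].split(".")
    totals[place][kind] += sum(v for v, _ts in series["datapoints"] if v is not None)

def classify(hit_rate, requests, rate_threshold=0.8, volume_threshold=100):
    """Bucket a place into the 4 categories from the question.
    The thresholds are arbitrary placeholders -- tune them to your traffic."""
    busy = requests >= volume_threshold
    good = hit_rate >= rate_threshold
    if good and busy:
        return "high rate, many requests: caching is doing its job"
    if not good and busy:
        return "low rate, many requests: needs attention"
    if good:
        return "high rate, few requests: caching maybe unnecessary"
    return "low rate, few requests: candidate for removal"

for place, t in sorted(totals.items()):
    requests = t["hits"] + t["misses"]
    rate = t["hits"] / requests if requests else 0.0
    print(place, round(rate, 2), classify(rate, requests))
```

With the sample numbers above, place1 lands in the "caching is doing its job" bucket and place2 in "needs attention".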

1 answer

You can use globs at multiple levels, so for a global view of how all your caching is doing:

 divideSeries(stats.cache.*.hits, sumSeries(stats.cache.*.*)) 

For the 4 categories you mention, the mostDeviant function may be useful; it can help you find the places with the highest/lowest hit rates.

 mostDeviant(5, divideSeries(stats.cache.*.hits, sumSeries(stats.cache.*.*))) 
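To make that suggestion concrete: mostDeviant ranks series by their standard deviation and keeps the top N. Here is a toy Python sketch of that ranking, operating on plain lists instead of time series (the hit-rate numbers are invented):

```python
from statistics import pstdev

def most_deviant(series, n):
    """Return the n series with the largest standard deviation,
    mirroring how Graphite's mostDeviant ranks series by sigma."""
    return sorted(series, key=lambda name_vals: -pstdev(name_vals[1]))[:n]

hit_rates = [
    ("place1", [0.95, 0.94, 0.96]),  # stable, high hit rate
    ("place2", [0.10, 0.80, 0.20]),  # volatile -- worth a look
    ("place3", [0.50, 0.52, 0.49]),  # stable, middling
]

print(most_deviant(hit_rates, 1))  # place2 has the most volatile hit rate
```

The point of using it here is that places with wildly swinging hit rates are exactly the ones worth investigating first.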

Grouping the places into buckets by request volume and then graphing a separate hit ratio for each bucket is harder. Nested calls combining repeated groupByNode with highestAverage might work:

 highestAverage(groupByNode(groupByNode(stats.cache.*.*, 3, "sumSeries"), 2, "divideSeries"), 10) 

As a side note, with most cache eviction schemes being LRU (least recently used), there is little point in removing caching for rarely hit places: rarely used entries will not compete for cache space anyway.


Source: https://habr.com/ru/post/972684/

