Heroku Queue Time

Yesterday I ran a load test on my Rails application, running 8 dynos with 3 concurrent Unicorn workers on each. This is the New Relic output:

[New Relic chart]

As you can see, the Rails stack itself has pretty good response time (DB, Web, etc.), but the queue time is super terrible.

What can I do about this? Is this an inherent part of Heroku's performance, or does it just mean I need to add more dynos?

Any advice is appreciated.


Basically, break the problem into pieces and check each piece. Just throwing a bunch of requests at a Unicorn cluster is not necessarily a good way to measure throughput; there are many variables to account for (side note: check out Zed Shaw's "Programmers Need to Learn Statistics or I Will Kill Them All").

In addition, your question leaves out critical information needed to solve the mystery.

  • How many requests per second is each Unicorn worker handling?
  • How long does the overall test run, and do you allow time for any caches to warm up?
  • How many requests were processed in total?
  • I can see in the chart that the queue time drops significantly from the initial spike on the left side - any idea why? Is that startup time? Cache warming? A disproportionate burst of requests at the start of the test?

You are the only person who can answer these questions.

Queue time, if I understand Heroku's setup correctly, is essentially the time new requests spend waiting for an available Unicorn worker (or, more precisely, how long a request sits before it is picked up by a Unicorn worker). If you are pushing more load at the system than it can handle, then even though your application itself responds very quickly once it gets a request, there will still be a backlog of requests waiting for an available Unicorn worker to process them.

Starting from your initial setup, try varying the following in your tests (a minimal test-driver sketch follows the list):

  • The same total number of requests, but run the test for longer to see whether the caches warm up further and speed up response times (i.e. the Unicorn workers process more requests per second).
  • Adjust the number of requests per second against the same pool of available Unicorn workers, both up and down, and watch at which thresholds the queue time gets better or worse.
  • Simplify the test. First test just one Unicorn worker and find out how long it takes to warm up, how many requests per second it can handle, and at what point queue time starts to grow from the backlog. Then add Unicorn workers and repeat the tests, to find out whether 3 workers give you 3x the performance, or whether there is some percentage of overhead from adding more workers (for example, overhead from balancing the incoming requests), and whether that overhead is negligible or not, etc.
  • Make sure all requests are very similar. If some requests simply return a front page of 100% cached, non-dynamic content, their processing time will be much shorter than requests that have to generate a variable amount of dynamic content, and that will skew your test results significantly.
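
Here is a minimal sketch of the kind of single-variable run described above. The URL, request count, and concurrency are hypothetical placeholders, not anything from the original question; swap in your own endpoint and numbers:

```ruby
# Fire a fixed number of requests at one endpoint with a chosen concurrency
# and report per-request timings. Change one variable at a time between runs.
require "net/http"
require "uri"
require "benchmark"

URL         = URI("http://localhost:3000/")  # hypothetical endpoint
TOTAL       = 200                            # total requests for this run
CONCURRENCY = 3                              # e.g. match one dyno's worker count

timings    = Queue.new
per_thread = TOTAL / CONCURRENCY

threads = CONCURRENCY.times.map do
  Thread.new do
    per_thread.times do
      elapsed = Benchmark.realtime { Net::HTTP.get_response(URL) }
      timings << elapsed
    end
  end
end
threads.each(&:join)

results = []
results << timings.pop until timings.empty?
puts "requests: #{results.size}"
puts "mean:     #{(results.sum / results.size * 1000).round(1)} ms"
puts "max:      #{(results.max * 1000).round(1)} ms"
```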

Also, find out whether the chart of test results above shows an average, a 95th percentile with standard deviations, or some other measure.
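
As a rough illustration of why that distinction matters (the numbers below are made up, not taken from the chart), the same set of response times can look fine as a mean and bad at the 95th percentile:

```ruby
response_times_ms = [80, 90, 95, 100, 105, 110, 120, 450, 900, 1200]

sorted = response_times_ms.sort
mean   = sorted.sum.to_f / sorted.size
median = sorted[sorted.size / 2]
p95    = sorted[(sorted.size * 0.95).ceil - 1]

puts "mean:   #{mean.round} ms"  # dragged up by a few slow outliers
puts "median: #{median} ms"      # what a typical request sees
puts "p95:    #{p95} ms"         # what the slowest 1-in-20 requests see
```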

Only after you break the problem into its component parts will you know with any predictability whether adding more Unicorn workers will help. Looking at this one chart and asking, "Should I just add more Unicorn workers?" is like having a slow computer and asking, "Should I just add more RAM?" It skips the step of understanding why things are slow, and throwing more resources at the problem, while it may help, won't give you any deeper understanding of why it is slow. Because of that (and especially on Heroku), you can end up overpaying for more dynos when you don't need them. If instead you get to the root of what is causing more queueing than expected, you will be in much better shape.

This approach, of course, is not unique to Heroku. Running experiments, changing variables, and recording the measurements will let you understand what is going on behind these performance numbers. Understanding why will let you take concrete, educated steps that should have mostly predictable effects on overall performance.

After all this, you may find that yes, the best way to improve performance in your particular case is to add more Unicorn workers, but at least you will know why and when to do it, and you will have a solid estimate of how many to add.


I had essentially written up another question, then sat back and realized that I had worked through this exact question a week before and knew the answer to both.

What jefflunt said is basically 100% true, but since I'm here, I'll spell it out.

There are 2 solutions:

  • Add Unicorn workers (a config sketch follows this list).
  • Reduce the total transaction time of your requests.
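
For the first option, here is a minimal Unicorn config sketch. WEB_CONCURRENCY is just a conventional environment variable (an assumption, not something Heroku sets for you), and the right worker count depends on how much memory your dyno actually has:

```ruby
# config/unicorn.rb - sketch of scaling workers per dyno
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)  # workers per dyno
timeout 30
preload_app true

before_fork do |_server, _worker|
  # Disconnect shared connections in the master so each worker reconnects cleanly.
  ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
end

after_fork do |_server, _worker|
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end
```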

They basically boil down to the same exact concept, but:

  • If you have 15 thousand transactions per minute, that is 250 transactions per second.
  • If the average transaction time is 100 ms, each worker can handle 10 transactions per second (1000 ms / 100 ms per transaction).
  • If you have 8 dynos with 3 workers each, you have 24 workers.
  • 24 workers at 10 transactions per second each means your current setup can handle about 240 transactions per second (the arithmetic is sketched below).
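
The same back-of-the-envelope arithmetic as a small script, using the example figures from this answer; substitute your own traffic and timing numbers:

```ruby
transactions_per_minute = 15_000   # observed traffic (example figure)
avg_transaction_ms      = 100      # average time per request
dynos                   = 8
workers_per_dyno        = 3

demand_tps   = transactions_per_minute / 60.0          # 250 requests/s arriving
worker_tps   = 1000.0 / avg_transaction_ms             # 10 requests/s per worker
capacity_tps = dynos * workers_per_dyno * worker_tps   # 240 requests/s available

puts format("demand: %.0f req/s, capacity: %.0f req/s", demand_tps, capacity_tps)
if demand_tps > capacity_tps
  puts "short by #{((demand_tps - capacity_tps) / worker_tps).ceil} worker(s)"
end
```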

Of course, this is only the roughest framework for evaluating the problem, especially since traffic is never evenly weighted, and looking at a median or 95th percentile rather than a plain average usually gives a better picture of what most requests actually experience, but it will get you close to the right number for understanding what capacity you need.

