When does Google App Engine start or stop an instance?

We have an App Engine application that handles an average of 0.5 requests per second. It would seem that all of these requests could be handled by a single instance running our Go application as the default version.

However, App Engine sometimes starts a second instance (and occasionally a third), which then seems to do almost nothing, handling only one or two requests. Here is an example.

[screenshot: image not preserved]

Shutting this instance down manually does not seem to do any harm, so my question is: why doesn't App Engine kill the instance after it has gone without requests for a while? (In the example above, it received four requests in the last hour, and the ratio of requests to instance age is often even lower.)

Update:

A similar situation occurs when an instance is started for a different version. App Engine only seems to kill the instance after it has gone several hours without receiving any requests.

In the "Application Settings" → "Performance" section:

  • Idle Instances is set to Automatic - 20
  • Pending Latency is set to 150 ms - 250 ms
2 answers

I wish I knew what controls whether and when App Engine kills unused instances, but I can't find any documentation on it.

To avoid running extra instances, I think the main thing you can do here is to increase the pending latency:

The Pending Latency slider controls how long requests spend in the pending queue before being served by an instance of the default version of your application. If the minimum pending latency is high, App Engine will allow requests to wait rather than start new instances to process them. This can reduce the number of instances your application uses, but can result in more user-visible latency.

Even at only 4 requests per hour, if two of them happen to arrive close together, I believe that can cause a new instance to be started.

The logs may also contain a small amount of information about why it started the new instance.


According to the section on application scaling in the Google App Engine documentation:

Instance Scaling

Each instance has its own queue for incoming requests. App Engine monitors the number of requests waiting in each instance's queue. If App Engine detects that the queues for an application are getting too long due to increased load, it automatically creates a new instance of the application to handle that load.

App Engine also scales instances down when request volumes decrease. This scaling helps ensure that all of your application's current instances are being used with optimal efficiency and cost-effectiveness.
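The quoted behaviour can be sketched as a toy model in Go. This is only an illustration of queue-driven scaling in general: `desiredInstances` and the `maxQueue` threshold are hypothetical names I made up, and the real scheduler evidently keeps idle instances around much longer than this model would.

```go
package main

import "fmt"

// desiredInstances is a toy model of queue-driven scaling, not App Engine's
// actual algorithm. queueLens holds the number of waiting requests per
// instance; maxQueue is a hypothetical per-instance queue limit.
func desiredInstances(queueLens []int, maxQueue int) int {
	n := len(queueLens)
	total := 0
	for _, q := range queueLens {
		total += q
	}
	// Scale up: on average the queues are longer than the limit.
	if n == 0 || total > n*maxQueue {
		return n + 1
	}
	// Scale down: one fewer instance could still absorb all queued requests.
	if n > 1 && total <= (n-1)*maxQueue {
		return n - 1
	}
	return n
}

func main() {
	fmt.Println(desiredInstances([]int{5, 6}, 4)) // queues too long: grow
	fmt.Println(desiredInstances([]int{4, 3}, 4)) // within limits: keep
	fmt.Println(desiredInstances([]int{1, 0}, 4)) // mostly idle: shrink
}
```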

It also states that you can "specify a minimum number of idle instances" and "optimize for high performance or low cost" in the admin console.

Try setting the "Idle Instances" field to something like 3 - 5, choose "optimize for low cost", and see whether that affects how long it takes for instances to be shut down.


Source: https://habr.com/ru/post/1492954/

