Can a slow network force a Python application to use *more* CPU?

Let's say we have a system like this:

```
                                                                    ______
                             { application instances ---network--- (______)
                             { application instances ---network--- |      |
requests ---> load balancer { application instances ---network--- | data |
                             { application instances ---network--- | base |
                             { application instances ---network--- \______/
```

A request arrives, the load balancer sends it to an application server instance, and the application instances access the database (elsewhere on the local network). Application instances can be either separate processes or separate threads. To cover all the bases, let's say there are several identical processes, each of which has a pool of identical application threads.
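For concreteness, here is a minimal sketch of that topology, assuming names like `handle_request` and `thread_worker` that are purely hypothetical, with `time.sleep` standing in for the database round trip: two worker processes, each running a small pool of identical threads that pull requests off a shared queue.

```python
import multiprocessing
import threading
import time

def handle_request(request_id):
    time.sleep(0.05)                 # stand-in for a blocking database round trip
    return f"response for {request_id}"

def thread_worker(requests):
    while True:
        request_id = requests.get()
        if request_id is None:       # sentinel: shut this thread down
            return
        handle_request(request_id)

def process_main(requests, threads_per_process=4):
    threads = [threading.Thread(target=thread_worker, args=(requests,))
               for _ in range(threads_per_process)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    requests = multiprocessing.Queue()
    processes = [multiprocessing.Process(target=process_main, args=(requests,))
                 for _ in range(2)]          # two identical processes
    for p in processes:
        p.start()
    for i in range(40):                      # simulated incoming requests
        requests.put(i)
    for _ in range(2 * 4):                   # one sentinel per thread
        requests.put(None)
    for p in processes:
        p.join()
```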

If the database is slow, or the network gets bogged down, request-serving throughput will obviously degrade.

Now, in all my experience prior to Python, this is accompanied by a corresponding decrease in CPU usage by the application instances: they spend more time blocked on I/O and less time doing CPU-intensive things.

However, I'm told this is not the case with Python: under certain circumstances, this situation can cause Python's CPU usage to go up, possibly all the way to 100%. Something about the Global Interpreter Lock and multiple threads supposedly makes Python spend all its time switching between threads, checking to see if any of them has received a response from the database yet. "Hence the recent rise of event-driven libraries."

Is that right? Do Python application threads actually use more CPU when their I/O latency increases?
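One way to put the question to the test, as a sketch rather than a rigorous benchmark: compare the CPU time the process burns while its threads wait on fast versus slow simulated I/O. Here `time.sleep` stands in for a blocked database call; like most blocking C-level calls, it releases the GIL.

```python
import threading
import time

def simulated_db_call(latency):
    time.sleep(latency)   # blocked in C code; the GIL is released here

def cpu_seconds_used(latency, n_threads=8, n_calls=5):
    def worker():
        for _ in range(n_calls):
            simulated_db_call(latency)
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.process_time()          # CPU time, all threads combined
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.process_time() - start

if __name__ == "__main__":
    print("CPU seconds, fast I/O:", cpu_seconds_used(0.001))
    print("CPU seconds, slow I/O:", cpu_seconds_used(0.1))
    # If the GIL-thrashing story were true, the slow case would burn
    # noticeably more CPU; with blocked threads it stays near zero.
```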

+4
4 answers

In theory, no; in practice, possibly. It depends on what you are doing.

There is a full hour of video and a PDF about it, but in essence it comes down to some unforeseen consequences of the GIL with CPU-bound and I/O-bound threads on multi-core processors. Basically, a thread waiting on I/O needs to wake up, so Python starts "pre-empting" the other threads on every tick (instead of every 100 ticks). The I/O thread then has trouble taking the GIL back from the CPU-bound thread, which causes the cycle to repeat.

This is greatly simplified, but that is the gist of it; the video and slides have more detail. It manifests itself on, and poses a big problem for, multi-core machines. It can also happen if the process receives signals from the OS (since that invokes the thread-switching code as well).
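For reference, the tick-based pre-emption described above is the old GIL (CPython 2.x); CPython 3.2 replaced ticks with a time-based switch interval. A minimal sketch of inspecting and tuning it:

```python
import sys

# CPython 3.2+ replaced the 100-tick check interval with a time-based
# switch interval (default 5 ms): a CPU-bound thread holds the GIL for
# at most this long before other threads get a chance to grab it.
# (Python 2 exposed the old tick count via sys.setcheckinterval().)
print(sys.getswitchinterval())   # 0.005 by default

# Raising it reduces switching overhead but makes I/O threads wait
# longer behind CPU-bound ones; lowering it does the reverse.
sys.setswitchinterval(0.001)
```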

Of course, as other posters have said, this goes away if each application instance has its own process.

By the way, the slides and video also explain why you sometimes cannot CTRL+C out of a Python program.

+6

The key is to run application instances in separate processes. Otherwise, multithreading problems are likely to occur.
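A minimal sketch of that advice, assuming the standard-library `multiprocessing` pool is an option (the `handle` function is a hypothetical stand-in for real request work): each worker process gets its own interpreter and its own GIL, so workers cannot contend with each other.

```python
from multiprocessing import Pool

def handle(request_id):
    # CPU-bound or I/O-bound work runs in a separate process here, with
    # its own interpreter and GIL, independent of the other workers.
    return request_id, sum(i * i for i in range(10_000))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        for result in pool.map(handle, range(8)):
            print(result)
```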

+1

No, it is not. Stop spreading FUD.

If your Python application is blocked on a call into a C API, e.g. a blocking socket or a file read, it has most likely released the GIL.
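A short sketch illustrating this: a thread blocked in a C-level call (here `socket.accept`) releases the GIL, so a pure-Python counter in another thread keeps making progress the whole time.

```python
import socket
import threading
import time

def blocked_on_io():
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))      # any free port; nobody will connect
    srv.listen(1)
    srv.settimeout(1.0)
    try:
        srv.accept()                # blocks inside C; the GIL is released
    except socket.timeout:
        pass
    finally:
        srv.close()

if __name__ == "__main__":
    t = threading.Thread(target=blocked_on_io)
    t.start()
    count = 0
    deadline = time.monotonic() + 1.0
    while time.monotonic() < deadline:
        count += 1                  # pure-Python work; needs the GIL
    t.join()
    print(f"counter advanced {count:,} times while the other thread blocked")
```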

+1

> Something about the Global Interpreter Lock and multiple threads seems to be causing Python to spend all its time switching between threads, checking to see if any of them has received a response from the database yet.

This is completely unfounded. If all threads are blocked on I/O, Python should be using 0% CPU. If there is one unblocked thread, it will be able to run without contending for the GIL; it will periodically release and reacquire the GIL, but it does no work "checking on" the other threads.

However, on multi-core systems, an I/O-bound thread may take a while to reacquire the GIL if a CPU-bound thread is running, and response time can drop as a result (see this presentation). This should not be a problem for most servers, though.
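A rough sketch of that latency effect (the "convoy effect" from David Beazley's GIL talk): a sleeping thread's wake-up is delayed when a CPU-bound thread holds the GIL. Numbers will vary by machine and interpreter version.

```python
import threading
import time

stop = threading.Event()

def cpu_hog():
    while not stop.is_set():
        sum(i * i for i in range(10_000))   # holds the GIL between switches

def worst_extra_latency(n=50):
    worst = 0.0
    for _ in range(n):
        t0 = time.monotonic()
        time.sleep(0.001)                   # the "I/O" completes after 1 ms...
        worst = max(worst, time.monotonic() - t0 - 0.001)
    return worst                            # ...but reacquiring the GIL adds delay

if __name__ == "__main__":
    print("worst extra latency, alone:    %.4f s" % worst_extra_latency())
    hog = threading.Thread(target=cpu_hog)
    hog.start()
    print("worst extra latency, with hog: %.4f s" % worst_extra_latency())
    stop.set()
    hog.join()
```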

+1

Source: https://habr.com/ru/post/1286789/

