Python logging causing latencies?

I am developing a real-time REST API using Python 3 + Bottle/uWSGI. I am experiencing delays in my code, sometimes of 100 ms, which matter in my application.

Using the logging module, I tried to pinpoint the slow parts of my code by logging how long individual code blocks took to run. I know this is a crude way to profile code, but sometimes it does the job.
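
To make it concrete, this is the kind of crude timing I mean (a minimal sketch; the sleep stands in for the block under suspicion):

    import logging
    import time

    logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(message)s')

    start = time.time()
    time.sleep(0.05)  # stand-in for the code block being measured
    logging.info('block took %.1f ms', (time.time() - start) * 1000)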

Although I detected some slow parts, I was still missing something: the individual parts seemed to take about 10 ms each, yet very often they took 100 ms in total. After some more crazy experiments that nearly drove me nuts, I arrived at the following:

    t = round(100*time.time())
    logging.info('[%s] Foo' % t)
    logging.info('[%s] Bar' % t)

Surprisingly, it gives:

    2014-07-16 23:21:23,531 [140554568353] Foo
    2014-07-16 23:21:24,312 [140554568353] Bar

Hard as it is to believe, these are two consecutive calls to logging.info(), and yet there is a gap of almost 800 ms between them. Can anyone tell me what is going on? It is worth noting that when there are several info() calls, the delay appears only once per API method call, most often at the very beginning (after the first call). My only hypothesis is disk latency: several workers (but not that many!) run at the same time, and I use a spinning disk, not an SSD. But I thought writes were buffered and that the OS would flush them asynchronously for me. Am I wrong in my assumptions? Should I avoid logging entirely to get rid of the delays?

EDIT

Based on Vinay Sajip's suggestion, I switched to the following initialization code:

    import logging
    import logging.handlers
    import queue

    log_que = queue.Queue(-1)
    queue_handler = logging.handlers.QueueHandler(log_que)
    log_handler = logging.StreamHandler()
    queue_listener = logging.handlers.QueueListener(log_que, log_handler)
    queue_listener.start()
    logging.basicConfig(level=logging.DEBUG,
                        format="%(asctime)s %(message)s",
                        handlers=[queue_handler])

It seems to work fine (if I comment out queue_listener.start(), there is no output), but the same delays still appear. I do not understand how this is possible; the call should be non-blocking. I also put gc.collect() at the end of each request to make sure the problem is not caused by the garbage collector, with no effect. I also tried disabling logging altogether. The delays disappeared, so I think their source must be in the logging module...

3 answers

You can use asynchronous handlers (QueueHandler and the corresponding QueueListener, added in Python 3.2 and described in this post) and do the I/O processing of logging events in a separate thread or process.
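
A minimal sketch of the thread-based variant (the names here are placeholders; note the stop() call, which flushes any records still queued at shutdown):

    import atexit
    import logging
    import logging.handlers
    import queue

    # Handlers attached to loggers only enqueue records (cheap and
    # non-blocking); the listener thread does the actual I/O.
    log_queue = queue.Queue(-1)
    logging.basicConfig(level=logging.DEBUG,
                        handlers=[logging.handlers.QueueHandler(log_queue)])

    listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler())
    listener.start()
    atexit.register(listener.stop)  # flush queued records on exit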


Try the asynchronous logging provided by Logbook

As hasan suggested, an asynchronous log handler can be the way to go.

I recently tried Logbook and can say it will provide you with everything you need for this: a ZeroMQHandler as well as a ZeroMQSubscriber.
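
Roughly like this (an untested sketch from memory; the address and file name are placeholders, so check the Logbook documentation for the exact API):

    # --- API process: publish records over ZeroMQ, no disk I/O here ---
    import logbook
    from logbook.queues import ZeroMQHandler

    handler = ZeroMQHandler('tcp://127.0.0.1:5000')  # placeholder address
    log = logbook.Logger('api')
    with handler.applicationbound():
        log.info('request handled')

    # --- separate log process: receive records and write them out ---
    import logbook
    from logbook.queues import ZeroMQSubscriber

    subscriber = ZeroMQSubscriber('tcp://127.0.0.1:5000')
    with logbook.FileHandler('api.log').applicationbound():
        subscriber.dispatch_forever()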


This may depend on the logging handler. In my experience, PostgreSQL used as a logging handler is a very poor choice for speed. FileHandler may give you very good results, but if your system is very I/O heavy, even simple file writes can be slow. I suggest using an asynchronous handler, e.g. sending the logs via UDP to a dedicated process.
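
For instance, with nothing but the standard library (the host, port, and file name here are made up for the sketch):

    # --- API process: fire-and-forget UDP, nothing waits on disk ---
    import logging
    import logging.handlers

    logging.basicConfig(level=logging.DEBUG,
                        handlers=[logging.handlers.DatagramHandler('127.0.0.1', 9999)])
    logging.info('handled request')

    # --- dedicated log process (separate script): collect and write ---
    import logging
    import pickle
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('127.0.0.1', 9999))
    collector = logging.getLogger('collector')
    collector.addHandler(logging.FileHandler('api.log'))
    while True:
        data, _ = sock.recvfrom(65536)
        # DatagramHandler prefixes each pickled record with a 4-byte length
        record = logging.makeLogRecord(pickle.loads(data[4:]))
        collector.handle(record)

Keep in mind that UDP delivery is not guaranteed, which is usually an acceptable trade-off when request latency is what matters.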

