Running the Django Celery Queue

I am using Celery with RabbitMQ to perform asynchronous tasks in my Django application. I have just started working with Celery.

Tasks are executed and everything works fine right after I start the workers.

The problem is that task execution stops after a while: after a couple of hours, a day, or a couple of days. I only notice this from the consequences of tasks not completing. When I restart Celery, all pending tasks are processed and everything returns to normal.

My questions:

  • How can I debug this (where should I start looking) to find out what the problem is?
  • How can I set up a mechanism that notifies me as soon as the problem occurs?

My stack: Django 1.4.8, Celery 3.1.16, RabbitMQ, Supervisord.

Thanks, Andy

1 answer

(1) If your celery worker sometimes gets stuck, you can use strace and lsof to find out which system call it is stuck on.

For instance:

$ strace -p 10268 -s 10000
Process 10268 attached - interrupt to quit
recvfrom(5,

10268 is the PID of the celery worker; recvfrom(5 means the worker is blocked receiving data from file descriptor 5.

Then you can use lsof to check what file descriptor 5 is in that worker process.

lsof -p 10268
COMMAND   PID USER   FD   TYPE    DEVICE SIZE/OFF      NODE NAME
......
celery  10268 root    5u  IPv4 828871825      0t0       TCP 172.16.201.40:36162->10.13.244.205:wap-wsp (ESTABLISHED)
......

This indicates that the worker is stuck on a TCP connection (you can see 5u in the FD column).

In this case the culprit was an HTTP call made with the python requests library without a timeout: the request hung forever on a dead connection. The fix is to always pass a timeout to requests.
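
For example, a minimal sketch of such a fix (the task name, URL parameter and timeout value are illustrative, not taken from the original post):

import requests
from celery import shared_task

@shared_task
def fetch_data(url):
    # Without a timeout, requests can block forever on a dead connection --
    # exactly the stuck recvfrom() call shown by strace above.
    # With a timeout, the call raises requests.exceptions.Timeout instead of hanging.
    response = requests.get(url, timeout=30)  # seconds; value is illustrative
    response.raise_for_status()
    return response.text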

(2) As for notification, you can monitor the RabbitMQ queues: if the number of waiting messages keeps growing, the workers are most likely stuck, and you can trigger an alert.
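
One possible way to do this, sketched below, assumes the RabbitMQ management plugin is enabled and the default "celery" queue is used; the host, credentials, threshold and email addresses are placeholders you would need to adjust:

import smtplib
from email.mime.text import MIMEText

import requests

# Placeholders: adjust host, credentials, vhost (%2F is the default "/"),
# queue name, threshold and email addresses for your setup.
RABBITMQ_API = "http://localhost:15672/api/queues/%2F/celery"
THRESHOLD = 100  # alert once this many messages are waiting

def check_queue():
    # The RabbitMQ management plugin exposes queue statistics over HTTP;
    # "messages" is the total of ready + unacknowledged messages.
    stats = requests.get(RABBITMQ_API, auth=("guest", "guest"), timeout=10).json()
    backlog = stats.get("messages", 0)
    if backlog > THRESHOLD:
        send_alert(backlog)

def send_alert(backlog):
    msg = MIMEText("Celery queue backlog is %d messages; workers may be stuck." % backlog)
    msg["Subject"] = "Celery queue alert"
    msg["From"] = "alerts@example.com"
    msg["To"] = "me@example.com"
    server = smtplib.SMTP("localhost")
    server.sendmail(msg["From"], [msg["To"]], msg.as_string())
    server.quit()

if __name__ == "__main__":
    check_queue()  # e.g. run from cron every few minutes

If the alert fires, you can then attach strace/lsof to the running worker as shown above to see which call it is blocked on.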


See also:

https://www.caktusgroup.com/blog/2013/10/30/using-strace-debug-stuck-celery-tasks/
