Problems with the Celery Daemon

Question

We are having problems with our celeryd daemon, which is very flaky. We use a Fabric deployment script to restart the daemon whenever we push changes, but for some reason this causes serious problems.

Whenever the deployment script runs, the celery processes are left in some pseudo-dead state. They (unfortunately) still consume tasks from rabbitmq, but they do not actually do anything. Confusingly, a brief look suggests that everything is fine in this state: celeryctl status shows one node online, and ps aux | grep celery shows 2 running processes.

However, trying to run /etc/init.d/celeryd stop manually results in the following error:

start-stop-daemon: warning: failed to kill 30360: No such process

While in this state, running celeryd start appears to work correctly, but actually does nothing. The only way to fix the problem is to manually kill the running celery processes and then restart them.
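The "No such process" warning suggests that the PID recorded for the daemon no longer matches a live process. As a rough way to confirm that, here is a minimal Python sketch that checks whether the PID stored in a PID file is still running. The function names are illustrative, and the PID file location is an assumption: use whatever path your init script is configured with.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except OSError as exc:
        # ESRCH: no such process. EPERM: it exists but belongs to another user.
        return exc.errno == errno.EPERM
    return True


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if it is missing or malformed."""
    try:
        with open(pidfile) as fh:
            return int(fh.read().strip())
    except (IOError, OSError, ValueError):
        return None


def pidfile_is_stale(pidfile):
    """Return True when pidfile names a PID that is no longer running."""
    pid = read_pid(pidfile)
    return pid is not None and not pid_is_running(pid)
```

Calling pidfile_is_stale() with the daemon's PID file path (for example a hypothetical /var/run/celeryd.pid) while ps still lists celery workers would indicate that the init script has lost track of the processes it started.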

Any ideas what is going on here? We do not have full confirmation, but we think the problem also develops on its own after a few days, without any deployment (there is no activity on the machine, as it is currently a test server).

django daemon celery start-stop-daemon django-celery

John Jul 01 '11 at 17:15

1 answer

Idan gazit · Answer 1 · 2011-08-01T06:54:43+0000

I can’t say that I know what ails your setup, but I have always used supervisord to run celery. Maybe the problem has to do with upstart? Regardless, I have never experienced this with celery running under supervisord.

For good measure, here is an example of a supervisor configuration for celery:

    [program:celeryd]
    directory=/path/to/project/
    command=/path/to/project/venv/bin/python manage.py celeryd -l INFO
    user=nobody
    autostart=true
    autorestart=true
    startsecs=10
    numprocs=1
    stdout_logfile=/var/log/sites/foo/celeryd_stdout.log
    stderr_logfile=/var/log/sites/foo/celeryd_stderr.log

    ; Need to wait for currently executing tasks to finish at shutdown.
    ; Increase this if you have very long running tasks.
    stopwaitsecs = 600

Restarting celeryd in my fabfile is then as simple as issuing sudo supervisorctl restart celeryd.
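For illustration, here is a minimal sketch of such a restart step in plain Python. It uses the standard subprocess module rather than Fabric so that it runs without extra dependencies. The function names and the injectable run parameter are hypothetical, not part of supervisor or Fabric; the program name celeryd matches the [program:celeryd] section above.

```python
import subprocess


def restart_command(program="celeryd"):
    """Build the supervisorctl invocation that restarts one program."""
    return ["sudo", "supervisorctl", "restart", program]


def restart_worker(program="celeryd", run=subprocess.check_call):
    """Restart the program under supervisord.

    With the default runner, a non-zero exit status raises
    subprocess.CalledProcessError, so a failed restart stops the deploy.
    """
    return run(restart_command(program))
```

Calling restart_worker() at the end of a deploy step runs the same command as above. Passing a different run callable makes it possible to exercise the step without touching a real supervisord.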
