Deploying a web application with Fabric and Supervisor - SIGHUP causes the server to be terminated

We use Fabric to deploy a Python web application. During deployment, the web application is installed on the server with buildout, and the script that launches supervisord is generated by the collective.recipe.supervisor recipe. This script is called at the end of the deployment process by the Fabric script. The problem is that when the deployment script finishes, a SIGHUP signal is sent to the supervisord process, which causes it to restart (according to this line: https://github.com/Supervisor/supervisor/blob/master/supervisor/supervisord.py#L300 ), but for some reason the web application does not come back up after the restart. There is no log output after the following:

    2012-10-24 15:23:51,510 WARN received SIGHUP indicating restart request
    2012-10-24 15:23:51,511 INFO waiting for app-server to die
    2012-10-24 15:23:54,650 INFO waiting for app-server to die
    2012-10-24 15:23:57,653 INFO waiting for app-server to die
    2012-10-24 15:24:00,657 INFO waiting for app-server to die
    2012-10-24 15:24:01,658 WARN killing 'app-server' (28981) with SIGKILL
    2012-10-24 15:24:01,659 INFO stopped: app-server (terminated by SIGKILL)

So, I have two questions. First: does anyone know why supervisord restarts on SIGHUP? I could not find an explanation for this, and there is no command-line option that would disable this behavior. Second: how do we fix the problem we are facing? We tried starting supervisord with nohup, but it still receives the SIGHUP. The strange thing is that this does not happen when I log in to the server, start supervisord manually, and log out.
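For context, the Fabric task that launches supervisord at the end of the deploy looks roughly like this (a minimal sketch; the task name and paths are illustrative, not our exact code):

    # fabfile.py - sketch of the deploy task (names and paths are illustrative)
    from fabric.api import run, cd

    def deploy():
        with cd('/home/username/app_directory'):
            run('bin/buildout')     # installs the app and generates bin/supervisord
            run('bin/supervisord')  # this is where the SIGHUP trouble starts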

Here is the supervisord startup script generated by buildout:

    #!/usr/bin/python2.6

    import sys
    sys.path[0:0] = [
        '/home/username/.buildout/eggs/supervisor-3.0b1-py2.6.egg',
        '/home/username/.buildout/eggs/meld3-0.6.9-py2.6.egg',
        '/home/username/.buildout/eggs/distribute-0.6.30-py2.6.egg',
        ]

    import sys; sys.argv.extend(["-c","/home/username/app_directory/parts/supervisor/supervisord.conf"])

    import supervisor.supervisord

    if __name__ == '__main__':
        sys.exit(supervisor.supervisord.main())

And here is the supervisord configuration file, also generated by buildout:

    [supervisord]
    childlogdir = /home/username/app_directory/var/log
    logfile = /home/username/app_directory/var/log/supervisord.log
    logfile_maxbytes = 50MB
    logfile_backups = 10
    loglevel = info
    pidfile = /home/username/app_directory/var/supervisord.pid
    umask = 022
    nodaemon = false
    nocleanup = false

    [unix_http_server]
    file = /home/username/app_directory/supervisor.sock
    username = username
    password = apasswd
    chmod = 0700

    [supervisorctl]
    serverurl = unix:///home/username/app_directory/supervisor.sock
    username = username
    password = apasswd

    [rpcinterface:supervisor]
    supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

    [program:app-server]
    command = /home/username/app_directory/bin/gunicorn --bind 0.0.0.0:5000 app:wsgi
    process_name = app-server
    directory = /home/username/app_directory/bin
    priority = 50
    redirect_stderr = false
    directory = /home/username/app_directory

We do not want to just install a patched version of supervisor before we really understand the problem, so any information would be highly appreciated.

Thank you in advance

+4
4 answers

Restarting or reloading on SIGHUP is a common convention in Unix system programming. The question is why you get the SIGHUP after deployment completes. Since supervisord daemonizes correctly (you can start it, log out, and it keeps running), the restart signal may be sent to supervisord by buildout to indicate that the webapp needs to be restarted because its code has changed.

So supervisord initiates a shutdown of the application in order to start it again with the new code. But the application does not stop within the configured timeout, so supervisord decides it is hung and kills it with SIGKILL.

To solve the problem, you need to teach the application to shut down cleanly when supervisord asks it to.
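Supervisord sends SIGTERM by default when it stops a program. A minimal sketch of handling it in the app (assuming the app installs its own signal handlers; a stock gunicorn master already handles SIGTERM itself):

    import signal
    import sys

    def handle_sigterm(signum, frame):
        # Clean up (close sockets, flush state), then exit promptly so
        # supervisord does not escalate to SIGKILL after its stop timeout.
        sys.exit(0)

    signal.signal(signal.SIGTERM, handle_sigterm)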

+1

The supervisor docs clearly state that sending SIGHUP to the supervisord process will "stop all processes, reload the configuration from the first configuration file found and restart all processes."

ref - http://supervisord.org/running.html#signal-handlers
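You can trigger that documented reload deliberately and compare the logs, e.g. (using the pidfile path from your config):

    kill -HUP $(cat /home/username/app_directory/var/supervisord.pid)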

Your process may be misbehaving: it looks like supervisord made several attempts to stop it cleanly, then decided it needed a hard kill:

    process.py:560

    # kill processes which are taking too long to stop with a final
    # sigkill.  if this doesn't kill it, the process will be stuck
    # in the STOPPING state forever.
    self.config.options.logger.warn(
        'killing %r (%s) with SIGKILL' % (self.config.name, self.pid))
    self.kill(signal.SIGKILL)

Maybe the kill call doesn't work?
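Note that the ten seconds between the first "waiting for app-server to die" message and the SIGKILL match supervisord's default stopwaitsecs of 10. If the app genuinely needs longer to exit, you can raise the timeout per program (value illustrative):

    [program:app-server]
    stopwaitsecs = 30    ; seconds to wait after the stop signal before SIGKILL (default 10)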

+1

You may have hit this bug: https://github.com/Supervisor/supervisor/issues/121

A workaround would be to downgrade supervisor until the fix makes it into a released version.
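If supervisor is pinned through buildout, the downgrade is a one-line version pin, e.g. (a sketch, assuming your buildout is already wired to a [versions] section):

    [versions]
    supervisor = 3.0a10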

+1

Ran into exactly the same problem; downgrading to 3.0a10 fixed it.

0

Source: https://habr.com/ru/post/1441821/
