Unicorn workers die for no reason

All Unicorn workers die silently, with no indication of why, and I cannot find any evidence that an external process is killing them. I'm new to diagnosing this kind of problem, and after hours of research and experimentation I'm at a standstill.

Background: this is a Rails 4.1 application on Ruby 2.0, running under nginx and Unicorn on an Ubuntu 14.04 server.

unicorn.rb

working_directory "/home/deployer/apps/ourapp/current"
pid "/home/deployer/apps/ourapp/current/tmp/pids/unicorn.pid"
stderr_path "/home/deployer/apps/ourapp/current/log/unicorn.log"
stdout_path "/home/deployer/apps/ourapp/current/log/unicorn.log"

listen "/tmp/unicorn.ourapp.sock"
worker_processes 2
timeout 30

Excerpt from unicorn.log (the last lines before it dies and the first lines after the restart)

I, [2016-08-28T19:54:01.685757 #19559]  INFO -- : worker=1 ready
I, [2016-08-28T19:54:01.817464 #19556]  INFO -- : worker=0 ready
I, [2016-08-29T09:19:14.818267 #30343]  INFO -- : unlinking existing socket=/tmp/unicorn.ourapp.sock
I, [2016-08-29T09:19:14.818639 #30343]  INFO -- : listening on addr=/tmp/unicorn.ourapp.sock fd=10
I, [2016-08-29T09:19:14.818807 #30343]  INFO -- : worker=0 spawning...
I, [2016-08-29T09:19:14.824358 #30343]  INFO -- : worker=1 spawning...

Some information:

  • After anywhere from 8 to 20 hours of running, Unicorn dies.
  • There are no errors in the Unicorn log.
  • I searched everything in /var/log for evidence of killed processes and could find only one unrelated process that was killed a few days ago.
  • , , 400 . 480mb , , .
  • ... 0,1%, .
  • . , , - New Relic Linode Longview.
  • .log , New Relic. Completed 200 OK in 264ms, .
  • , , .
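The search of /var/log for killed processes mentioned above could be sketched in Ruby roughly as follows. This is a minimal sketch, not a definitive check: the regular expressions are assumptions based on messages the Linux OOM killer commonly writes to the kernel log, the exact wording differs between kernel versions, and the helper name find_oom_events is hypothetical.

```ruby
# Patterns the Linux OOM killer commonly writes to the kernel log.
# Assumption: the wording varies by kernel version, so adjust as needed.
OOM_PATTERNS = [
  /Out of memory: Kill(?:ed)? process (?<pid>\d+) \((?<name>[^)]+)\)/,
  /Killed process (?<pid>\d+) \((?<name>[^)]+)\)/,
  /(?<name>\S+) invoked oom-killer/
].freeze

# Hypothetical helper: takes any enumerable of log lines and returns
# one hash per line that looks like an OOM killer message.
def find_oom_events(lines)
  lines.each_with_object([]) do |line, events|
    OOM_PATTERNS.each do |pattern|
      match = pattern.match(line)
      next unless match

      events << {
        # Not every pattern captures a PID, so check before reading it.
        pid: match.names.include?("pid") ? match[:pid].to_i : nil,
        name: match[:name],
        line: line.chomp
      }
      break # one event per line is enough
    end
  end
end

# Example use against a plain-text log (rotated .gz files are not handled):
#   find_oom_events(File.foreach("/var/log/syslog")).each { |e| puts e[:line] }
```

An empty result only means that none of these patterns appeared in the files scanned. It does not rule out a SIGKILL sent by something other than the kernel.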

:

  • , ?
  • , OOM , - , , - ?
  • , Unicorn ?

, , .

UPDATE

, strace, , crontab ( , ), , . , .

, - , strace ( - strace -o /tmp/strace.out -s 2000 -fp <unicorn_process_id>), strace +++ killed by SIGKILL +++. , .
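The file written by the strace command above can be scanned for the termination lines it records. The sketch below assumes the output format that strace produces with -f and -o, where each line begins with the PID of the traced process. The constant KILLED_LINE and the helper killed_processes are hypothetical names introduced only for illustration.

```ruby
# Matches lines such as "19559 +++ killed by SIGKILL +++".
# Assumption: output came from `strace -f -o FILE`, so every line
# starts with the PID of the process it belongs to.
KILLED_LINE = /\A(?<pid>\d+)\s+\+\+\+ killed by (?<signal>SIG\w+)(?: \(core dumped\))? \+\+\+/

# Hypothetical helper: maps each PID that was killed by a signal
# to the name of that signal.
def killed_processes(lines)
  lines.each_with_object({}) do |line, killed|
    match = KILLED_LINE.match(line)
    killed[match[:pid].to_i] = match[:signal] if match
  end
end

# Example use:
#   killed_processes(File.foreach("/tmp/strace.out")).each do |pid, signal|
#     puts "#{pid} was killed by #{signal}"
#   end
```

This only reports which traced processes were killed and by which signal. For SIGKILL, the strace output does not identify the sender, so finding the sender needs a different tool.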

, , , , strace .


Source: https://habr.com/ru/post/1652915/

