NB: This is not a duplicate of the question about PHP session_start() causing HTTP requests to hang (or of the other similar questions asked on SO), since my hang is random, not constant.
I am using Ubuntu 12.04, Magento, PHP-FPM (PHP 5.4) and the default PHP session handler (files on ext4).
Occasionally (once per month), all PHP processes hang on session_start(), according to fpm-slow.log:
[24-Sep-2014 11:03:04] [pool www] pid 24259
script_filename = /data/web/public/index.php
[0x00007f00b4ec6480] session_start() /data/web/public/includes/src/__default.php:7687
[0x00007f00b4ec6130] start() /data/web/public/includes/src/__default.php:7730
[0x00007f00b4ec5fb8] init() /data/web/public/includes/src/__default.php:8086
[0x00007f00b4ec5e30] init() /data/web/public/includes/src/__default.php:33902
[0x00007f00b4ec5bd0] __construct() /data/web/public/includes/src/__default.php:23841
[0x00007f00b4ec5ae8] getModelInstance() /data/web/public/app/Mage.php:463
[0x00007f00b4ec59c8] getModel() /data/web/public/app/Mage.php:477
[0x00007f00b4ec49a0] getSingleton() /data/web/public/includes/src/__default.php:14044
[0x00007f00b4ec4848] preDispatch() /data/web/public/includes/src/Mage_Adminhtml_Controller_Action.php:160
[0x00007f00b4ec3b00] preDispatch() /data/web/public/includes/src/__default.php:13958
[0x00007f00b4ec26e0] dispatch() /data/web/public/includes/src/__default.php:18331
[0x00007f00b4ec20c0] match() /data/web/public/includes/src/__default.php:17865
[0x00007f00b4ec1a98] dispatch() /data/web/public/includes/src/__default.php:20465
[0x00007f00b4ec1908] run() /data/web/public/app/Mage.php:684
[0x00007f00b4ec17f8] run() /data/web/public/index.php:87
Lsof says:
COMMAND    PID USER   FD  TYPE DEVICE SIZE/OFF    NODE NAME
php5-fpm 24259  app 10uW   REG  202,1    82492 1220594 /data/web/public/var/session/sess_gr2clur9icgd7s2j9linag7ue6
php5-fpm 24262  app  10u   REG  202,1    82492 1220594 /data/web/public/var/session/sess_gr2clur9icgd7s2j9linag7ue6
php5-fpm 24351  app  10u   REG  202,1    82492 1220594 /data/web/public/var/session/sess_gr2clur9icgd7s2j9linag7ue6
php5-fpm 24357  app  10u   REG  202,1    82492 1220594 /data/web/public/var/session/sess_gr2clur9icgd7s2j9linag7ue6
php5-fpm 24358  app  10u   REG  202,1    82492 1220594 /data/web/public/var/session/sess_gr2clur9icgd7s2j9linag7ue6
php5-fpm 25563  app  10u   REG  202,1    82492 1220594 /data/web/public/var/session/sess_gr2clur9icgd7s2j9linag7ue6
php5-fpm 25564  app  10u   REG  202,1    82492 1220594 /data/web/public/var/session/sess_gr2clur9icgd7s2j9linag7ue6
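For anyone reading the FD column: the number is the descriptor, the following letter is the access mode (u = read/write), and a trailing W marks a write lock on the entire file. Below is a rough Python sketch for picking the lock holder out of saved lsof output. The function name lock_holders is made up for illustration, and the parsing assumes the default whitespace-separated lsof layout shown above.

```python
import re

# FD column layout: descriptor number, optional access mode (r/w/u),
# optional lock character (W = write lock on the entire file).
FD_RE = re.compile(r"^(\d+)([rwu ]?)([NrRwWuUxX ]?)$")


def lock_holders(lsof_lines):
    """Return the PIDs whose FD column carries a whole-file write lock ('W')."""
    holders = []
    for line in lsof_lines:
        fields = line.split()
        # Skip the header row and anything too short to contain an FD column.
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        match = FD_RE.match(fields[3])
        if match and match.group(3) == "W":
            holders.append(int(fields[1]))
    return holders
```

Fed the listing above, this would single out pid 24259 as the only process flagged as holding the lock.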
According to strace, all of these processes are waiting on flock(LOCK_EX), even the one that has the W flag in the lsof output above.
CPU usage during the incident is close to zero.
So why does the first session_start() hang, even though it seems to have acquired a write lock on the session file? How can I debug this further?
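To make the locking behaviour I expect concrete, here is a minimal Python sketch of exclusive flock semantics on Linux. It assumes the files session handler takes a plain flock(LOCK_EX) on the session file, which is what the strace output suggests. The file name sess_demo and the helper names are purely illustrative, and LOCK_NB is used only so that the demo returns instead of blocking.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(fd):
    """Attempt a non-blocking exclusive flock; return True if it was granted."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo():
    """Show that a second open file description cannot lock a held file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sess_demo")  # hypothetical session file
        # Two separate open() calls give two open file descriptions,
        # much like two PHP workers opening the same session file.
        holder = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        waiter = os.open(path, os.O_RDWR)
        try:
            first = try_exclusive_lock(holder)   # uncontended
            second = try_exclusive_lock(waiter)  # holder still has the lock
            fcntl.flock(holder, fcntl.LOCK_UN)   # holder releases
            third = try_exclusive_lock(waiter)   # now free to take it
            return first, second, third
        finally:
            os.close(holder)
            os.close(waiter)


if __name__ == "__main__":
    print(demo())
```

With blocking calls instead of LOCK_NB, the second process would simply wait until the first one releases the lock. That matches the waiting workers in my case, but not the worker that already appears to hold it.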
There is a discussion titled "race condition with ajax and php sessions". In fact, the requests that trigger the problem above are persistent AJAX calls. However, that article states:
If you used the built-in PHP default session processing (which uses files), you will never run into a problem.
So now I'm at a loss as to where to look next.