I have worked with Tornado quite a bit, but this is the first time I have run into this kind of error. I have been working on a very simple URL shortener. URLs are put into the database by another application; this one simply reads the URLs from a MongoDB store and redirects clients. After writing the basic code, I set up a simple siege test against it. About 30 seconds into the siege (run with siege -c 64 -t 5m -r 1 http://example.com/MKy against 4 instances of the application), I started getting 500 responses. Looking at the error log, I saw this:
ERROR:root:500 GET /MKy (127.0.0.1) 2.05ms
ERROR:root:Exception in I/O handler for fd 4
Traceback (most recent call last):
  File "/opt/python2.7/lib/python2.7/site-packages/tornado-2.1-py2.7.egg/tornado/ioloop.py", line 309, in start
  File "/opt/python2.7/lib/python2.7/site-packages/tornado-2.1-py2.7.egg/tornado/netutil.py", line 314, in accept_handler
  File "/opt/python2.7/lib/python2.7/socket.py", line 200, in accept
error: [Errno 24] Too many open files
ERROR:root:Uncaught exception GET /MKy (127.0.0.1)
HTTPRequest(protocol='http', host='shortener', method='GET', uri='/MKy', version='HTTP/1.0', remote_ip='127.0.0.1', body='', headers={'Host': 'shortener', 'Accept-Encoding': 'gzip', 'X-Real-Ip': '94.23.155.32', 'X-Forwarded-For': '94.23.155.32', 'Connection': 'close', 'Accept': '*/*', 'User-Agent': 'JoeDog/1.00 [en] (X11; I; Siege 2.66)'})
Traceback (most recent call last):
  File "/opt/python2.7/lib/python2.7/site-packages/tornado-2.1-py2.7.egg/tornado/web.py", line 1040, in wrapper
  File "main.py", line 58, in get
  File "main.py", line 21, in dbmongo
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/connection.py", line 349, in __init__
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/connection.py", line 510, in __find_master
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/connection.py", line 516, in __try_node
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/database.py", line 301, in command
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/collection.py", line 441, in find_one
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/cursor.py", line 539, in loop
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/cursor.py", line 560, in _refresh
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/cursor.py", line 620, in __send_message
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/connection.py", line 735, in _send_message_with_response
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/connection.py", line 591, in __stream
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/connection.py", line 200, in get_stream
  File "/opt/python2.7/lib/python2.7/site-packages/apymongo-0.0.1-py2.7-linux-x86_64.egg/apymongo/connection.py", line 559, in __connect
AutoReconnect: could not connect to [('127.0.0.1', 27017)]
The important part (I think):
error: [Errno 24] Too many open files
The code (it is very simple):
import tornado.ioloop
import tornado.web
import tornado.escape
import apymongo
import time
import sys
The dev server I am using has 8 cores and 64 GB of memory, and runs Red Hat Enterprise Linux 5 and Python 2.6. I have never had problems like this with Tornado / async Mongo applications before.
Possibly useful information:
[ root@puma ~]
(The open files limit is only set to 1024, but I would think that is more than enough.)
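For anyone wanting to check the same numbers from inside the process rather than from the shell, here is a minimal sketch. It assumes a Linux-style /proc/self/fd directory for counting descriptors; the helper name fd_usage is made up for illustration and is not part of Tornado or apymongo.

```python
import os
import resource


def fd_usage():
    """Return (soft_limit, hard_limit, open_fd_count) for the current process.

    open_fd_count is None when /proc/self/fd is not available (non-Linux).
    """
    # RLIMIT_NOFILE is the per-process "open files" limit that ulimit -n reports.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    proc_dir = "/proc/self/fd"
    if os.path.isdir(proc_dir):
        # Listing the directory briefly opens one extra descriptor,
        # so the count can be one higher than the steady-state value.
        open_fds = len(os.listdir(proc_dir))
    else:
        open_fds = None
    return soft, hard, open_fds


if __name__ == "__main__":
    soft, hard, open_fds = fd_usage()
    print("soft limit: %s, hard limit: %s, open now: %s" % (soft, hard, open_fds))
```

Logging this periodically while siege is running should show whether the number of open descriptors keeps climbing towards the soft limit.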
Is Tornado / Apymongo not closing connections correctly? The applications sit behind NGINX, which connects to them over HTTP. Apymongo should be connecting to MongoDB via TCP, but may be using Unix sockets. Even so, it should be sharing / pooling connections, right?
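To make clear what I mean by sharing: the traceback shows dbmongo (main.py, line 21) ending up in the __init__ of apymongo's connection.py from inside get, so a connection object appears to be constructed while a request is being handled. Below is a rough sketch of the alternative, where the client is created once and reused. SharedConnection and FakeConnection are hypothetical stand-ins for illustration only; this is not the apymongo API, and I have not confirmed how apymongo pools internally.

```python
class SharedConnection(object):
    """Create the client on first use and hand the same object to every caller.

    No lock is used: the sketch assumes a single-threaded event loop,
    which is how a Tornado process normally runs.
    """

    def __init__(self, factory):
        self._factory = factory  # callable that builds the real client
        self._conn = None

    def get(self):
        if self._conn is None:
            self._conn = self._factory()
        return self._conn


class FakeConnection(object):
    """Stand-in for a database client; counts how many times it is built."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1


shared = SharedConnection(FakeConnection)

# Simulate 1000 requests that each ask for a connection.
for _ in range(1000):
    conn = shared.get()

print("connections opened: %d" % FakeConnection.opened)
```

With one connection built per request instead, 64 concurrent siege clients could open sockets faster than they are released, which would fit the error above.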
Edit
As suggested, we moved the application to one of our test servers, which has an open files limit of 61440. The same thing happened after about 30 seconds under siege.
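Failing again with a much higher limit would be consistent with descriptors being leaked rather than the limit simply being too low, though that is only my guess. The sketch below shows the mechanism in isolation, with no Tornado or MongoDB involved: it temporarily lowers the soft limit, opens descriptors without closing them until the OS reports Errno 24 (EMFILE), then cleans up. The function name count_until_emfile is made up for this example.

```python
import errno
import os
import resource


def count_until_emfile(soft_limit):
    """Leak descriptors under a temporary soft limit; return how many were opened."""
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may not exceed the hard limit.
    if old_hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, old_hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))

    leaked = []
    try:
        while True:
            # Open and deliberately never close: this is the "leak".
            leaked.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        if exc.errno != errno.EMFILE:
            raise
        return len(leaked)
    finally:
        # Release everything and restore the original limit.
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    print("opened %d descriptors before EMFILE" % count_until_emfile(64))
```

A larger limit only changes how long the loop runs before failing, not whether it fails.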