Socket accept - "Too many open files"

I am working on a school project where I wrote a multi-threaded server, and now I am comparing it against Apache by running some tests. I use autobench to help with this, but after I run a few tests, or if I set too high a connection rate (around 600+), I get the error "Too many open files."

After I finish handling a request, I always call close() on the socket. I have also tried using the shutdown() function, but nothing helps. How can I get around this?

+43
c sockets
May 19 '09 at 1:15
10 answers

There are several places where Linux may have limits on the number of file descriptors you can open.

You can check the following:

 cat /proc/sys/fs/file-max 

This will give you the system-wide limit on file descriptors.

At the shell level, this will tell you your personal limit:

 ulimit -n 

This can be changed in /etc/security/limits.conf - this is the nofile parameter.
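
If you would rather check the per-process limit from inside the server than from the shell, the same value can be read with getrlimit(). A minimal illustrative sketch using only standard POSIX calls:

 #include <stdio.h>
 #include <sys/resource.h>

 /* Print the soft and hard RLIMIT_NOFILE limits for the current process.
    The soft limit is what "ulimit -n" reports and what open()/accept() enforce. */
 int main(void)
 {
     struct rlimit rl;

     if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
         perror("getrlimit");
         return 1;
     }
     printf("soft limit: %llu\n", (unsigned long long)rl.rlim_cur);
     printf("hard limit: %llu\n", (unsigned long long)rl.rlim_max);
     return 0;
 }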

However, if you are closing your sockets correctly, you should not hit this limit unless you are opening a lot of simultaneous connections. It sounds like something is preventing your sockets from being closed properly. I would make sure that they are being handled correctly.

+47
May 19 '09 at 1:20

I had a similar problem. Fast decision:

 ulimit -n 4096 

The explanation is as follows: each connection to the server consumes a file descriptor. On CentOS, Red Hat, and Fedora, and probably others, the per-user file descriptor limit is 1024 - I don't know why. This is easy to see when you type: ulimit -n

Note that this has little to do with the system-wide maximum number of open files (/proc/sys/fs/file-max).

In my case, it was a problem with Redis, so I did:

 ulimit -n 4096
 redis-server -c xxxx

In your case, instead of redis-server, you would start your own server.
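
The same adjustment can also be made from inside the process with setrlimit(), so the server raises its own soft limit at startup instead of relying on the shell. A sketch under the assumption that the hard limit is already high enough (the helper name raise_nofile_limit is made up for the example):

 #include <stdio.h>
 #include <sys/resource.h>

 /* Raise the soft file-descriptor limit as far as the hard limit allows -
    the in-process equivalent of running "ulimit -n" before starting the
    server. Raising the hard limit itself needs root or limits.conf. */
 static int raise_nofile_limit(rlim_t wanted)
 {
     struct rlimit rl;

     if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
         perror("getrlimit");
         return -1;
     }
     rl.rlim_cur = (wanted < rl.rlim_max) ? wanted : rl.rlim_max;
     if (setrlimit(RLIMIT_NOFILE, &rl) == -1) {
         perror("setrlimit");
         return -1;
     }
     return 0;
 }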

+15
Dec 20 '11 at 10:59 a.m.

TCP has a feature called TIME_WAIT that ensures connections are closed cleanly. It requires one end of the connection to keep listening for a while after the socket has been closed.

On a high-performance server, it is important that it is the clients that go into TIME_WAIT, not the server. Clients can afford to have a port tied up, whereas a busy server can rapidly run out of ports or end up with too many open FDs.

To achieve this, the server should never close the connection first - it should always wait until the client closes it.
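
One way to follow that advice in code is to keep reading on the connection until recv() returns 0, which means the client has closed its end, and only then call close() on the server side. A minimal sketch of that pattern, with error handling trimmed down (the reply string and function name are just placeholders):

 #include <stdio.h>
 #include <sys/socket.h>
 #include <unistd.h>

 /* Handle one connection: send the response, then wait for the client to
    close its end first so that TIME_WAIT lands on the client, not the server. */
 static void handle_connection(int client_fd)
 {
     const char reply[] = "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok";
     char buf[4096];
     ssize_t n;

     /* ... read and parse the request here ... */

     if (send(client_fd, reply, sizeof(reply) - 1, 0) == -1)
         perror("send");

     /* Signal that we are done writing, but keep reading. */
     shutdown(client_fd, SHUT_WR);

     /* Drain until the client closes; recv() returning 0 means EOF. */
     while ((n = recv(client_fd, buf, sizeof(buf), 0)) > 0)
         ;
     if (n == -1)
         perror("recv");

     close(client_fd);
 }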

+12

Use lsof -u `whoami` | wc -l to find out how many files the user has open

+5

I also had this problem. You have a file descriptor leak. You can debug this by printing a list of all open file descriptors (on POSIX systems):

 // C++: uses std::cerr and function overloading.
 #include <cstdint>
 #include <cstdio>
 #include <cstring>
 #include <iostream>
 #include <fcntl.h>
 #include <unistd.h>

 using std::cerr;

 typedef int32_t s32;   // shorthand for a signed 32-bit integer

 void showFDInfo( s32 fd );

 // Walk every possible descriptor number and describe the ones that are open.
 void showFDInfo()
 {
    s32 numHandles = getdtablesize();

    for ( s32 i = 0; i < numHandles; i++ )
    {
       s32 fd_flags = fcntl( i, F_GETFD );
       if ( fd_flags == -1 ) continue;

       showFDInfo( i );
    }
 }

 // Print what /proc/self/fd says about one descriptor, plus its flags.
 void showFDInfo( s32 fd )
 {
    char buf[256];

    s32 fd_flags = fcntl( fd, F_GETFD );
    if ( fd_flags == -1 ) return;

    s32 fl_flags = fcntl( fd, F_GETFL );
    if ( fl_flags == -1 ) return;

    char path[256];
    sprintf( path, "/proc/self/fd/%d", fd );

    // readlink() does not null-terminate, so zero the buffer and read at most 255 bytes.
    memset( &buf[0], 0, 256 );
    ssize_t s = readlink( path, &buf[0], 255 );
    if ( s == -1 )
    {
       cerr << " (" << path << "): " << "not available" << "\n";
       return;
    }
    cerr << fd << " (" << buf << "): ";

    if ( fd_flags & FD_CLOEXEC )  cerr << "cloexec ";

    // file status flags
    if ( fl_flags & O_APPEND   )  cerr << "append ";
    if ( fl_flags & O_NONBLOCK )  cerr << "nonblock ";

    // access mode (O_RDONLY is zero, so test it via O_ACCMODE rather than as a bit)
    switch ( fl_flags & O_ACCMODE )
    {
       case O_RDONLY: cerr << "read-only ";  break;
       case O_WRONLY: cerr << "write-only "; break;
       case O_RDWR:   cerr << "read-write "; break;
    }

    if ( fl_flags & O_DSYNC    )  cerr << "dsync ";
    if ( fl_flags & O_RSYNC    )  cerr << "rsync ";
    if ( fl_flags & O_SYNC     )  cerr << "sync ";

    // advisory lock status
    struct flock fl;
    fl.l_type = F_WRLCK;
    fl.l_whence = 0;
    fl.l_start = 0;
    fl.l_len = 0;
    fcntl( fd, F_GETLK, &fl );
    if ( fl.l_type != F_UNLCK )
    {
       if ( fl.l_type == F_WRLCK )
          cerr << "write-locked";
       else
          cerr << "read-locked";
       cerr << "(pid:" << fl.l_pid << ") ";
    }

    cerr << "\n";  // one line per descriptor
 }

By dumping all open files, you quickly find out where the file descriptor leak is located.

If your server spawns subprocesses - for example, if it is a fork-style server, or if you spawn other processes (e.g. via CGI) - you should definitely create your file descriptors with "cloexec" set, both for real files and for sockets.

Without cloexec, every time you fork or spawn a process, all open file descriptors are duplicated in the child process.
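
On Linux you can request close-on-exec at creation time, so there is no window in which a child can inherit the descriptor. A small sketch, assuming a Linux-only server (accept4() and SOCK_CLOEXEC are Linux extensions rather than portable POSIX; elsewhere you would fall back to fcntl(fd, F_SETFD, FD_CLOEXEC)):

 #define _GNU_SOURCE
 #include <fcntl.h>
 #include <sys/socket.h>

 /* Accept a connection whose descriptor is automatically closed in any
    child process created later with fork() + exec(). */
 int accept_cloexec(int listen_fd)
 {
     return accept4(listen_fd, NULL, NULL, SOCK_CLOEXEC);
 }

 /* The same idea for ordinary files: pass O_CLOEXEC to open(). */
 int open_cloexec(const char *path)
 {
     return open(path, O_RDONLY | O_CLOEXEC);
 }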

It is also very easy to fail to close network sockets - for example, by simply abandoning them when the remote side disconnects. This will leak handles like crazy.

+5
May 21 '13 at 15:42

It may take some time before a closed socket is really freed up.

Use lsof to list open files.

Use cat /proc/sys/fs/file-max to see whether there is a system-wide limit.

+4
May 19 '09 at 1:20

This means you have reached the maximum number of simultaneously open files.

It can be solved as follows:

At the end of the /etc/security/limits.conf file, you need to add the following lines:

 * soft nofile 16384
 * hard nofile 16384

In the current console, as root (sudo does not work), run:

 ulimit -n 16384 

Although this step is not necessary if you can restart the server.

In the /etc/nginx/nginx.conf file, set the new worker_connections value equal to 16384 divided by the worker_processes value (for example, 16384 / 2 = 8192 with two worker processes).

If you did not run ulimit -n 16384, you will need to reboot for the new limit to take effect; after that the problem goes away.

PS:

If the error accept() failed (24: Too many open files) is still visible in the logs after the fix:

In the nginx configuration, set (for example):

 worker_processes 2;

 worker_rlimit_nofile 16384;

 events {
     worker_connections 8192;
 }
+4
Sep 21 '15 at 15:38

When your program has more open descriptors than the open files ulimit (ulimit -a will list this), the kernel will refuse to open any more file descriptors for it. Make sure you don't have a file descriptor leak - for example, by running the server for a while, then stopping it and checking whether any extra fds are still open when it is idle - and if it is still a problem, change the nofile ulimit for your user in /etc/security/limits.conf.
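
Until the leak is found, the server at least should not fall over when it hits the limit. When accept() fails with "Too many open files" it returns -1 with errno set to EMFILE (per-process limit) or ENFILE (system-wide limit), which can be handled explicitly. A defensive sketch, not taken from the answer above:

 #include <errno.h>
 #include <stdio.h>
 #include <sys/socket.h>
 #include <unistd.h>

 /* Accept loop that survives hitting the descriptor limit: EMFILE/ENFILE are
    logged and the loop keeps running instead of the server dying. This is a
    stopgap only - the real fix is still to close descriptors properly. */
 void accept_loop(int listen_fd)
 {
     for (;;) {
         int client = accept(listen_fd, NULL, NULL);

         if (client == -1) {
             if (errno == EMFILE || errno == ENFILE) {
                 fprintf(stderr, "accept: out of file descriptors\n");
                 sleep(1);       /* back off so other threads can close fds */
                 continue;
             }
             perror("accept");
             continue;
         }
         /* ... hand the socket to a worker thread, which must close() it ... */
         close(client);          /* placeholder so this sketch does not leak */
     }
 }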

+1
May 19 '09 at 1:21

I had the same problem, and I was not checking the return values of the close() calls. When I started checking the return value, the problem mysteriously disappeared.

I can only assume that an optimization in the compiler (gcc in my case) assumes that close() calls have no side effects and can be omitted if their return values are not used.
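
For reference, checking the return value looks like the sketch below. Whatever the compiler is doing, it at least surfaces errors such as EBADF or EINTR that would otherwise pass silently (the helper name is made up for the example):

 #include <errno.h>
 #include <stdio.h>
 #include <string.h>
 #include <unistd.h>

 /* Close a socket and report any failure instead of silently ignoring it. */
 static void close_checked(int fd)
 {
     if (close(fd) == -1)
         fprintf(stderr, "close(%d) failed: %s\n", fd, strerror(errno));
 }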

+1
May 21 '13 at 12:32

One more note for CentOS: if you use "systemctl" to start the process, you need to modify the systemd unit file ==> /usr/lib/systemd/system/processName.service and add this line to it:

 LimitNOFILE=50000 

And then just reload the systemd configuration:

 systemctl daemon-reload 
0
Jul 10 '17 at 9:56


