Choosing a design for high-performance file serving

I am developing a Linux application that needs to support about 250 connections and transfer large files, in the 100 MB+ range, over TCP sockets. The goal is to tune for throughput, not latency. I want to keep 2x1Gbit ethernet links saturated at all times. The links will be channel bonded.

The application is expected to be busy continuously, simply pushing data out as quickly as possible. Connections will stay up most of the time, so unlike HTTP they will not be torn down often.

I have been looking at various options for high performance, such as epoll and the sendfile API, as well as aio (which looks too immature and risky, IMHO).
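For readers unfamiliar with sendfile, here is a minimal sketch of how it is typically driven on a blocking socket. It uses Python's `os.sendfile`, a thin wrapper over the Linux `sendfile(2)` call; the helper name `send_whole_file` is hypothetical and not from this thread. The call may transfer fewer bytes than requested, so the loop resumes from the last offset.

```python
import os
import socket


def send_whole_file(sock, path):
    """Send the file at `path` over a blocking socket; return bytes sent."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # The kernel copies file pages straight to the socket,
            # so the data never passes through a user-space buffer.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break  # file shrank underneath us; stop rather than spin
            offset += sent
        return offset
```

A C implementation would follow the same shape: open the file, then call `sendfile()` in a loop until the offset reaches the file size.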

I have also looked at the boost asio API, which uses epoll underneath. I have used it before, but not for a high-performance application like this.
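To make the epoll option concrete, here is a rough sketch of a single-threaded readiness loop that feeds several non-blocking sockets with sendfile. It is only an illustration under assumptions: the function name `serve_files` and the `(socket, path)` input shape are hypothetical, and it uses Python's `select.epoll` and `os.sendfile` wrappers (Linux only) rather than boost asio.

```python
import os
import select
import socket


def serve_files(jobs):
    """Send one file per connection from a single epoll loop.

    jobs: list of (connected socket, file path) pairs (hypothetical shape).
    Returns a dict mapping socket fd -> bytes actually sent.
    """
    ep = select.epoll()
    conns = {}  # socket fd -> {"file", "offset", "size"}
    for sock, path in jobs:
        sock.setblocking(False)
        f = open(path, "rb")
        conns[sock.fileno()] = {
            "file": f,
            "offset": 0,
            "size": os.fstat(f.fileno()).st_size,
        }
        # Level-triggered: we are woken whenever the socket can take more data.
        ep.register(sock.fileno(), select.EPOLLOUT)

    sent_total = {}
    try:
        while conns:
            for fd, _events in ep.poll(1.0):
                c = conns[fd]
                try:
                    n = os.sendfile(fd, c["file"].fileno(),
                                    c["offset"], c["size"] - c["offset"])
                except BlockingIOError:
                    continue  # send buffer filled up again; wait for EPOLLOUT
                except (BrokenPipeError, ConnectionResetError):
                    n = 0     # peer went away; give up on this connection
                c["offset"] += n
                if n == 0 or c["offset"] >= c["size"]:
                    ep.unregister(fd)
                    c["file"].close()
                    sent_total[fd] = c["offset"]
                    del conns[fd]
    finally:
        ep.close()
    return sent_total
```

The sockets are left open on return, matching the long-lived connections described in the question. Note that sendfile can still block on disk reads when the file is not in the page cache, even though the socket is non-blocking.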

I have more than 4 processor cores available, so I can make use of them.

However, I have read that boost asio is not very good with multiple threads because of some locking in its reactor design. Is this likely to be a problem for me?

If I have many processor cores available, should I just create as many threads or forked processes and pin one to each core?
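For reference, this is roughly what "one forked process per core" looks like in practice. It is a sketch of the mechanism only, not a recommendation: the helper names are hypothetical, and it relies on `os.sched_setaffinity`, which is available on Linux only.

```python
import os


def pin_current_process(core):
    """Restrict the calling process to one core; return the resulting CPU set."""
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


def spawn_pinned_workers(work, cores):
    """Fork one child per core, pin it, run work(core); return the child PIDs."""
    pids = []
    for core in cores:
        pid = os.fork()
        if pid == 0:  # child process
            status = 1
            try:
                pin_current_process(core)
                work(core)
                status = 0
            finally:
                # Leave without running the parent's cleanup handlers.
                os._exit(status)
        pids.append(pid)
    return pids
```

The parent would normally distribute accepted connections among the children and reap them with `os.waitpid`.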

What about locking, etc.? I would like some design suggestions. I suspect that my main bottleneck will be disk I/O, but nonetheless... I want a good design up front, without a lot of rework later.

Any suggestions?

+3
4

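"I am developing a Linux application that needs to support about 250 connections and transfer large files, in the 100 MB+ range, over TCP sockets. The goal is to tune for throughput, not latency. I want to keep 2x1Gbit ethernet links saturated at all times. The links will be channel bonded."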

IO , . 250 - .

. , - , . , : sendfile() .

SSD , .

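"The application is expected to be busy continuously, simply pushing data out as quickly as possible. Connections will stay up most of the time, so unlike HTTP they will not be torn down often."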

" " - . , , ​​ , - .

, (, 4) , read() sendfile() , IO. , , IO .

. : , /, . , NIC/ .

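"I have been looking at various options for high performance, such as epoll and the sendfile API, as well as aio (which looks too immature and risky, IMHO)."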

FTP sendfile(). Oracle AIO, Linux - .

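"I have also looked at the boost asio API, which uses epoll underneath. I have used it before, but not for a high-performance application like this."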

IIRC, . IMO , , .

"I have more than 4 processor cores available, so I can make use of them."

TCP , IO . , IO.

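"However, I have read that boost asio is not very good with multiple threads because of some locking in its reactor design. Is this likely to be a problem for me?"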

libevent. , , , sendfile(). , .

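"If I have many processor cores available, should I just create as many threads or forked processes and pin one to each core?"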

. . ( ?) , IO, .

. read()s == , .

. read()s == no IO . , , ( ).

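"What about locking, etc.? I would like some design suggestions. I suspect that my main bottleneck will be disk I/O, but nonetheless..."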

, ( ). , , .

SSD , (, ) . - , IO , IO , .

, poll() ( boost.asio libevent) . , . , , . POLLOUT, , . , : , , , . , .

, .

... ......

+9

sendfile() - , . epoll() - , . 250 , select() poll(), , .

+2

, - -. , . , , , , ; .

, .

+1

, , , , .

, . , - http, rsync .. Rsync , .

250 - , 1000 .

Depending on whether the files fit in RAM and how fast your server's I/O is, you may end up network-bound. If your network is only 1-2 Gbps, your storage can probably outperform it on sequential I/O, so the network will be the bottleneck.
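A quick back-of-the-envelope calculation puts numbers on this, using the figures from the question (2 bonded 1 Gbit links, 250 connections, 100 MB files). This is a rough sketch that assumes decimal units (1 Gbit/s = 125 MB/s) and ignores Ethernet and TCP overhead, so real payload rates will be somewhat lower; `link_budget` is a hypothetical helper name.

```python
def link_budget(links, gbit_per_link, connections, file_mb):
    """Return (total MB/s, MB/s per connection, seconds per file).

    Assumes the link is shared evenly and every connection is active.
    """
    total_mb_s = links * gbit_per_link * 1000 / 8  # Gbit/s -> MB/s
    per_conn_mb_s = total_mb_s / connections
    seconds_per_file = file_mb / per_conn_mb_s
    return total_mb_s, per_conn_mb_s, seconds_per_file


total, per_conn, secs = link_budget(links=2, gbit_per_link=1,
                                    connections=250, file_mb=100)
print(total, per_conn, secs)  # 250.0 1.0 100.0
```

So the storage has to sustain about 250 MB/s in aggregate to keep the links full, while each connection only receives about 1 MB/s when all 250 are active.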

0

Source: https://habr.com/ru/post/1754711/

