Linux sockets: zero copy locally, TCP/IP remotely

Networking is my weakest area in operating systems, so forgive me if this question is incomplete. I have been reading about it for several hours, but it is still swimming around in my head. (To me, chip design feels simple compared to figuring out network protocols.)

I have network services that communicate with each other via sockets. Specifically, the sockets are created with fd = socket(PF_INET, SOCK_STREAM, 0); which automatically gets TCP/IP. I need this as the default, because these services can run on different machines.

But for one project, we are trying to squeeze all of them into an inoperative embedded "device" based on the Atom Z530P, so it seems to me that memory-copy overhead is something we could optimize away. I read about it here: data-link-access-and-zero-copy and Linux_packet_mmap and packet_mmap.

In that case, you create the socket with something like fd = socket(PF_PACKET, SOCK_RAW, 0); and there is a lot more to it, such as allocating ring buffers, mmapping them, binding them to the socket, and so on. It looks like you are limited to sendto and recvfrom for transferring data. As far as I understand, since the socket is local, you do not need a reliable stream-type socket, so raw sockets are the appropriate interface, and I assume the ring buffer exists so that each packet (or datagram) can be page-aligned, starting on a page boundary.
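The setup sequence described above (allocate a ring, attach it to the socket, mmap it) might look roughly like the following. This is a minimal sketch based on the kernel's packet_mmap interface, not a tested capture program: the helper names ring_geometry and open_rx_ring, the 2048-byte frame size, and the 64-block count are assumptions chosen for illustration. It also passes htons(ETH_P_ALL) as the protocol instead of 0, since a packet socket opened with protocol 0 receives nothing until it is bound. Opening the socket requires CAP_NET_RAW (typically root).

```c
#define _GNU_SOURCE
#include <arpa/inet.h>        /* htons */
#include <assert.h>
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* struct tpacket_req, PACKET_RX_RING */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: describe a ring of block_nr blocks, one page each,
 * with every block divided into equally sized frames. */
void ring_geometry(struct tpacket_req *req, unsigned page_size,
                   unsigned frame_size, unsigned block_nr)
{
    /* The kernel rejects frame sizes that are not TPACKET_ALIGNMENT-aligned;
     * this sketch additionally requires frames to divide a block evenly. */
    assert(frame_size % TPACKET_ALIGNMENT == 0);
    assert(page_size % frame_size == 0);

    req->tp_block_size = page_size;   /* must be a multiple of the page size */
    req->tp_block_nr   = block_nr;
    req->tp_frame_size = frame_size;
    req->tp_frame_nr   = (page_size / frame_size) * block_nr;
}

/* Hypothetical helper: open a packet socket, attach a receive ring, and map
 * it into this process. Returns the mapping, or NULL on any failure. */
void *open_rx_ring(int *fd_out, struct tpacket_req *req)
{
    int fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return NULL;                  /* usually EPERM without CAP_NET_RAW */

    ring_geometry(req, (unsigned)sysconf(_SC_PAGESIZE), 2048, 64);
    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof *req) < 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)req->tp_block_size * req->tp_block_nr;
    void *ring = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ring == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return ring;
}
```

Note that a packet socket of this kind exchanges link-layer frames on a network interface, so it does not give you the ordered, reliable byte stream that the existing SOCK_STREAM code relies on.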

Before spending a huge amount of time studying this further, I was hoping that some helpful people could help me with some questions:

  • How much can I expect to benefit from zero-copy sockets? The last time I checked, we moved at most 40 MB/s from one process to another and finally to disk. In the most basic scenario, data moves from the capture process, to the one-to-many process (others can listen in on the stream), to the archiving process, which writes to disk. That is two hops, not counting the disk and internal stuff.
  • Does Linux do this automatically, optimizing for processes running on the same machine?
  • Either way, I would be listening on sockets at TCP ports. Can I use them to establish connections between processes and still be able to use zero copy? In other words, can I use AF_INET with PF_PACKET?
  • Is PF_PACKET with SOCK_RAW the only valid configuration for zero-copy sockets?
  • Is there good sample code that uses zero copy with TCP/IP as a fallback?
  • What is the easiest or best way to detect that two processes are on the same machine? They know each other by IP addresses, so I can just compare and use different code paths for each. Is there an easier way to do this?
  • Can I use write() and read() on a packet socket, or do they only apply to streams? (Rewriting how the connections are made would be easier than rewriting ALL of the socket code.)
  • Am I overcomplicating things and/or optimizing the wrong thing? OProfile tells me that most CPU cycles are consumed in two places: (1) zlib and (2) the kernel, which I cannot profile because I am on CentOS 6.2, which does not provide a vmlinux. I assume the kernel time is a combination of idle time and data copying and nothing more.

Thanks in advance for your help!

+4
2 answers

Am I overcomplicating things and/or optimizing the wrong thing?

Maybe. PF_PACKET sockets are only meant for very specialized uses. You probably want to explore

What is the easiest or best way to detect that two processes are on the same machine?

Just keep track of this information.
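Since the processes already know each other by IP address, one way to record whether a peer is local is to check that address against this host's own interfaces once, at connection time. The following is a minimal sketch under some assumptions: it handles IPv4 only, the helper name is_local_ipv4 is hypothetical, and it relies on getifaddrs(3) as provided by glibc and the BSDs.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: returns true if the dotted-quad IPv4 address
 * is loopback or is assigned to one of this host's interfaces. */
bool is_local_ipv4(const char *dotted)
{
    assert(dotted != NULL);

    struct in_addr target;
    if (inet_pton(AF_INET, dotted, &target) != 1)
        return false;                     /* not a valid IPv4 string */

    /* Everything in 127.0.0.0/8 is loopback. */
    if ((ntohl(target.s_addr) >> 24) == 127)
        return true;

    struct ifaddrs *list = NULL;
    if (getifaddrs(&list) != 0)
        return false;                     /* cannot tell; assume remote */

    bool found = false;
    for (struct ifaddrs *ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        /* Interfaces without an address have a NULL ifa_addr. */
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;
        const struct sockaddr_in *sin =
            (const struct sockaddr_in *)ifa->ifa_addr;
        if (sin->sin_addr.s_addr == target.s_addr) {
            found = true;
            break;
        }
    }
    freeifaddrs(list);
    return found;
}
```

A caller could evaluate this once per peer and store the result next to the connection, so that the rest of the code only consults a flag when choosing between the local and the remote code path.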

Does Linux do this automatically, optimizing for processes running on the same machine?

No, you have to do it yourself.

+7

I think the choice between TCP/IP and raw packets is far more important than the zero-copy question. If you need a reliable stream, you need TCP/IP (i.e. AF_INET + SOCK_STREAM). Implementing a reliable stream on top of unreliable packets is very hard, and it has already been done for you.

The best way to use TCP/IP with zero copy and files is, as @cnicutar says, sendfile(2) and splice(2). I think there is a way to get zero copy without them (if you want to read the data into memory rather than directly into a file), but I am not sure how to do it.
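As a rough illustration of those two calls, the sketch below moves data between a file and another descriptor without passing it through a user-space buffer. The function names send_file_zero_copy and splice_to_file and the 64 KiB chunk size are assumptions made for this example; sendfile(2) and splice(2) themselves are Linux-specific, and splice needs a pipe on one side, hence the intermediate pipe.

```c
#define _GNU_SOURCE               /* for splice(2) */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* File -> socket (or other fd): send the first `count` bytes of file_fd.
 * Returns 0 if everything was sent, -1 on error or a short file. */
int send_file_zero_copy(int out_fd, int file_fd, size_t count)
{
    off_t offset = 0;             /* sendfile advances this for us */
    while (count > 0) {
        ssize_t n = sendfile(out_fd, file_fd, &offset, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                /* end of file reached early */
        count -= (size_t)n;
    }
    return count == 0 ? 0 : -1;
}

/* Socket (or other fd) -> file: move data until EOF on in_fd.
 * Returns the number of bytes written to file_fd, or -1 on error. */
ssize_t splice_to_file(int in_fd, int file_fd)
{
    int p[2];
    assert(in_fd >= 0 && file_fd >= 0);
    if (pipe(p) < 0)
        return -1;

    ssize_t total = 0;
    for (;;) {
        /* Step 1: source -> pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, 65536, SPLICE_F_MOVE);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {             /* 0 means EOF, negative means error */
            if (n < 0)
                total = -1;
            break;
        }
        /* Step 2: pipe -> file, draining everything step 1 queued. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, file_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m < 0 && errno == EINTR)
                continue;
            if (m <= 0) {
                total = -1;
                goto out;
            }
            n -= m;
            total += m;
        }
    }
out:
    close(p[0]);
    close(p[1]);
    return total;
}
```

In the scenario from the question, the archiving process could call something like splice_to_file(sock_fd, archive_fd) on its connected TCP socket. That only helps if the data is written to disk as received; if it has to pass through zlib first, it must be read into user space anyway.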

CentOS is also open source, so you can get a vmlinux file by downloading the kernel source code and compiling it.

+2

Source: https://habr.com/ru/post/1388375/

