Recv first few bytes from socket to determine buffer size

Question

I am writing a distributed system in C++ using TCP/IP sockets.

For each of my messages, I need to get the first 5 bytes to find out the entire length of the incoming message.

What is the best way to do this?

1. recv() only 5 bytes, then recv() again. If I choose this, is it safe to assume that recv() returns either 0 or 5 bytes (otherwise I would have to write a loop that keeps trying)?
2. Use MSG_PEEK.
3. recv() into a somewhat larger buffer, read the first 5 bytes, and then allocate the final buffer.

+4

c++ sockets network-programming

Murph Sep 13 '12 at 14:35

3 answers

Use a state machine with two states:

First state.

Receive bytes into a buffer as they arrive. Once the buffer holds 5 or more bytes, read the length from the first 5 bytes, process any bytes that arrived after them, and switch to the second state.

Second state.

Receive and process bytes as they arrive until the end of the message, then switch back to the first state for the next message.
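A minimal sketch of such a two-state parser, under a loudly stated assumption: the question never says how the 5 header bytes encode the length, so this invents a hypothetical format of 5 ASCII decimal digits giving the body length. `FrameParser`, `feed` and `kHeaderSize` are illustrative names, not part of any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical framing (NOT given in the question): a 5-byte header of
// ASCII decimal digits holding the body length, followed by the body.
constexpr std::size_t kHeaderSize = 5;

class FrameParser {
public:
    // Feed bytes as they arrive from recv(); returns every message that
    // this chunk completed (possibly none, possibly several).
    std::vector<std::string> feed(const char* p, std::size_t n) {
        std::vector<std::string> done;
        while (n > 0) {
            if (state_ == State::Header) {
                assert(buf_.size() < kHeaderSize);
                std::size_t take = std::min(n, kHeaderSize - buf_.size());
                buf_.append(p, take);
                p += take;
                n -= take;
                if (buf_.size() == kHeaderSize) {
                    // std::stoul is lenient; real code should validate
                    // all 5 header bytes before trusting the length.
                    body_len_ = std::stoul(buf_);
                    buf_.clear();
                    state_ = State::Body;
                }
            }
            if (state_ == State::Body) {
                std::size_t take = std::min(n, body_len_ - buf_.size());
                buf_.append(p, take);
                p += take;
                n -= take;
                if (buf_.size() == body_len_) {
                    done.push_back(std::move(buf_));
                    buf_.clear();  // moved-from string: reset explicitly
                    state_ = State::Header;
                }
            }
        }
        return done;
    }

private:
    enum class State { Header, Body };
    State state_ = State::Header;
    std::string buf_;           // bytes of the header or body collected so far
    std::size_t body_len_ = 0;  // valid only while in State::Body
};
```

The parser does not care how the bytes were split by the network: calling `feed` with `"000"`, then `"05he"`, then `"llo"` yields one message, `"hello"`, on the third call.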

+1

quamrana Sep 13 '12 at 14:50

To answer your question:

1. It is not safe to assume that you get 0 or 5 bytes. You can also get 1-4. Keep calling recv() until you have all 5 bytes or hit an error, as others suggested.
2. I would not bother with MSG_PEEK. Most of the time you will either block (assuming blocking calls) or get all 5 bytes, so skip the extra call into the network stack.
3. This works too, but it adds complexity for little gain.
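The "loop until you have 5 bytes or an error" advice from point 1 could look roughly like this for a blocking socket. `recv_exact` is a made-up helper name, and retrying on `EINTR` is my own addition rather than something stated in the answer.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Reads exactly `len` bytes from a blocking socket into `out`.
// Returns true on success, false if the peer closed the connection
// or recv() failed before `len` bytes arrived.
bool recv_exact(int fd, char* out, std::size_t len) {
    assert(out != nullptr || len == 0);
    std::size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, out + got, len - got, 0);
        if (n > 0) {
            got += static_cast<std::size_t>(n);  // may be fewer bytes than requested
        } else if (n == 0) {
            return false;                        // orderly shutdown by the peer
        } else if (errno != EINTR) {
            return false;                        // real error; errno says which
        }
        // n == -1 with EINTR: interrupted by a signal, just try again
    }
    return true;
}
```

You would call it once with `len` set to 5 for the header, decode the length, then call it again for the rest of the message.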

0

mark Sep 13 '12 at 17:00

Kerrek SB · Accepted Answer · 2012-09-13T14:49:21+0000

You do not need to know anything. TCP is a stream protocol, and at any time you can get just one byte, or as much as a few megabytes of data. The correct and only way to use a TCP socket is to read in a loop.

    // Needs <cerrno>, <deque> and <sys/socket.h>; fd is a connected TCP socket.
    char buf[4096]; // or whatever
    std::deque<char> data;

    for (;;) {
        ssize_t res = recv(fd, buf, sizeof buf, MSG_DONTWAIT);
        if (res == -1) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                break; // done reading for now
            }
            // error: report it, then break or die
            break;
        }
        if (res == 0) {
            // socket closed: finalise
            break;
        }
        // append the bytes just received to the application queue
        data.insert(data.end(), buf, buf + res);
    }

The only purpose of the loop is to transfer data from the socket buffer to your application. Your application must then decide independently whether the queue holds enough data to attempt extracting a higher-level application message.

For example, in your case you would check whether the queue holds at least 5 bytes, then inspect the first five bytes, and then check whether the queue holds a complete application message. If not, you stop and wait for more data; if so, you extract the entire message and pop it off the front of the queue.
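As a rough sketch of that extraction step: the header encoding is not given in the question, so the code below assumes (hypothetically) that the 5 bytes are ASCII decimal digits holding the payload length. `try_extract_message` and `kHeaderSize` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>
#include <string>

// Hypothetical framing (NOT given in the question): 5 ASCII decimal
// digits holding the payload length, followed by the payload itself.
constexpr std::size_t kHeaderSize = 5;

// If the queue starts with a complete message, copies its payload into
// `out`, removes header and payload from the queue, and returns true.
// Returns false, leaving the queue untouched, if more bytes are needed.
bool try_extract_message(std::deque<char>& data, std::string& out) {
    if (data.size() < kHeaderSize) {
        return false;  // header not complete yet
    }
    std::size_t len = 0;
    for (std::size_t i = 0; i < kHeaderSize; ++i) {
        char c = data[i];
        if (c < '0' || c > '9') {
            throw std::runtime_error("malformed length header");
        }
        len = len * 10 + static_cast<std::size_t>(c - '0');
    }
    if (data.size() < kHeaderSize + len) {
        return false;  // payload not complete yet
    }
    assert(data.size() >= kHeaderSize + len);
    auto first = data.begin() + static_cast<std::ptrdiff_t>(kHeaderSize);
    auto last = first + static_cast<std::ptrdiff_t>(len);
    out.assign(first, last);
    data.erase(data.begin(), last);  // pop header and payload off the front
    return true;
}
```

After each pass through the receive loop you would call this repeatedly until it returns false, since a single recv() can deliver several messages at once.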
