How do I fix apparent packet loss with .NET and TCP sockets?

I need help figuring out how to fix a problem I'm observing with a high-volume data feed over TCP using .NET sockets.

In short, when a client application starts, it connects to a specific port on the server. After connecting, the server starts sending real-time data to the client, which displays the information in a ticker-like user interface. The server supports several client workstations, so data is sent over several ports (several sockets).

Everything is implemented and works well at a slow feed rate and low volume. I am now stress testing the system to ensure stability and scalability. When I increase the frequency, the server keeps up fine. However, I see what looks like a lost packet on the clients. This happens at random times.

Currently, each transmitted message is preceded by a 4-byte value giving the length of that message. When we receive data in the client, we add it to a buffer (stream) until we have that many bytes. Any additional bytes are considered the beginning of the next message. Again, this works fine until I increase the frequency.
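
For context, the sending side of such a protocol would look something like the sketch below. This is my own illustration, not the poster's server code; it assumes the 4-byte prefix is a little-endian Int32, which is what BitConverter.ToInt32 expects on the receiving side:

using System;
using System.IO;

internal static class Framing
{
    // Writes one frame: a 4-byte length prefix followed by the payload bytes.
    public static void WriteFrame(Stream stream, Byte[] payload)
    {
        var lengthPrefix = BitConverter.GetBytes(payload.Length); // little-endian on x86/x64
        stream.Write(lengthPrefix, 0, lengthPrefix.Length);
        stream.Write(payload, 0, payload.Length);
    }
}

The receiver must then be prepared for the prefix and payload to arrive split across any number of reads, which is exactly where this question's trouble begins.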

In my test, I send a message of about 225 bytes, followed by one of about 310 kB and another of about 40 kB. Sending a message every 1 second works without fail with about 12 clients running. Increasing the frequency to 1/2 second, I eventually see one of the client displays freeze. At 1/4 second, I can reproduce the problem with 4 clients within a few seconds.

Looking at my code (which I can provide, if necessary), I see that all clients receive data, but somehow the stream falls "out of sync", and the expected length value becomes huge (in the hundreds of millions). As a result, we just keep reading data and never reach the end of the message.

I need either a better approach or a way to guarantee that I receive the expected data and do not lose packets. Can you help?

UPDATE

I have done a ton of additional testing, varying the message size and delivery frequency. There is a definite correlation: the smaller the messages, the higher the frequency I can reach. But inevitably, I can always break it.

So, to describe more precisely what I'm looking for:

  • To understand what is happening. This will help me identify a possible solution, or at least establish thresholds for reliable behavior.

  • To introduce a fail-safe mechanism, so that when a problem occurs I can detect it and possibly recover, perhaps by adding a checksum to the data stream or something similar (see the sketch after this list).
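
As a concrete example of that second point, each frame could carry a trailing checksum so the receiver can detect a desynchronized or corrupted stream. A minimal sketch under my own assumptions (the frame layout and the simple hash are illustrative; a real implementation would likely use CRC32):

using System;
using System.IO;

internal static class ChecksummedFraming
{
    // Frame layout: [4-byte length][payload][4-byte checksum of the payload].
    public static void WriteFrame(Stream stream, Byte[] payload)
    {
        stream.Write(BitConverter.GetBytes(payload.Length), 0, 4);
        stream.Write(payload, 0, payload.Length);
        stream.Write(BitConverter.GetBytes(Checksum(payload)), 0, 4);
    }

    // On the receiving side, a mismatch means the stream is corrupt or out of
    // sync, so the client can drop and re-establish the connection instead of
    // waiting forever for bytes that will never arrive.
    public static Boolean VerifyFrame(Byte[] payload, Int32 receivedChecksum)
    {
        return Checksum(payload) == receivedChecksum;
    }

    private static Int32 Checksum(Byte[] bytes)
    {
        unchecked
        {
            var sum = 0;
            foreach (var b in bytes)
                sum = sum * 31 + b;
            return sum;
        }
    }
}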

Here is the code that runs in the client (receiving) applications:

public void StartListening(SocketAsyncEventArgs e)
{
    e.Completed += SocketReceive;
    socket.ReceiveAsync(e);
}

private void SocketReceive(Object sender, SocketAsyncEventArgs e)
{
    lock (_receiveLock)
    {
        ProcessData(e.Buffer, e.BytesTransferred);
        socket.ReceiveAsync(e);
    }
}

private void ProcessData(Byte[] bytes, Int32 count)
{
    if (_currentBuffer == null)
        _currentBuffer = new ReceiveBuffer();

    var numberOfBytesRead = _currentBuffer.Write(bytes, count);

    if (_currentBuffer.IsComplete)
    {
        // Notify the client that a message has been received (ignore zero-length, "keep alive", messages)
        if (_currentBuffer.DataLength > 0)
            NotifyMessageReceived(_currentBuffer);

        _currentBuffer = null;

        // If there are bytes remaining from the original message, recursively process
        var numberOfBytesRemaining = count - numberOfBytesRead;
        if (numberOfBytesRemaining > 0)
        {
            var remainingBytes = new Byte[numberOfBytesRemaining];
            var offset = bytes.Length - numberOfBytesRemaining;
            Array.Copy(bytes, offset, remainingBytes, 0, numberOfBytesRemaining);
            ProcessData(remainingBytes, numberOfBytesRemaining);
        }
    }
}

internal sealed class ReceiveBuffer
{
    public const Int32 LengthBufferSize = sizeof(Int32);

    private MemoryStream _dataBuffer = new MemoryStream();
    private MemoryStream _lengthBuffer = new MemoryStream();

    public Int32 DataLength { get; private set; }

    public Boolean IsComplete
    {
        get { return (RemainingDataBytesToWrite == 0); }
    }

    private Int32 RemainingDataBytesToWrite
    {
        get
        {
            if (DataLength > 0)
                return (DataLength - (Int32)_dataBuffer.Length);
            return 0;
        }
    }

    private Int32 RemainingLengthBytesToWrite
    {
        get { return (LengthBufferSize - (Int32)_lengthBuffer.Length); }
    }

    public Int32 Write(Byte[] bytes, Int32 count)
    {
        var numberOfLengthBytesToWrite = Math.Min(RemainingLengthBytesToWrite, count);
        if (numberOfLengthBytesToWrite > 0)
            WriteToLengthBuffer(bytes, numberOfLengthBytesToWrite);

        var remainingCount = count - numberOfLengthBytesToWrite;

        // If this value is > 0, then we still have more bytes after setting the length, so write them to the data buffer
        var numberOfDataBytesToWrite = Math.Min(RemainingDataBytesToWrite, remainingCount);
        if (numberOfDataBytesToWrite > 0)
            _dataBuffer.Write(bytes, numberOfLengthBytesToWrite, numberOfDataBytesToWrite);

        return numberOfLengthBytesToWrite + numberOfDataBytesToWrite;
    }

    private void WriteToLengthBuffer(Byte[] bytes, Int32 count)
    {
        _lengthBuffer.Write(bytes, 0, count);
        if (RemainingLengthBytesToWrite == 0)
        {
            var length = BitConverter.ToInt32(_lengthBuffer.ToArray(), 0);
            DataLength = length;
        }
    }
}
2 answers

Without seeing your code, we can only guess. Are you handling the case where you read less than the full 4-byte header? A read can return only one, two, or three of those bytes. Increasing the amount of data makes this happen more often.
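
To illustrate that point: a single receive can hand you any number of bytes, so the header has to be accumulated until all four bytes are present before it is parsed. For a blocking Stream, the classic pattern is an exact-read loop like the sketch below (ReadExactly is my own helper name here, not a built-in .NET Framework API):

using System;
using System.IO;

internal static class StreamExtensions
{
    // Reads exactly 'count' bytes, looping because Read may legally return
    // fewer bytes than requested (e.g. only 1-3 bytes of a 4-byte header).
    public static Byte[] ReadExactly(Stream stream, Int32 count)
    {
        var buffer = new Byte[count];
        var offset = 0;
        while (offset < count)
        {
            var read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-frame.");
            offset += read;
        }
        return buffer;
    }
}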

Since TCP is a reliable protocol, this is not due to packet loss. A lost packet results in one of two things:

  • The missing data is retransmitted; the receiver experiences a short pause, but never sees missing or out-of-order data.
  • The socket is closed.

UPDATE

Your IsComplete property returns true when only part of the length prefix has been written to the buffer (DataLength is still 0, so RemainingDataBytesToWrite is 0). This causes your ProcessData() code to discard the length bytes already received and go out of sync.
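
A minimal sketch of one possible fix, keeping the question's ReceiveBuffer class as-is otherwise: make IsComplete require the full length prefix before reporting completion.

public Boolean IsComplete
{
    get
    {
        // A message is complete only once all 4 length-prefix bytes have
        // arrived AND all DataLength payload bytes have been written. Without
        // the first check, a buffer holding 1-3 header bytes (DataLength still
        // 0, so RemainingDataBytesToWrite is 0) reports itself complete, the
        // partial header is discarded, and the stream goes out of sync.
        return RemainingLengthBytesToWrite == 0 && RemainingDataBytesToWrite == 0;
    }
}

Zero-length keep-alive messages still work with this change: once all four header bytes are in and the length is 0, both remaining counts are 0 and the buffer is complete.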


I don't know if you've ever heard of network congestion, but it seems to be at least part of your problem. Look at the amount of work you do whenever data arrives (read: your ProcessData method). It blocks everything else for as long as it has control, and it even runs recursively.

This means the more data you have to process, the longer the method takes to return, and in the meantime you cannot service other incoming data. So the buffer of your local network card fills up, any routers along the path start buffering packets for you, and so on. Packets get dropped and retransmitted, and your network stays clogged as long as you cannot read fast enough. This is the aforementioned network congestion.

Another thing that jumped right out at me in your code is the locking. Why are you taking a lock when working asynchronously? You have to understand that your SAEA object is the state object for the async receive operation; it is meant to take that threading pain away from you. You basically call the socket's ReceiveAsync method and hand it the SAEA object; when the receive completes, you take the buffer and information from it and hand it back to ReceiveAsync again. Processing should happen in another method that has nothing to do with reading from the socket. That way you free up your socket quickly.
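
A minimal sketch of that decoupling, using the socket and handler names from the question plus a queue and worker of my own (the BlockingCollection and the Task are my additions for illustration, with error handling omitted):

using System;
using System.Collections.Concurrent;
using System.Net.Sockets;
using System.Threading.Tasks;

// Inside the client class from the question:
private readonly BlockingCollection<Byte[]> _pending = new BlockingCollection<Byte[]>();

private void SocketReceive(Object sender, SocketAsyncEventArgs e)
{
    do
    {
        // Copy the received bytes out and hand them to the worker queue, so
        // the socket is re-armed immediately and never waits on processing.
        var chunk = new Byte[e.BytesTransferred];
        Array.Copy(e.Buffer, e.Offset, chunk, 0, e.BytesTransferred);
        _pending.Add(chunk);
    }
    while (!socket.ReceiveAsync(e)); // false = completed synchronously; loop to consume it
}

// Started once, e.g. in the constructor; drains the queue off the socket path.
private void StartProcessingWorker()
{
    Task.Run(() =>
    {
        foreach (var chunk in _pending.GetConsumingEnumerable())
            ProcessData(chunk, chunk.Length);
    });
}

With a single worker draining the queue, the lock around ProcessData is no longer needed, and messages are still processed in arrival order.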

And one last thing: I don't know what your application's purpose is, but TCP is usually not recommended for large volumes of data arriving at high rates. That is why fast-paced network game engines use UDP. Even so, those engines typically send around 20 packets per second. If your code breaks down at half-second intervals, you might consider switching to UDP. I found Gaffer on Games to be a good source of information on real-time networking with UDP; the way he explains the concepts may be useful to you.


Source: https://habr.com/ru/post/1391929/

