Python UDP socket semi-randomly fails to receive

I have a problem with an application, and I assume the cause is in the code.

The app is used to ping some custom network devices to check that they are alive. It pings them every 20 seconds with a special UDP packet and waits for a response. If a device fails to respond to 3 consecutive pings, the application sends a warning message to the staff.

The application runs around the clock, and a random number of times a day (mostly 2-5) it stops receiving UDP packets for exactly 10 minutes, after which everything returns to normal. During these 10 minutes only 1 device seems to respond; the others appear dead. That much I could deduce from the logs.

I used Wireshark to sniff the packets and verified that the ping packets are both going out AND coming back in, so the network part seems to work fine, right down to the OS. The computers run WinXP Pro, and some have no firewall configured at all. I see this problem on different computers, different Windows installations, and different networks.

I really don't understand what could be the problem here.

I am attaching the relevant piece of code, which does all the networking. It runs in a separate thread from the rest of the application.

Thanks in advance for any insight you can offer.

    def monitor(self):
        checkTimer = time()
        while self.running:
            read, write, error = select.select([self.commSocket],[self.commSocket],[],0)
            if self.commSocket in read:
                try:
                    data, addr = self.commSocket.recvfrom(1024)
                    self.processInput(data, addr)
                except:
                    pass

            if time() - checkTimer > 20: # every 20 seconds
                checkTimer = time()
                if self.commSocket in write:
                    for rtc in self.rtcList:
                        try:
                            addr = (rtc, 7) # port 7 is the echo port
                            self.commSocket.sendto('ping',addr)
                            if not self.rtcCheckins[rtc][0]: # if last check was a failure
                                self.rtcCheckins[rtc][1] += 1 # incr failure count
                            self.rtcCheckins[rtc][0] = False # setting last check to failure
                        except:
                            pass

            for rtc in self.rtcList:
                if self.rtcCheckins[rtc][1] > 2: # didn't answer for a whole minute
                    self.rtcCheckins[rtc][1] = 0
                    self.sendError(rtc)
+6
2 answers

You did not mention this, so I have to remind you: since you are using select(), the socket had better be non-blocking. Otherwise your recvfrom() can block. It should not actually happen when things are handled correctly, but that is hard to tell from a short piece of code.

Next, you do not need to check a UDP socket for writability; it is always writable.
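A minimal sketch of these two points, assuming Python 3 (the code in the question looks like Python 2). The helper names `make_nonblocking_udp_socket`, `wait_readable` and `try_recv`, as well as the 0.1 s timeout, are made up for illustration and do not come from the question's code:

```python
import select
import socket


def make_nonblocking_udp_socket(bind_addr=("127.0.0.1", 0)):
    """Create a UDP socket whose recvfrom() never blocks."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)  # recvfrom() raises instead of waiting
    sock.bind(bind_addr)
    return sock


def wait_readable(sock, timeout=0.1):
    """Wait for a queued datagram; only the read set is passed to select()."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return sock in readable


def try_recv(sock, bufsize=1024):
    """Return (data, addr), or None when nothing is queued."""
    try:
        return sock.recvfrom(bufsize)
    except BlockingIOError:  # EWOULDBLOCK / EAGAIN on a non-blocking socket
        return None
```

With a non-blocking socket, a spurious "readable" result from select() costs one failed recvfrom() instead of stalling the monitoring thread.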

Now for the real problem: you say the packets reach the system, but your code does not receive them. This is most likely due to an overflow of the socket receive buffer. Has the number of ping targets increased over the past 15 years? You are setting yourself up for a storm of ping responses, and you are probably not reading those responses fast enough, so they pile up in the receive buffer and eventually get dropped.

My suggestions in ROI order:

  • Spread out the ping requests; do not set yourself up for a DDoS. Ping, say, one system per iteration, and keep the time each target was last checked. This evens out the number of packets going out and coming in.
  • Increase SO_RCVBUF to a large value. This lets your network stack cope better with packet bursts.
  • Read the packets in a loop, i.e. as soon as your UDP socket is readable (assuming it is non-blocking), keep reading until you get EWOULDBLOCK. This will save you a ton of select() calls.
  • See whether you can use some advanced Windows API along the lines of Linux recvmmsg(2), if such a thing exists, to dequeue multiple packets per system call.
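The first three suggestions could be sketched roughly like this, assuming Python 3 and a socket that is already non-blocking. The names `enlarge_receive_buffer`, `drain` and `StaggeredPinger` and the 1 MiB buffer size are hypothetical choices for illustration, not part of the question's code:

```python
import socket
import time


def enlarge_receive_buffer(sock, size=1 << 20):
    """Request a bigger receive buffer; the OS may clamp or adjust the value."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def drain(sock, handler, bufsize=1024):
    """Read every queued datagram from a non-blocking socket.

    Stops when the socket would block (EWOULDBLOCK) and returns
    the number of datagrams passed to handler(data, addr).
    """
    count = 0
    while True:
        try:
            data, addr = sock.recvfrom(bufsize)
        except BlockingIOError:
            return count
        handler(data, addr)
        count += 1


class StaggeredPinger:
    """Ping one target per tick so that replies do not arrive in one burst."""

    def __init__(self, sock, targets, period=20.0):
        self.sock = sock
        self.targets = list(targets)  # (host, port) tuples
        # Spread the targets evenly across one full period.
        self.interval = period / max(len(self.targets), 1)
        self.next_index = 0
        self.next_due = time.monotonic()
        self.last_sent = {}  # target -> time its last ping was sent

    def tick(self, now=None):
        """Send at most one ping if one is due; return that target or None."""
        now = time.monotonic() if now is None else now
        if not self.targets or now < self.next_due:
            return None
        target = self.targets[self.next_index]
        self.sock.sendto(b"ping", target)
        self.last_sent[target] = now
        self.next_index = (self.next_index + 1) % len(self.targets)
        self.next_due = now + self.interval
        return target
```

In a monitoring loop, one would call `tick()` on every iteration and `drain()` whenever select() reports the socket as readable. With 20 targets and a 20-second period this sends one ping per second instead of 20 pings at once.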

Hope this helps.

+3

UDP does not guarantee reliable delivery. It may work now, for the next hour, and for the next year. Then, two years from now, it may fail to get through for an hour.

The route a packet takes may be blocked in some situations. When this happens with TCP, the sender is informed of the loss and can try sending again over a different route. Since UDP is a "send and forget" protocol, you can statistically expect to lose some of your packets.

tl;dr: Use TCP.

0

Source: https://habr.com/ru/post/920781/

