How to stop an HTTP POST made with urlopen from urllib2 when a timeout occurs in Python?

Overview

I am using urlopen from urllib2 in Python 2.7.1 to perform an HTTP POST from a computer running Windows XP to a remote Apache web server (e.g. Mac OS X's built-in Web Sharing). The data sent contains an identifier, the data itself and a checksum; if all the data arrives, the server responds with a confirmation. The server can use the checksum in the data to check that everything is in order.
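For illustration, a minimal sketch of what such a payload might look like; the field names and the use of MD5 are my own assumptions, not part of the actual protocol:

    from urllib import urlencode
    import hashlib

    # Hypothetical payload layout: identifier, data, checksum.
    data = "some payload"
    payload = urlencode({
        'id': 'device-42',                          # identifier (made up)
        'data': data,                               # the data itself
        'checksum': hashlib.md5(data).hexdigest(),  # server verifies this
    })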

Problem

This usually works fine, but sometimes the Internet connection is bad, often because the client sending the data is on a Wi-Fi or 3G connection. This leads to loss of connectivity for some arbitrary amount of time. urlopen has a timeout parameter to make sure that this does not block the program and that it can continue.

This is what I want, but the problem is that urlopen does not stop the socket from sending whatever data it still had queued when the timeout occurred. I tested this (with the code that I will show below) by trying to send a large chunk of data to my laptop: I would see network activity on both machines, then I would disable the wireless connection on the laptop, wait until the function raised its exception, and then re-enable the wireless connection; the data transfer would simply continue, but the program was no longer listening for the response. I even tried exiting the Python interpreter, and the data kept being sent, so Windows must be handling this somewhere at the OS level.

Causes

The timeout (as I understand it) works as follows: it checks the "idle time".
( [Python-Dev] Adding a socket timeout to urllib2 )
If you set the timeout to 3, it will open the connection, start the timer, then try to send the data and wait for a response; if at any point the timer runs out before a response has been received, a timeout exception is raised. Note that sending data does not appear to count as "activity" that resets the timeout timer.
( urllib2 times out, but does not close the socket connection )
( Close urllib2 connection )
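To illustrate this behavior, a minimal sketch of my own (the URL is just an example): urlopen passes the timeout to the underlying socket, as with sock.settimeout(), so it bounds each blocking operation rather than the total transfer time:

    import socket
    from urllib2 import urlopen, URLError

    try:
        f = urlopen("http://www.example.com/", timeout=3)
        print f.read(100)
    except URLError, e:
        # in Python 2 the timeout usually surfaces as a URLError
        # wrapping socket.timeout
        print "timed out while connecting/waiting:", e.reason
    except socket.timeout:
        # a timeout during a read can also be raised directly
        print "timed out during read"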

Apparently it is documented somewhere that when a socket is closed / dereferenced / garbage-collected, its close function is called, which lets any remaining buffered data be sent before the socket is actually closed. However, there is also a shutdown function that is supposed to terminate the socket immediately, preventing any further data from being sent.
( socket.shutdown vs socket.close )
( http://docs.python.org/library/socket.html#socket.socket.close )
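A minimal sketch of the difference with a raw socket (the host is just an example):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("www.example.com", 80))
    s.sendall("GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")

    # close() only drops this process's reference; the OS will still try
    # to deliver any outgoing data that is buffered.
    # shutdown() tells the kernel to stop communication immediately.
    s.shutdown(socket.SHUT_RDWR)
    s.close()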

What I want

I want the connection to be "shut down" when a timeout occurs. Otherwise my client cannot determine whether the data was received correctly or not and might try to send it again. I would rather just kill the connection and try again later, knowing that the data was (probably) not sent successfully (the server can recognize this if the checksum does not match).

Here is the piece of code that I used to test this. Try it... apart from the timeout issue it still doesn't quite work as I expected, so any help with that is also appreciated. As I said, I want the program to shut down the socket as soon as the timeout exception (or any other exception) occurs.

    from urllib import urlencode
    from urllib2 import urlopen, HTTPError, URLError
    import socket

    class Uploader:
        def __init__(self):
            self.URL = "http://.../"  # URL elided
            self.data = urlencode({'fakerange': range(0, 2000000, 1)})
            print "Data Generated"

        def upload(self):
            try:
                f = urlopen(self.URL, self.data, timeout=10)
                returncode = f.read()  # the server's confirmation code
            except (URLError, HTTPError), msg:
                returncode = str(msg)
            except socket.error:
                returncode = "Socket Timeout!"
            return returncode

    def main():
        upobj = Uploader()
        returncode = upobj.upload()
        if returncode == '100':
            print "Success!"
        else:
            print "Maybe a Fail"
            print returncode
        print "The End"

    if __name__ == '__main__':
        main()
+6
5 answers

It turns out that calling sock.shutdown(socket.SHUT_RDWR) and close() on the HTTPConnection that is uploading does not stop the upload. It keeps running in the background. I don't know of any more reliable / direct way to kill a connection in Python using urllib2 or httplib.
In the end, we do the upload with urllib2 without a timeout. This means that over a slow connection the upload (POST) may take a very long time, but at least we will know whether it worked or not. There is a risk that urlopen will hang forever because there is no timeout, but we tested various bad-connection scenarios, and in all cases urlopen either worked or returned an error after some time.
This means that at least the client side knows whether the upload succeeded or failed, and that it does not keep running in the background.
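A minimal sketch of that approach (the helper and error handling are my own illustration, not our actual code):

    from urllib2 import urlopen, HTTPError, URLError

    def upload(url, data):
        try:
            f = urlopen(url, data)  # note: no timeout argument at all
            return f.read()         # blocks until the server confirms
        except (URLError, HTTPError), msg:
            return str(msg)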

0

I found code in this thread that may help you:

    from urllib2 import urlopen
    from threading import Timer

    url = "http://www.python.org"

    def handler(fh):
        fh.close()

    fh = urlopen(url)
    t = Timer(20.0, handler, [fh])  # close the response after 20 seconds
    t.start()
    data = fh.read()
    t.cancel()
+1

You can use a different API than urllib2. httplib is a little less pleasant, but often not that bad. Crucially, it lets you access the underlying socket object. So you can do something like:

    import httplib
    import socket
    from urllib import urlencode

    def upload(host, path, data):
        # HTTPConnection(host, port, strict, timeout)
        conn = httplib.HTTPConnection(host, 80, True, 3)
        try:
            conn.request('POST', path, data)
            response = conn.getresponse()
            if response.status != 200:
                # maybe an HTTP error
                return response.status
            else:
                return response.read()
        except socket.error:
            return "Socket Timeout!"
        finally:
            # shut the socket down explicitly instead of just closing it
            if conn.sock is not None:
                conn.sock.shutdown(socket.SHUT_RDWR)
            conn.close()

    def main():
        data = urlencode({'fakerange': range(0, 2000000, 1)})
        returncode = upload("www.server.com", "/path/to/endpoint", data)
        ...

(Disclaimer: Unverified)

httplib has various limitations compared to urllib2; it will not automatically handle things like redirects, for example. But if you are using it to talk to a relatively fixed API, rather than fetching arbitrary things from the Internet, it should do the job fine.

Honestly, I probably would not do this myself; I am generally content to let the operating system deal with TCP buffers however it wants to, even if its approach is not always completely optimal...

+1

If calling socket.shutdown is really the only way to stop sending data after a timeout, I think you will need to resort to some kind of monkey-patching. urllib2 doesn't really give you this kind of fine-grained socket management.

Check out the question Source interface with Python and urllib2 for a good approach.
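For example, a sketch of what such patching could look like: a custom handler that keeps a reference to the socket urllib2 opens, so it can be shut down on timeout. This is my own illustration of the idea, not code from the linked question; the class names are made up:

    import socket
    import httplib
    import urllib2

    class TrackedHTTPConnection(httplib.HTTPConnection):
        # class-level reference to the most recently opened socket
        last_socket = None

        def connect(self):
            httplib.HTTPConnection.connect(self)
            TrackedHTTPConnection.last_socket = self.sock

    class TrackedHTTPHandler(urllib2.HTTPHandler):
        def http_open(self, req):
            # plug our connection class into urllib2's machinery
            return self.do_open(TrackedHTTPConnection, req)

    opener = urllib2.build_opener(TrackedHTTPHandler)
    try:
        f = opener.open("http://www.example.com/", "some data", timeout=10)
        print f.read()
    except Exception:
        sock = TrackedHTTPConnection.last_socket
        if sock is not None:
            sock.shutdown(socket.SHUT_RDWR)  # stop sending immediately
            sock.close()
        raise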

0

You could run the upload in a separate process using multiprocessing and then terminate that process when you detect the timeout (a URLError exception with the message "urlopen error timed out").

Stopping the process should be sufficient to close the socket.
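A minimal sketch of that idea (the helper, URL and 10-second limit are assumptions):

    from multiprocessing import Process
    from urllib2 import urlopen

    def do_upload(url, data):
        f = urlopen(url, data)
        print f.read()

    if __name__ == '__main__':
        p = Process(target=do_upload, args=("http://www.example.com/", "some data"))
        p.start()
        p.join(10)          # give the upload 10 seconds
        if p.is_alive():
            p.terminate()   # killing the process closes its sockets
            p.join()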

0

Source: https://habr.com/ru/post/900942/

