Force IPv4 with Python urllib2

I am running a Python script that uses urllib2 to fetch data from a weather API and display it on screen. The problem is that when I query the server I get the error "no address associated with hostname". I can view the API output in a web browser, and I can fetch the file with wget, but I have to force IPv4 for that to work. Is it possible to force IPv4 in urllib2 when using urllib2.urlopen?
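For reference, here is a minimal sketch of the kind of call I am making (the URL is a placeholder, not the real weather API):

    import urllib2

    # Placeholder URL; the real request goes to the weather API.
    url = 'http://api.example.com/weather?q=London'
    data = urllib2.urlopen(url).read()   # fails: "no address associated with hostname"
    print data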

+5
2 answers

Not directly, no.

So what can you do?


One possibility is to explicitly resolve the hostname to an IPv4 address yourself, and then use that address instead of the name as the host. For instance:

    import socket
    import urllib2

    host = socket.gethostbyname('example.com')
    page = urllib2.urlopen('http://{}/path'.format(host))

However, some virtually-hosted sites require a Host: example.com header, and with this approach they will receive Host: 93.184.216.119 instead. You can work around that by overriding the header:

    host = socket.gethostbyname('example.com')
    request = urllib2.Request('http://{}/path'.format(host),
                              headers={'Host': 'example.com'})
    page = urllib2.urlopen(request)

Alternatively, you can provide your own handlers instead of the standard ones. But the standard handler is basically a wrapper around httplib.HTTPConnection, and the real problem is in HTTPConnection.connect.

So the clean way to do this is to create your own subclass of httplib.HTTPConnection that overrides connect, along these lines:

    def connect(self):
        # Resolve the hostname to an IPv4 address before connecting.
        host = socket.gethostbyname(self.host)
        self.sock = socket.create_connection((host, self.port),
                                             self.timeout, self.source_address)
        if self._tunnel_host:
            self._tunnel()

Then create your own subclass of urllib2.HTTPHandler that overrides http_open to use that connection class:

    def http_open(self, req):
        return self.do_open(MyHTTPConnection, req)

... and similarly for HTTPSHandler, and then wire everything up properly, as shown in the urllib2 documentation.
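For concreteness, here is a minimal sketch of how the pieces might fit together for plain HTTP (the class names MyHTTPConnection and MyHTTPHandler are made up for this example; HTTPS would follow the same pattern with httplib.HTTPSConnection and urllib2.HTTPSHandler):

    import httplib
    import socket
    import urllib2

    class MyHTTPConnection(httplib.HTTPConnection):
        def connect(self):
            # Resolve the hostname to an IPv4 address, then connect to it.
            host = socket.gethostbyname(self.host)
            self.sock = socket.create_connection((host, self.port),
                                                 self.timeout, self.source_address)
            if self._tunnel_host:
                self._tunnel()

    class MyHTTPHandler(urllib2.HTTPHandler):
        def http_open(self, req):
            return self.do_open(MyHTTPConnection, req)

    # Build an opener that routes http:// requests through the IPv4-only
    # connection class; install_opener makes it the default for urlopen().
    opener = urllib2.build_opener(MyHTTPHandler)
    urllib2.install_opener(opener)
    page = urllib2.urlopen('http://example.com/path')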

A quick and dirty way to do the same thing is simply to monkeypatch httplib.HTTPConnection.connect with a function like the one above.
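As a sketch, that monkeypatch could look something like this (reusing the IPv4-resolving connect from above):

    import httplib
    import socket

    def connect_ipv4(self):
        # Same logic as the subclass above: resolve to IPv4, then connect.
        host = socket.gethostbyname(self.host)
        self.sock = socket.create_connection((host, self.port),
                                             self.timeout, self.source_address)
        if self._tunnel_host:
            self._tunnel()

    # Quick and dirty: every httplib.HTTPConnection now resolves over IPv4.
    httplib.HTTPConnection.connect = connect_ipv4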


Finally, you can use a different library instead of urllib2. From what I remember, requests doesn't make this any easier (you end up overriding or monkeypatching a slightly different set of methods, but it comes down to the same thing). However, any libcurl wrapper will let you do the equivalent of curl_easy_setopt(h, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4).
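For example, with pycurl (assuming pycurl is the libcurl wrapper you choose), it might look like this:

    import pycurl
    from StringIO import StringIO  # Python 2; on Python 3 use io.BytesIO

    buf = StringIO()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, 'http://example.com/path')
    c.setopt(pycurl.IPRESOLVE, pycurl.IPRESOLVE_V4)  # resolve names over IPv4 only
    c.setopt(pycurl.WRITEFUNCTION, buf.write)
    c.perform()
    c.close()
    page = buf.getvalue()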

+12

Not a proper answer, but an alternative: shell out to curl?

    import subprocess
    import sys

    def log_error(msg):
        sys.stderr.write(msg + '\n')

    def curl(url):
        # The -4 flag forces curl to resolve names to IPv4 addresses only.
        process = subprocess.Popen(
            ["curl", "-fsSkL4", url],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        stdout, stderr = process.communicate()
        if process.returncode == 0:
            return stdout
        else:
            log_error("Failed to fetch: %s" % url)
            log_error(stderr)
            sys.exit(3)
0

Source: https://habr.com/ru/post/1494864/
