I use the following Python code to load web pages from servers that support gzip compression:
import gzip
import urllib2
from StringIO import StringIO

url = "http://www.v-gn.de/wbb/"

# Ask the server for a gzip-compressed response.
request = urllib2.Request(url)
request.add_header('Accept-encoding', 'gzip')
response = urllib2.urlopen(request)
content = response.read()
response.close()

# Decompress the gzip-encoded body.
html = gzip.GzipFile(fileobj=StringIO(content)).read()
This works in general, but for the URL above it fails with a struct.error. I get a similar failure when I use wget with the "Accept-Encoding" header. However, browsers seem to be able to decompress the response.
So my question is: is there a way to get my Python code to decompress the HTTP response without resorting to disabling compression by removing the "Accept-encoding" header?
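One direction that might be worth trying is sketched below. It assumes, without my having confirmed it for this server, that the struct.error comes from a gzip stream whose trailing CRC/length fields are truncated or missing, which GzipFile rejects. The helper name lenient_gunzip is hypothetical; it only uses the standard zlib module, whose decompression objects return whatever could be decoded instead of raising at a premature end of input. It will not help if the compressed data itself is corrupted.

```python
import gzip
import io
import zlib


def lenient_gunzip(data):
    """Decompress gzip bytes, returning whatever could be decoded.

    Hypothetical helper: assumes the only problem is a missing or
    truncated gzip trailer, not corrupted compressed data.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    # A decompression object does not raise when the input ends early;
    # it returns the bytes decoded so far.
    return decompressor.decompress(data)


# Self-check on a deliberately truncated stream (8-byte trailer cut off).
original = b"<html>hello</html>" * 100
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as f:
    f.write(original)
truncated = buf.getvalue()[:-8]
recovered = lenient_gunzip(truncated)
```

In the code above, this would mean replacing the GzipFile line with html = lenient_gunzip(content). Because the trailer is never checked in the truncated case, the result is not verified against the CRC.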
For completeness, here is the line I use for wget:
wget --user-agent="Mozilla" --header="Accept-Encoding: gzip,deflate" http://www.v-gn.de/wbb/