How to fix "ValueError: read of closed file" exception?

This simple Python 3 script:

    import urllib.request

    host = "scholar.google.com"
    link = "/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
    url = "http://" + host + link
    filename = "cite0.bib"

    print(url)
    urllib.request.urlretrieve(url, filename)

throws this exception:

    Traceback (most recent call last):
      File "C:\Users\ricardo\Desktop\Google-Scholar\BibTex\test2.py", line 8, in <module>
        urllib.request.urlretrieve(url, filename)
      File "C:\Python32\lib\urllib\request.py", line 150, in urlretrieve
        return _urlopener.retrieve(url, filename, reporthook, data)
      File "C:\Python32\lib\urllib\request.py", line 1597, in retrieve
        block = fp.read(bs)
    ValueError: read of closed file

I thought this was a temporary issue, so I added some simple exception handling:

    import random
    import time
    import urllib.request

    host = "scholar.google.com"
    link = "/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
    url = "http://" + host + link
    filename = "cite0.bib"

    print(url)
    while True:
        try:
            print("Downloading...")
            time.sleep(random.randint(0, 5))
            urllib.request.urlretrieve(url, filename)
            break
        except ValueError:
            pass

but it just prints "Downloading..." ad infinitum.

1 answer

Your URL returns a 403 error code, and urllib.request.urlretrieve does not handle HTTP errors well: it uses urllib.request.FancyURLopener, whose http_error_default returns the error page as if it were a normal response instead of raising an exception.
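To see why this matters, here is a minimal sketch showing that urllib.request.urlopen does raise on a 403. It runs against a throwaway local server that always answers 403 (the server and its port are illustrative stand-ins for the Google Scholar URL, which requires network access):

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Always403(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_error(403)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to an ephemeral port and serve in the background
server = HTTPServer(("127.0.0.1", 0), Always403)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

try:
    urllib.request.urlopen(url)
    status = None
except urllib.error.HTTPError as e:
    status = e.code  # urlopen surfaces the 403 as an exception

print("urlopen raised HTTPError:", status)
server.shutdown()
```

With the same 403 response, urlretrieve would have quietly handed back the error page, which is what eventually triggers the confusing "read of closed file" failure.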

As a fix, if you still want to use urlretrieve, you can override FancyURLopener like this (the code is also set up to surface the error):

    import urllib.request
    from urllib.request import FancyURLopener

    class FixFancyURLOpener(FancyURLopener):
        def http_error_default(self, url, fp, errcode, errmsg, headers):
            if errcode == 403:
                raise ValueError("403")
            return super(FixFancyURLOpener, self).http_error_default(
                url, fp, errcode, errmsg, headers
            )

    # Monkey patch, so urlretrieve uses the fixed opener
    urllib.request.FancyURLopener = FixFancyURLOpener

    url = "http://scholar.google.com/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
    urllib.request.urlretrieve(url, "cite0.bib")

Otherwise, and this is what I recommend, you can use urllib.request.urlopen like this:

    import urllib.request

    url = "http://scholar.google.com/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
    fp = urllib.request.urlopen(url)
    with open("cite0.bib", "wb") as fo:  # fp.read() returns bytes, so open in binary mode
        fo.write(fp.read())
    fp.close()
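If you still want the retry behaviour from the question, here is a hedged sketch of that loop rebuilt around urlopen, so a persistent 403 fails loudly after a few attempts instead of looping forever (fetch, retries, and delay are illustrative names, not part of urllib):

```python
import time
import urllib.error
import urllib.request

def fetch(url, filename, retries=3, delay=2):
    """Download url to filename; give up after `retries` HTTP errors."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url) as fp:
                with open(filename, "wb") as fo:  # response body is bytes
                    fo.write(fp.read())
            return True
        except urllib.error.HTTPError as e:
            # Unlike urlretrieve, urlopen raises here, so we can see the code
            print("attempt %d failed: HTTP %d" % (attempt + 1, e.code))
            time.sleep(delay)
    return False
```

The key difference from the question's while True loop is that the HTTP error is visible and bounded: you learn it is a 403 on the first attempt rather than silently retrying a ValueError forever.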

