Stream large binary files from urllib2 to file

I use the following code to stream large files from the Internet to a local file:

    fp = open(file, 'wb')
    req = urllib2.urlopen(url)
    for line in req:
        fp.write(line)
    fp.close()

This works, but it downloads rather slowly. Is there a faster way? (The files are large, so I do not want to keep them in memory.)

+49
python file urllib2 streaming
04 Oct '09 at 10:50
4 answers

There is no reason to work line by line (the chunks are small, and it requires Python to find the line breaks for you!). Just read the data in bigger chunks, for example:

    # from urllib2 import urlopen  # Python 2
    from urllib.request import urlopen  # Python 3

    response = urlopen(url)
    CHUNK = 16 * 1024
    with open(file, 'wb') as f:
        while True:
            chunk = response.read(CHUNK)
            if not chunk:
                break
            f.write(chunk)

Experiment a bit with different values of CHUNK to find the "sweet spot" for your requirements.
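
One way to run that experiment is to time the same copy loop at several chunk sizes. The sketch below is only an illustration and assumes Python 3; `copy_in_chunks` and `time_chunk_sizes` are hypothetical helper names, not part of any library. It takes a callable that opens a fresh source stream for each run, so it can be pointed at `urlopen(url)` or at an in-memory buffer, and it writes to the null device so that nothing is kept in memory.

```python
import os
import time


def copy_in_chunks(src, dst, chunk_size=16 * 1024):
    """Copy src to dst in fixed-size reads; return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty bytes object means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def time_chunk_sizes(open_source, sizes):
    """Time one full copy per chunk size; return {chunk_size: seconds}.

    open_source is a zero-argument callable that returns a fresh readable
    binary stream, for example: lambda: urlopen(url)
    """
    timings = {}
    for size in sizes:
        src = open_source()
        try:
            # Write to the null device so only the read side is measured.
            with open(os.devnull, "wb") as sink:
                start = time.perf_counter()
                copy_in_chunks(src, sink, size)
                timings[size] = time.perf_counter() - start
        finally:
            src.close()
    return timings
```

For a real measurement, something like `time_chunk_sizes(lambda: urlopen(url), [4096, 16384, 65536])` could be used. Network conditions vary between runs, so it is worth repeating each size a few times before drawing conclusions.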

+92
Oct 04 '09 at 23:42

You can also use shutil:

    import shutil
    try:
        from urllib.request import urlopen  # Python 3
    except ImportError:
        from urllib2 import urlopen  # Python 2

    def get_large_file(url, file, length=16*1024):
        req = urlopen(url)
        with open(file, 'wb') as fp:
            shutil.copyfileobj(req, fp, length)
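
In Python 3 the response object returned by `urlopen` is also a context manager, so the connection can be closed as soon as the copy finishes. A minimal sketch along those lines, assuming Python 3 only (`download_to_file` is a hypothetical name, not a library function):

```python
import shutil
from urllib.request import urlopen


def download_to_file(url, path, length=16 * 1024):
    """Stream url to path in length-byte chunks, closing both ends afterwards."""
    # Both objects are context managers, so the response and the file
    # are closed even if the copy raises part-way through.
    with urlopen(url) as response, open(path, "wb") as fp:
        shutil.copyfileobj(response, fp, length)
```

As with the version above, only one `length`-sized buffer is held in memory at a time, regardless of the size of the file.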
+58
Mar 22 '11 at 20:28

I have used the mechanize module and its Browser.retrieve() method. In the past it used 100% CPU and downloaded very slowly, but this bug has been fixed in recent versions and it now works very quickly.

Example:

    import mechanize
    browser = mechanize.Browser()
    browser.retrieve('http://www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.32-rc1.tar.bz2', 'Downloads/my-new-kernel.tar.bz2')

mechanize is based on urllib2, so urllib2 may also have a similar method... but I cannot find one right now.

+6
04 Oct '09 at 23:07

You can use urllib.urlretrieve() (urllib.request.urlretrieve() in Python 3) to download files:

Example:

    try:
        from urllib import urlretrieve  # Python 2
    except ImportError:
        from urllib.request import urlretrieve  # Python 3

    url = "http://www.examplesite.com/myfile"
    urlretrieve(url, "./local_file")
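
`urlretrieve` also accepts an optional `reporthook` callback, which it calls with the block number, the block size, and the total size (-1 when the server sends no `Content-Length`). The sketch below, assuming Python 3, uses that to print progress for a large download; `percent_done`, `show_progress` and `download_with_progress` are hypothetical helper names. Note that the Python 3 documentation lists `urlretrieve` as part of a legacy interface that might be deprecated in the future.

```python
import sys
from urllib.request import urlretrieve


def percent_done(block_num, block_size, total_size):
    """Return download progress as a percentage, or None if the size is unknown."""
    if total_size <= 0:
        return None
    # The last block is usually partial, so cap the estimate at 100.
    return min(100.0, block_num * block_size * 100.0 / total_size)


def show_progress(block_num, block_size, total_size):
    """reporthook for urlretrieve: write a one-line progress indicator to stderr."""
    pct = percent_done(block_num, block_size, total_size)
    if pct is not None:
        sys.stderr.write("\r%5.1f%%" % pct)
        sys.stderr.flush()


def download_with_progress(url, path):
    """Download url to path, reporting progress as blocks arrive."""
    return urlretrieve(url, path, reporthook=show_progress)
```

Calling `download_with_progress(url, "./local_file")` behaves like the plain `urlretrieve` call, with a percentage written to stderr whenever the total size is known.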
+3
Aug 27 '14 at 16:34