Is there a way to download many files in parallel using Python? This code is fast enough for about 100 files, but I need to download 300,000. Obviously, they are all very small files (or I wouldn't be downloading 300,000 of them :)), so the sequential loop seems to be the real bottleneck. Does anyone have any thoughts? Maybe use MPI or streams?
Do I just have to live with this bottleneck, or is there a faster way, maybe not even using Python?
(I included the full start of the code just for completeness)
from __future__ import division
import pandas as pd
import numpy as np
import urllib2
import os
import linecache
data = pd.read_csv("edgar.csv")
datatemp2 = data[data['form'].str.contains("14A")]
datatemp3 = data[data['form'].str.contains("14C")]
data2 = datatemp2.append(datatemp3)
flist = np.array(data2['filename'])
print len(flist)
print flist
original = os.getcwd()  # strings have no .copy() method; just keep the value
os.chdir(os.path.join(os.getcwd(), 'edgar14A14C'))
for i in xrange(len(flist)):
    url = "ftp://ftp.sec.gov/" + str(flist[i])
    file_name = url.split('/')[-1]
    u = urllib2.urlopen(url)
    f = open(file_name, 'wb')
    f.write(u.read())
    f.close()
    print i
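For what it's worth, the loop above fetches one file at a time, so almost all of the elapsed time is spent waiting on the network rather than on CPU. A common fix is to issue many requests concurrently with a thread pool. Here is a minimal sketch assuming Python 3's `concurrent.futures`; the `download` function body and the URL list are placeholders standing in for the real `urllib` fetch-and-write:

```python
from concurrent.futures import ThreadPoolExecutor

def download(url):
    # Placeholder: in the real script this would do
    # urllib.request.urlopen(url).read() and write the bytes to a file.
    return len(url)

# Hypothetical URL list standing in for the 300,000 real ones.
urls = ["ftp://ftp.sec.gov/file%d.txt" % i for i in range(10)]

# Run up to 8 downloads concurrently; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(download, urls))
```

Because the work is I/O-bound, threads (not processes or MPI) are usually enough; the GIL is released while a socket read is blocked, so 8-16 workers can overlap their waits. Note that a single FTP server may also throttle or reject too many simultaneous connections, so the worker count is something to tune.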