I'm learning how to retrieve data from a website as quickly as possible. To get a higher speed, I am considering using multithreading. Here is the code I used to compare a multi-threaded POST with a simple POST.
    import threading
    import time
    import urllib
    import urllib2


    class Post:

        def __init__(self, website, data, mode):
            self.website = website
            self.data = data
            # mode is either "Simple" (simple POST) or "Multiple" (multi-threaded POST)
            self.mode = mode

        def post(self):
            # post data
            req = urllib2.Request(self.website)
            open_url = urllib2.urlopen(req, self.data)
            if self.mode == "Multiple":
                time.sleep(0.001)
            # read HTMLData
            HTMLData = open_url.read()
            print "OK"


    if __name__ == "__main__":
        current_post = Post("http://forum.xda-developers.com/login.php",
                            "vb_login_username=test&vb_login_password&securitytoken=guest&do=login", \
                            "Simple")

        # save the time before posting data
        origin_time = time.time()

        if(current_post.mode == "Multiple"):
            # multithreading POST
            for i in range(0, 10):
                thread = threading.Thread(target = current_post.post)
                thread.start()
                thread.join()
            # calculate the time interval
            time_interval = time.time() - origin_time
            print time_interval

        if(current_post.mode == "Simple"):
            # simple POST
            for i in range(0, 10):
                current_post.post()
            # calculate the time interval
            time_interval = time.time() - origin_time
            print time_interval
As you can see, the code is very simple. First I set the mode to "Simple" and got a time interval of 50 seconds (maybe my connection is a little slow :(). Then I set the mode to "Multiple" and got a time interval of 35 seconds. So multithreading does increase the speed, but the result is not as good as I had imagined; I want a much higher speed.
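One possible reason the gain is so small: if thread.join() runs inside the same loop as thread.start(), as the listing above appears to do, each thread finishes before the next one starts, so the requests still run one at a time. The sketch below starts every thread first and only then joins them. It is written for Python 3 rather than the Python 2 of the question, and fake_post is a hypothetical stand-in that sleeps instead of contacting the server, so the timing difference can be reproduced offline.

```python
import threading
import time


def run_sequential(task, count):
    """Call task() count times, one after another."""
    for _ in range(count):
        task()


def run_threaded(task, count):
    """Start count threads first, then wait for all of them."""
    threads = [threading.Thread(target=task) for _ in range(count)]
    for thread in threads:
        thread.start()  # every request is now in flight
    for thread in threads:
        thread.join()   # wait only after all threads have started


def timed(runner, task, count):
    """Return the wall-clock seconds that runner(task, count) takes."""
    start = time.time()
    runner(task, count)
    return time.time() - start


def fake_post():
    # Hypothetical stand-in for Post.post(): sleeps instead of using the network.
    time.sleep(0.1)


if __name__ == "__main__":
    print("sequential:", timed(run_sequential, fake_post, 10))
    print("threaded:  ", timed(run_threaded, fake_post, 10))
```

With a real request in place of fake_post, the threaded time should approach the duration of the slowest single request rather than the sum of all ten, provided the server and the connection can handle ten requests at once.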
From debugging, I found that the program mostly blocks on the line open_url = urllib2.urlopen(req, self.data); this line takes a long time to send data to and receive data from the specified website. Maybe I could get more speed by adding time.sleep() and using multithreading inside the urlopen function, but I cannot do that because it is Python's own function.
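The blocking call itself probably does not need to be changed: while one thread waits inside urlopen, other threads are free to send their own requests, so the waits can overlap. A minimal sketch of that idea with a thread pool follows, assuming Python 3, where urllib.request.urlopen takes the place of urllib2.urlopen. The names post_once and post_many and the fetch parameter are hypothetical helpers introduced for this illustration, not part of the original code.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def post_once(url, data, timeout=30):
    """Send one POST request and return the response body as bytes."""
    # In Python 3 the POST body must be bytes, so the string is encoded first.
    body = data.encode("ascii")
    with urllib.request.urlopen(url, data=body, timeout=timeout) as response:
        return response.read()


def post_many(url, data, count, workers=10, fetch=post_once):
    """Run fetch(url, data) count times on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit everything up front so the blocking calls overlap.
        futures = [pool.submit(fetch, url, data) for _ in range(count)]
        # result() re-raises any exception raised in the worker thread.
        return [future.result() for future in futures]
```

A call such as post_many(website, data, 10) would then issue the ten POSTs concurrently. The fetch argument exists only so that a different request function, or a fake one for offline testing, can be swapped in.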
Setting aside any limits the server may place on how fast it accepts posts, what else can I do to get a higher speed? Is there any other code I could change? Thanks a lot!