When would Python get stuck in the time.sleep function?

I am currently using Selenium in Python to do something that requires an infinite loop to watch for what I want. Here is a code snippet:

    records = set()
    fileHandle = open('d:/seizeFloorRec.txt', 'a')
    fileHandle.write('\ncur time: '+time.strftime('%Y-%m-%d %H:%M:%S',time.localtime(time.time()))+'\n')
    driver = webdriver.Chrome()
    while(True):
        try:
            print "time: ", time.strftime('%Y-%m-%d %H:%M:%S',time.localtime(time.time()))
            subUrls = aMethod(driver)  # a irrelevant function which returns a list
            time.sleep(2)
            for i in range(0, len(subUrls)):
                print "cur_idx=["+str(i)+"], max_cnt=["+str(len(subUrls))+"]"
                try:
                    rtn = monitorFloorKeyword(subUrls[i])
                    time.sleep(1.5)
                    if(rtn[0] == True):
                        if(rtn[1] not in records):
                            print "hit!"
                            records.add(rtn[1])
                            fileHandle.write(rtn[1]+'\t'+rtn[2].encode('utf-8')+'\n')
                            fileHandle.flush()
                        else:
                            print "hit but not write."
                except Exception as e:
                    print "exception when get page: ", subUrls[i]
                    print e.__doc__
                    continue
            print "sleep 5*60 sec..."
            time.sleep(300)  # PROBLEM LIES HERE!!!
            print "sleep completes."
        except Exception as e:
            print 'exception!'
            print e.__doc__
            time.sleep(20)

It gets stuck unpredictably in time.sleep(300): the output shows "sleep 5*60 sec..." but never "sleep completes."

Can someone suggest a probable cause of this behavior? Many thanks!

UPDATED

I found a similar problem here, but I don't really understand what the author is trying to say. I hope it is relevant to my problem.

LATEST TEST

Using chromedriver, I added driver.get("about:blank") right before each return line in each function, as shown below, to force the current page to stop loading asynchronously. This forced stop causes ERROR ipc_channel_win.cc (370)] channel error: 109, which does NOT affect the operation of my program. Could this error affect my time.sleep call?

    def retrieveCurHomePageAllSubjectUrls(driver):
        uri = "http://www.example.com/main.php?page=1"
        driver.get(uri)
        element = driver.find_elements_by_class_name('subject')
        subUrls = []
        for i in range(0, len(element)):
            subUrls.append(element[i].get_attribute('href').encode('utf-8'))
        driver.get("about:blank")  #This is what I add
        return subUrls

    def monitorFloorKeyword(subUrl):
        driver.get(subUrl)
        title = driver.find_element_by_id('subject_tpc').text
        content = driver.find_element_by_id('read_tpc').text
        if(title.find(u'keyword') >= 0 or content.find(u'keyword') >= 0):
            driver.get("about:blank")  #This is what I add
            return (True,subUrl,title,content)
        driver.get("about:blank")  #This is what I add
        return (False,)

SEEMS TO BE SOLVED

As I said above, there is a channel error right after I call driver.get("about:blank"); however, the good news is that this time everything works fine. If someone knows something about Selenium that is relevant to this post, please let me know.

2 answers

I took the time to simplify and clean up your code.

    previously_seen_sub_urls = set()
    with open('d:/seizeFloorRec.txt', 'a') as outfile:
        outfile.write(
            '\ncur time: '
            + time.strftime('%Y-%m-%d %H:%M:%S',time.localtime(time.time()))
            + '\n')
        driver = webdriver.Chrome()
        while True:
            try:
                print "time: ", time.strftime('%Y-%m-%d %H:%M:%S',
                                              time.localtime(time.time()))
                sub_urls = aMethod(driver)  # an irrelevant function which returns a list
                time.sleep(2)  # Why sleep here?
                print "max_cnt=[%d]" % len(sub_urls)
                for i, sub_url in enumerate(sub_urls):
                    print "cur_idx=[%s]" % i
                    try:
                        rtn = monitorFloorKeyword(sub_urls[i])
                        # rtn is either a length 1 tuple, first value False
                        # or a length 4 tuple, (True, sub_url, title, content)
                        time.sleep(1.5)
                        if rtn[0]:
                            if rtn[1] not in previously_seen_sub_urls:
                                print "hit!"
                                previously_seen_sub_urls.add(rtn[1])
                                outfile.write(rtn[1]+'\t'+rtn[2].encode('utf-8')+'\n')
                                outfile.flush()
                            else:
                                print "hit but not write."
                    except Exception as e:
                        # Should catch specific subclass of Exception
                        print "exception when get page: ", sub_urls[i]
                        print e
                        # Continues
                print "sleep 5*60 sec..."
                time.sleep(300)  # PROBLEM POSSIBLY DOESN'T LIE HERE!!!
                print "sleep completes."
            except Exception as e:
                # Should catch specific subclass of Exception
                print 'exception!'
                print e
                time.sleep(20)
                # Continues

I have not definitively found the problem, but I am suspicious of your exception handlers.

With exception handlers, it is better to avoid a blanket "except Exception" except in very limited situations (for example, in the outermost loop of your code). Catching it shows that you do not know which exception (or at least which subclass of Exception) you expect to receive, so it is not clear whether your handling is correct.
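
As a rough illustration of the point above, here is a minimal sketch of a loop that skips only the failures it expects and lets everything else propagate. It is written for Python 3 and does not touch a browser: `monitor_all` and `fake_monitor` are hypothetical names, and `IOError`/`ValueError` merely stand in for whatever you actually expect. With Selenium, the expected classes would more likely come from `selenium.common.exceptions` (for example `TimeoutException` or `NoSuchElementException`).

```python
def monitor_all(sub_urls, monitor):
    """Call monitor(url) for each URL, skipping only expected failures."""
    hits = []
    skipped = []
    for url in sub_urls:
        try:
            result = monitor(url)
        except (IOError, ValueError) as e:  # assumed "expected" failures
            skipped.append((url, repr(e)))
            continue
        if result[0]:
            hits.append(result[1])
    return hits, skipped


def fake_monitor(url):
    """Hypothetical stand-in for monitorFloorKeyword; no browser involved."""
    if url.endswith("/timeout"):
        raise IOError("page load timed out")
    if url.endswith("/bug"):
        raise KeyError("unexpected programming error")
    return (True, url, u"title", u"content") if "hit" in url else (False,)


hits, skipped = monitor_all(
    [
        "http://www.example.com/hit",
        "http://www.example.com/timeout",
        "http://www.example.com/miss",
    ],
    fake_monitor,
)
print(hits)     # URLs whose result tuple started with True
print(skipped)  # (url, repr(exception)) pairs for expected failures
```

A `KeyError` raised by `fake_monitor` is not swallowed here, so a genuine bug surfaces immediately instead of being reported as just another failed page.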

The second problem is that you are not printing the exception itself but its docstring (e.__doc__). For Python's built-in exceptions these docstrings may be useful, but custom exceptions are not guaranteed to have one, so you may find that some exceptions print nothing informative.

This does not explain your problem, but I would be interested to know whether printing the exception itself, rather than e.__doc__, helps. (See also the traceback module to learn more about where the exception came from.)
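
To make the difference concrete, here is a small Python 3 sketch comparing the three ways of reporting an exception. `PageLoadError` is a hypothetical custom exception, deliberately defined without a docstring; the URL is the placeholder from the question.

```python
import traceback


class PageLoadError(Exception):
    pass  # hypothetical custom exception with no docstring


def describe(exc):
    """Return what each reporting style would show for this exception."""
    return {
        "doc": exc.__doc__,   # class docstring, says nothing about this failure
        "message": str(exc),  # the message passed when the exception was raised
        "repr": repr(exc),    # class name plus arguments
    }


def run_and_capture():
    try:
        raise PageLoadError("timed out loading http://www.example.com/main.php?page=1")
    except Exception as e:
        # format_exc() returns the full traceback of the exception being handled
        return describe(e), traceback.format_exc()


report, trace = run_and_capture()
print(report["doc"])      # None: a class without a docstring has no __doc__ text
print(report["message"])
print(trace)
```

For a built-in exception such as `ValueError`, `__doc__` would be a generic description of the class, which still tells you nothing about what went wrong in this particular call.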


Get rid of time.sleep and try using implicitly_wait:

    from selenium import webdriver

    ff = webdriver.Firefox()
    ff.implicitly_wait(30)  # seconds


Or try using WebDriverWait:

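    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    ff = webdriver.Firefox()
    ff.get("http://somedomain/url_that_delays_loading")
    try:
        element = WebDriverWait(ff, 10).until(EC.presence_of_element_located((By.ID, "myDynamicElement")))
    finally:
        ff.quit()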
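
For readers wondering why an explicit wait is preferable to a fixed sleep, the idea can be sketched without a browser: poll a condition until it becomes truthy or a deadline passes. This is only an illustration in Python 3, not Selenium's implementation; `wait_until`, `WaitTimeout`, and `element_present` are hypothetical names.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value  # stop as soon as the condition holds
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Usage: a fake "element" that only appears on the third poll.
calls = {"n": 0}


def element_present():
    calls["n"] += 1
    return "element" if calls["n"] >= 3 else None


found = wait_until(element_present, timeout=2.0, poll_interval=0.01)
print(found)
```

Unlike `time.sleep(2)`, the wait returns as soon as the condition holds and fails loudly when it never does, so a page that never loads shows up as an error rather than as a silent stall.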


Also check out the section on waits in the Selenium documentation.


Source: https://habr.com/ru/post/1302494/

