Using Python mechanize to check exam dates/times on a website

I am trying to check the available dates/times for an exam using Python mechanize, and to send someone an email if a specific date/time becomes available in the results (screenshot of the results page attached).
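
For the notification step described above, a minimal sketch using only the Python 3 standard library might look like the following. The helper names `build_alert` and `send_alert`, the addresses, and the SMTP host and port are all assumptions made for illustration; nothing here comes from the original script.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender, recipient, exam_date, begin_time):
    """Build a plain-text notification for one available exam slot."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Exam slot available: {} at {}".format(exam_date, begin_time)
    msg.set_content(
        "A slot opened on {} starting at {}.".format(exam_date, begin_time)
    )
    return msg


def send_alert(msg, host="localhost", port=25):
    """Send the message through an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


# Placeholder addresses; call send_alert(alert) once an SMTP server is available.
alert = build_alert("me@example.com", "friend@example.com", "March 8", "13:30")
print(alert["Subject"])
```

Keeping message construction separate from sending means the alert text can be checked without a mail server.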

    import mechanize
    from BeautifulSoup import BeautifulSoup

    URL = "http://secure.dre.ca.gov/PublicASP/CurrentExams.asp"
    br = mechanize.Browser()
    response = br.open(URL)

    # there are some errors in doctype and hence filtering the page content a bit
    response.set_data(response.get_data()[200:])
    br.set_response(response)
    br.select_form(name="entry_form")

    # select Oakland for the 1st set of checkboxes
    for i in range(0, len(br.find_control(type="checkbox", name="cb_examSites").items)):
        if i == 2:
            br.find_control(type="checkbox", name="cb_examSites").items[i].selected = True

    # select salesperson for the 2nd set of checkboxes
    for i in range(0, len(br.find_control(type="checkbox", name="cb_examTypes").items)):
        if i == 1:
            br.find_control(type="checkbox", name="cb_examTypes").items[i].selected = True

    reponse = br.submit()
    print reponse.read()

I do get a response, but for some reason the data in the table is missing.

Here are the buttons from the HTML of the start page:

    <input type="submit" value="Get Exam List" name="B1">
    <input type="button" value="Clear" name="B2" onclick="clear_entries()">
    <input type="hidden" name="action" value="GO">

Here is the part of the output (the response to the submit) where the actual data should be:

    <table summary="California Exams Scheduling" class="General_list" width="100%" cellspacing="0">
    <EVERYTHING IN BETWEEN IS MISSING HERE>
    </table>

All the data in the table is missing. I have included a screenshot of the table element from the Chrome browser.

  • Can someone please tell me what could be wrong?
  • Can someone please tell me how to get the date/time out of the response? I assume I have to use BeautifulSoup, so it should be something along the lines of the code below. I am trying to find out whether a specific date I have in mind (for example, March 8) shows up in the response with a start time of 13:30 (underlined in the screenshot).

    soup = BeautifulSoup(response.read())
    print soup.find(name="table")
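
One way to check for a specific slot is sketched below, under some loud assumptions. The sample HTML is hypothetical, since the real column layout of the `General_list` table is not shown here. The code uses the `bs4` package (BeautifulSoup 4) on Python 3 rather than the BeautifulSoup 3 import used above. The helper names `table_rows` and `has_slot` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample; the real table's columns may differ.
SAMPLE_HTML = """
<table summary="California Exams Scheduling" class="General_list">
  <tr><th>Exam Date</th><th>Begin Time</th><th>Site</th></tr>
  <tr><td>March 8</td><td>13:30</td><td>Oakland</td></tr>
  <tr><td>March 9</td><td>08:30</td><td>Oakland</td></tr>
</table>
"""


def table_rows(html, table_class="General_list"):
    """Return each data row of the first matching table as a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=table_class)
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows contain only <th> cells and are skipped
            rows.append(cells)
    return rows


def has_slot(rows, date_text, time_text):
    """True if some row contains both the date and the time as cell values."""
    return any(date_text in row and time_text in row for row in rows)


rows = table_rows(SAMPLE_HTML)
print(has_slot(rows, "March 8", "13:30"))
```

If `table_rows` returns an empty list for the real response, that would suggest the rows are missing from the fetched HTML itself (or that the parser is dropping them), rather than a problem with the search.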


Update: it looks like my problem may be related to this question, and I am trying out the options suggested there. Following one of the answers, I tried the code below, but I don't see any tr elements in the result (although I can see them in the page source when I check it manually).

 soup.findAll('table')[0].findAll('tr') 



Update: I changed this to use Selenium and will try to take it further soon.

    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.keys import Keys
    from bs4 import BeautifulSoup
    import urllib3

    myURL = "http://secure.dre.ca.gov/PublicASP/CurrentExams.asp"
    browser = webdriver.Firefox()  # Get local session of firefox
    browser.get(myURL)  # Load page

    # Tick the two checkboxes, then press the "Get Exam List" button (name="B1").
    # find_element_by_* is the Selenium 3 API; Selenium 4 uses find_element(By.ID, ...).
    element = browser.find_element_by_id("Checkbox5")
    element.click()
    element = browser.find_element_by_id("Checkbox13")
    element.click()
    element = browser.find_element_by_name("B1")
    element.click()

Source: https://habr.com/ru/post/1244521/

