Using Python mechanize to check exam dates/times on a website

I am trying to check the available dates/times for an exam using Python mechanize, and to send someone an email if a specific date/time shows up in the results (screenshot of the results page attached).

    import mechanize
    from BeautifulSoup import BeautifulSoup

    URL = "http://secure.dre.ca.gov/PublicASP/CurrentExams.asp"

    br = mechanize.Browser()
    response = br.open(URL)

    # there are some errors in the doctype, hence filtering the page content a bit
    response.set_data(response.get_data()[200:])
    br.set_response(response)

    br.select_form(name="entry_form")

    # select Oakland (index 2) in the 1st set of checkboxes
    br.find_control(type="checkbox", name="cb_examSites").items[2].selected = True

    # select salesperson (index 1) in the 2nd set of checkboxes
    br.find_control(type="checkbox", name="cb_examTypes").items[1].selected = True

    response = br.submit()
    print response.read()

I do get a response, but for some reason the data in the results table is missing.

Here are the buttons from the HTML of the start page:

    <input type="submit" value="Get Exam List" name="B1">
    <input type="button" value="Clear" name="B2" onclick="clear_entries()">
    <input type="hidden" name="action" value="GO">

Here is the part of the output (the response to the submit) where the actual data should be:

    <table summary="California Exams Scheduling" class="General_list" width="100%" cellspacing="0">
        <EVERYTHING IN BETWEEN IS MISSING HERE>
    </table>

All data in the table are missing. I provided a screenshot of the table element from the Chrome browser.

Can someone please tell me what could be wrong?
Can someone please tell me how to get the dates/times out of the response? I assume I have to use BeautifulSoup, with something along the lines of the snippet below. I am trying to find out whether a specific date that I have in mind (for example, March 8) appears in the response with a slot starting at 13:30 (underlined in the image).

    soup = BeautifulSoup(response.read())
    print soup.find(name="table")
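
For the date/time check itself, here is a minimal sketch, assuming the rows have actually been retrieved and that each slot is one `tr` whose cells hold the date and the time as plain text. It is written for Python 3 and swaps BeautifulSoup for the standard library's `html.parser` so that it runs with no extra packages. The names `TableRows` and `slot_available`, the sample markup, and the date/time formats are all hypothetical; the real page may lay its columns out differently.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the pieces and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def slot_available(html, date_text, time_text):
    """Return True if one row has a cell equal to date_text and one equal to time_text."""
    parser = TableRows()
    parser.feed(html)
    return any(date_text in row and time_text in row for row in parser.rows)


# Hypothetical markup; the real page's columns and date format may differ.
SAMPLE = """
<table class="General_list">
  <tr><th>Date</th><th>Time</th><th>Site</th></tr>
  <tr><td>03/08/2016</td><td>1:30 PM</td><td>Oakland</td></tr>
  <tr><td>03/09/2016</td><td>8:30 AM</td><td>Oakland</td></tr>
</table>
"""

print(slot_available(SAMPLE, "03/08/2016", "1:30 PM"))
print(slot_available(SAMPLE, "03/09/2016", "1:30 PM"))
```

Cells are compared for exact equality after whitespace is collapsed, so a search for "1:30 PM" does not match "11:30 PM" by accident. None of this helps while the `tr` elements are missing from the response, which is the open problem here.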

Update: it looks like my problem may be related to this question, and I am working through the options suggested there. I tried the following, based on one of the answers, but I don't see any tr elements in the data (although I can see them in the page source when I check it manually):

 soup.findAll('table')[0].findAll('tr')

Update: I changed this to use Selenium and will try to do more with it soon.

    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.keys import Keys
    from bs4 import BeautifulSoup
    import urllib3

    myURL = "http://secure.dre.ca.gov/PublicASP/CurrentExams.asp"

    browser = webdriver.Firefox()  # Get local session of firefox
    browser.get(myURL)  # Load page

    element = browser.find_element_by_id("Checkbox5")
    element.click()
    element = browser.find_element_by_id("Checkbox13")
    element.click()
    element = browser.find_element_by_name("B1")  # the "Get Exam List" button
    element.click()
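
For the email notification mentioned at the top of the question, here is a rough sketch using only the standard library (`smtplib` and `email.message`), written for Python 3. The helper names `build_alert` and `send_alert`, the addresses, and the SMTP host and port are placeholders I made up for illustration; a real mail server will usually also need TLS and a login.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender, recipient, exam_date, exam_time):
    """Build the notification message for an exam slot that opened up."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Exam slot available: {} at {}".format(exam_date, exam_time)
    msg.set_content(
        "A slot on {} at {} is now listed on the exam schedule page.".format(
            exam_date, exam_time
        )
    )
    return msg


def send_alert(msg, host="localhost", port=25):
    """Send msg through an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


# Placeholder addresses; nothing is sent until send_alert() is called.
alert = build_alert("me@example.com", "friend@example.com", "March 8", "13:30")
print(alert["Subject"])
```

Building the message is kept separate from sending it so the text can be checked without a mail server; `send_alert(alert)` would only be called once the date/time check succeeds.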

python web-scraping mechanize

Naresh MG Mar 6 '16 at 13:12

No one has answered this question yet.
