Get the body of an HTML table in Python using Selenium

Question

I am scraping the following page: https://proximity.niceic.com/mainform.aspx

First, enter "%%" in the country text box to display all contractors in the area. Once the results appear, inspecting the HTML in devtools shows the following:

I want to extract all the information from the selected table. The problem is that when I scrape the page with Selenium, I can find the table, but I cannot access its body or its child elements.

Here is my Python code:

main_table = driver.find_elements_by_tag_name('table')
outer_table = main_table[3].find_element_by_tag_name('table')
print outer_table.get_attribute('innerHTML')

The above code outputs the following:

<table cellspacing="0" rules="all" bordercolor="Silver" border="1" id="dvContractorDetail" style="background-color:White;border-color:Silver;border-width:1px;border-style:Solid;height:200px;width:400px;border-collapse:collapse;">

</table>

As you can see, I only get the table tag itself and none of its contents, such as the tbody or the tr tags inside the tbody.

What can I do?

python html selenium web scraping

Ian spitz Jan 21 '18 at 6:30

Keyur Potdar · Accepted Answer · 2018-01-21T06:53:34+0000

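After you enter "%%" and search, the table is loaded by JS, so it takes some time to appear. You need to use Waits for that.

You can use an Explicit Wait. Add the imports below, and change the table lookup that follows them: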

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException

main_table = driver.find_elements_by_tag_name('table')
outer_table = main_table[3].find_element_by_tag_name('table')
print outer_table.get_attribute('innerHTML')

to

try:
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'gvContractors')))
except TimeoutException:
    pass  # Handle the exception here
table = driver.find_element_by_id('gvContractors').get_attribute('innerHTML')
print(table)
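
To make the idea behind an explicit wait concrete, here is a minimal sketch of the polling loop such a wait performs. It is not Selenium's implementation: the names wait_until and WaitTimeout are hypothetical, and the table_present function only stands in for a page whose table shows up after a few checks.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns something truthy."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout('condition not met within %.1f s' % timeout)
        time.sleep(poll)


# Stand-in for a page whose table appears only on the third check
attempts = {'count': 0}


def table_present():
    attempts['count'] += 1
    return '<table id="gvContractors">' if attempts['count'] >= 3 else None


found = wait_until(table_present, timeout=2.0, poll=0.01)
print(found)
```

The point of the pattern is that the script proceeds as soon as the condition holds, instead of sleeping for a fixed, guessed amount of time.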

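This waits until the table is present. To check that it is the table you need, you can test whether it contains some text you expect: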

print('Company/Address' in table)

True
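
Once the innerHTML string is available, the rows can be extracted without further Selenium calls. The following is only a sketch using the standard library's html.parser; the TableRows class, the parse_rows helper, and the sample markup are made up for illustration, and the real page's markup may differ (for example, it may contain nested tables, which this sketch does not handle).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []
        elif tag == 'br' and self._cell is not None:
            self._cell.append(' ')  # treat a line break as a space

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None:
            # Join the fragments and collapse runs of whitespace
            self._row.append(' '.join(''.join(self._cell).split()))
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(inner_html):
    """Return the table contents as a list of rows of cell text."""
    parser = TableRows()
    parser.feed(inner_html)
    parser.close()
    return parser.rows


# Hypothetical sample standing in for the real innerHTML
sample = ('<tbody><tr><th>Company/Address</th><th>Phone</th></tr>'
          '<tr><td>Example Ltd<br>1 High St</td><td>01234 567890</td></tr></tbody>')
rows = parse_rows(sample)
for row in rows:
    print(row)
```

In the answer's code, the string to pass to parse_rows would be the table variable.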

Note:
I changed _by_tag_name to _by_id, because the table you want has id="gvContractors".
