Selecting specific table cells with Selenium WebDriver (Python)

I am trying to extract information from a link on a page that is structured like this:

...

<td align="left" bgcolor="#FFFFFF">$725,000</td>

<td align="left" bgcolor="#FFFFFF"> Available</td>

*<td align="left" bgcolor="#FFFFFF">
    <a href="/washington">


 Washington Street Studios
<br>1410 Washington Street SW<br>Albany, Oregon, 97321
</a>
</td>*

<td align="center" bgcolor="#FFFFFF">15</td>

<td align="center" bgcolor="#FFFFFF">8.49%</td>

<td align="center" bgcolor="#FFFFFF">$48,333</td>

</tr>

I tried targeting elements with the align="left" attribute and iterating over them, but that didn't work. If anyone could help me find an element like <a href="/washington"> with Selenium (there are multiple tags like this on the same page), I would appreciate it.

2 answers

I would use lxml instead if the goal is just to process HTML ...

It would help if you were more specific, but you can try this if you want to go through the links on a web page.

from lxml.html import parse

pdoc = parse(url_of_webpage)
doc = pdoc.getroot()
list_of_links = [i[2] for i in  doc.iterlinks()]

list_of_links will then look something like this: ['/en/images/logo_com.gif', 'http://www.brand.com/', '/en/images/logo.gif']

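doc.iterlinks() looks for every link in the document (in form, img and a tags, among others) and yields a tuple containing the Element, the name of the attribute that holds the link (for example action, src or href), the URL itself, and the position of the URL within that attribute, so the line

list_of_links = [i[2] for i in doc.iterlinks()]
simply takes the URL from each tuple and collects them into a list.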
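
To make the tuple layout concrete, here is a small sketch that runs iterlinks() on markup adapted from the question instead of a live page. The wrapping <table> tag and the names ROW_HTML and anchor_links are my own additions for illustration, and filtering on a tags is just one possible way to skip image and form links.

```python
from lxml.html import fromstring

# Row markup adapted from the question; the <table> wrapper is an addition
# so that the fragment parses as a regular table.
ROW_HTML = """
<table><tr>
<td align="left" bgcolor="#FFFFFF">$725,000</td>
<td align="left" bgcolor="#FFFFFF"> Available</td>
<td align="left" bgcolor="#FFFFFF">
    <a href="/washington">Washington Street Studios
    <br>1410 Washington Street SW<br>Albany, Oregon, 97321</a>
</td>
<td align="center" bgcolor="#FFFFFF">15</td>
</tr></table>
"""

doc = fromstring(ROW_HTML)

# Each item yielded by iterlinks() is (element, attribute, link, pos).
# Keep only the href values that belong to <a> elements.
anchor_links = [
    link
    for element, attribute, link, pos in doc.iterlinks()
    if element.tag == "a" and attribute == "href"
]

print(anchor_links)
```

Unpacking the tuple by name instead of using i[2] makes it easier to filter on the element or attribute later.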

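Note that the URLs are returned exactly as they appear in the page; they are not converted to absolute URLs. If you want absolute URLs, so that

'/en/images/logo_com.gif'

becomes

'http://somedomain.com/en/images/logo_com.gif'

then add one line to make the URLs absolute: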

from lxml.html import parse
pdoc = parse(url_of_webpage)
doc = pdoc.getroot()
doc.make_links_absolute()     #  add this line
list_of_links = [i[2] for i in  doc.iterlinks()]
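
If you only want to see what resolving a relative link against a base looks like, the standard library's urljoin can be tried in isolation. This is a rough stand-in for the per-link step, not lxml's own code path, and the base URL below is just the example domain used above, not a real site.

```python
from urllib.parse import urljoin

# Assumed base URL, taken from the example domain above.
BASE_URL = "http://somedomain.com/"

links = ["/en/images/logo_com.gif", "http://www.brand.com/", "/en/images/logo.gif"]

# Relative links are resolved against the base; absolute links pass through unchanged.
absolute_links = [urljoin(BASE_URL, link) for link in links]

for link in absolute_links:
    print(link)
```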

If you want to process the URLs one at a time, change the code to

for i in doc.iterlinks():
    url = i[2]
    # some processing here with url...

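Finally, if for some reason you need Selenium to fetch the page content, add the following at the beginning:

from io import StringIO  # on Python 2: from StringIO import StringIO
from lxml.html import parse
from selenium import webdriver

browser = webdriver.Firefox()
browser.get(url)  # url is the address of the page to load
doc = parse(StringIO(browser.page_source)).getroot()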

With Selenium alone, you can locate the table, iterate over its rows, and take the cell you need by index, for example:

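from selenium.webdriver.common.by import By

# "table#myid" is a placeholder selector; adjust it to match the real table
for row in driver.find_elements(By.CSS_SELECTOR, "table#myid tr"):
    cells = row.find_elements(By.TAG_NAME, "td")

    print(cells[2].text)  # put a correct index here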
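
The same row-and-cell indexing can be tried without a browser. The sketch below applies it with lxml to a small static table: the first row is adapted from the question, while the second row, the /example link and the table id myid are made up for illustration.

```python
from lxml.html import fromstring

# First row adapted from the question; the second row and the id "myid"
# are hypothetical and only there to show iteration over several rows.
TABLE_HTML = """
<table id="myid">
<tr>
<td align="left">$725,000</td>
<td align="left"> Available</td>
<td align="left"><a href="/washington">Washington Street Studios<br>1410 Washington Street SW<br>Albany, Oregon, 97321</a></td>
<td align="center">15</td>
</tr>
<tr>
<td align="left">$500,000</td>
<td align="left"> Sold</td>
<td align="left"><a href="/example">Example Property</a></td>
<td align="center">10</td>
</tr>
</table>
"""

doc = fromstring(TABLE_HTML)

hrefs = []
for row in doc.xpath('//table[@id="myid"]//tr'):
    cells = row.findall("td")
    if len(cells) < 3:
        continue  # skip header rows or rows with too few cells
    anchor = cells[2].find("a")  # the link sits in the third cell
    if anchor is not None:
        hrefs.append(anchor.get("href"))

print(hrefs)
```

Checking the number of cells before indexing avoids an IndexError on header rows, which is worth doing in the Selenium version as well.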

Source: https://habr.com/ru/post/1607866/

