Search for the text of an element or source of the current page

Question

I am doing the following in Selenium 2 / WebDriver using Python and Firefox.

I open some web pages that I need to check for a specific line, which, if present, means that it is a good page to parse.

The phrase I'm looking for is an h2 element like this:

<h2 class="page_title">Worlds Of Fantasy : Medieval House</h2>

If this h2 is missing, I know that I do not need to work on the page; I just go back and get the next line.

In the code, I have a try/except/else block that searches for the phrase. If it finds the phrase, execution continues with the next part of the function. If not, it should fall into the except branch, which tells it to return.

In this test, 2 pages are called - the first has the phrase, the second does not.

The first page opens and passes the test.

The second page opens, and I get an exception report, but control never returns to the calling code in main; the script just stops.

Why doesn't the exception handler take the expected path and return?

Here is the code:

    #!/usr/bin/env python
    from selenium import webdriver
    from selenium.webdriver import Firefox as Browser
    from selenium.webdriver.support.ui import WebDriverWait

    browser = webdriver.Firefox()

    def call_productpage(productlink):
        global browser
        print 'in call_productpage(' + productlink + ')'
        browser.get(productlink)
        browser.implicitly_wait(8)

        #start block with <div class="page_content">
        product_block = browser.find_element_by_xpath("//div[@class='page_content']");

        # <h2 class="page_title">Worlds Of Fantasy : Medieval House</h2>
        try:
            product_name = product_block.find_element_by_xpath("//h2[@class='page_title']");
        except Exception, err:
            #print "Failed!\nError (%s): %s" % (err.__class__.__name__, err)
            print 'return to main()'
            return 0
        else:
            nameStr = str(product_name.text)
            print 'product_name:' + nameStr
        finally:
            print "test over!"
        return 1

    test1 = call_productpage('https://www.daz3d.com/i/3d-models/-/desk-clocks?spmeta=ov&item=12657')
    if test1:
        print '\ntest 1 went OK\n'
    else:
        print '\ntest 1 did NOT go OK\n'

    test2 = call_productpage('https://www.daz3d.com/i/3d-models/-/dierdre-character-pack?spmeta=ov&item=397')
    if test2:
        print '\ntest 2 went OK\n'
    else:
        print '\ntest 2 did NOT go OK\n'

And here is a screenshot of the console showing the exception that I get:

[Screenshot: console output showing the exception]

Another option I was thinking about was to get the page source from WebDriver and use find to see whether the tag is there, but apparently there is no easy way to do this in WebDriver!
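For reference, the Python bindings do expose the rendered HTML as `browser.page_source`, so a source-based check is possible. The following is only a minimal sketch of that idea, written in Python 3 and using the standard library's `html.parser`; the names `PageTitleFinder` and `find_page_title` are hypothetical and not part of Selenium.

```python
from html.parser import HTMLParser


class PageTitleFinder(HTMLParser):
    """Collects the text of the first <h2 class="page_title"> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h2" and self.title is None and "page_title" in classes:
            self.in_title = True
            self.title = ""

    def handle_data(self, data):
        if self.in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False


def find_page_title(html):
    """Return the title text, or None if the h2 is absent (hypothetical helper)."""
    finder = PageTitleFinder()
    finder.feed(html)
    finder.close()
    return finder.title
```

With a live driver this would be called as `find_page_title(browser.page_source)`; a return value of `None` would mean the page can be skipped.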

python firefox webdriver

Stephen Jan 28 '12 at 6:09

2 answers

This is the solution! Thanks!

Here is the final code, slightly refined to make the result more readable:

    #!/usr/bin/env python
    from selenium import webdriver
    from selenium.webdriver import Firefox as Browser
    from selenium.webdriver.support.ui import WebDriverWait

    browser = webdriver.Firefox()

    def call_productpage(productlink):
        global browser
        print 'in call_productpage(' + productlink + ')'
        browser.get(productlink)
        browser.implicitly_wait(1)

        product_block = ''
        try:
            product_block = browser.find_element_by_xpath("//div[@class='page_content']");
        except:
            print 'this is NOT a good page - drop it'
            return 0
        else:
            textStr = str(product_block.text)
            #print 'page_content:' + str(textStr)
            print '\nthis is a good page - proceed\n'

        print 'made it past the exception!\n'
        product_name = product_block.find_element_by_xpath("//h2[@class='page_title']");
        nameStr = str(product_name.text)
        print '>>> product_name:' + nameStr + '\n'
        print "test over!"
        return 1

    test1 = call_productpage('https://www.daz3d.com/i/3d-models/-/desk-clocks?spmeta=ov&item=12657')
    print '\nTest #1:\n============\n'
    if test1:
        print '\ntest 1 returned true\n'
    else:
        print '\ntest 1 returned false\n'

    print '\nTest #2:\n============\n'
    test2 = call_productpage('https://www.daz3d.com/i/3d-models/-/dierdre-character-pack?spmeta=ov&item=397')
    if test2:
        print '\ntest 2 returned true\n'
    else:
        print '\ntest 2 returned false\n'
    print '\n============\n'

And it works the way I need.

Thanks again.

Stephen Jan 28 '12 at 20:10

Misha akovantsev · Accepted Answer · 2012-01-28T06:43:53+0000

If you don't know which exception to expect, use a bare except together with traceback:

    import traceback

    try:
        int('string')
    except:
        traceback.print_exc()
        print "returning 0"

    # will print out an exception and execute everything in the 'except' clause:
    # Traceback (most recent call last):
    #   File "<stdin>", line 2, in <module>
    # ValueError: invalid literal for int() with base 10: 'string'
    # returning 0

But from the stack trace, you already know the exact name of the exception, so use it instead:

    # NoSuchElementException lives in selenium.common.exceptions
    from selenium.common.exceptions import NoSuchElementException

    try:
        #...
    except NoSuchElementException, err:
        #...
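To make the benefit of naming the exception concrete, here is a small sketch that runs without a browser. It is written in Python 3 syntax (`except ... as err`); the `NoSuchElementException` class is a local stand-in for the Selenium exception of the same name, and `find_title` and `check` are hypothetical helpers in which a dict plays the role of a loaded page.

```python
class NoSuchElementException(Exception):
    """Local stand-in for the Selenium exception of the same name."""


def find_title(page):
    # Hypothetical lookup: a dict stands in for a loaded page.
    if "title" not in page:
        raise NoSuchElementException("no h2 with class 'page_title'")
    return page["title"]


def check(page):
    try:
        title = find_title(page)
    except NoSuchElementException as err:
        # Only the expected failure is handled here;
        # any other error still propagates to the caller.
        print("not a product page:", err)
        return 0
    else:
        print("product_name:", title)
        return 1
```

A missing title makes `check` return 0, while an unrelated bug (for example passing `None` instead of a page) still surfaces as its own exception instead of being silently reported as "not a product page".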

UPDATE:

The exception is raised before the try ... except block, here:

 product_block = browser.find_element_by_xpath("//div[@class='page_content']");

not here:

 product_name = product_block.find_element_by_xpath("//h2[@class='page_title']");
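One way to act on this is to put both lookups inside the same try block, so that a miss on either one returns 0. The sketch below illustrates only the control flow and runs without a browser: `FakeElement` is a hypothetical stand-in that mimics the `find_element_by_xpath` method name used in the question, `NoSuchElementException` is a local stand-in for the Selenium exception, and the code uses Python 3 syntax.

```python
class NoSuchElementException(Exception):
    """Local stand-in for the Selenium exception of the same name."""


class FakeElement:
    """Hypothetical element: maps XPath strings to child elements."""

    def __init__(self, text="", children=None):
        self.text = text
        self.children = children or {}

    def find_element_by_xpath(self, xpath):
        try:
            return self.children[xpath]
        except KeyError:
            raise NoSuchElementException(xpath) from None


def call_productpage(page):
    try:
        # Both lookups sit inside the try, so a miss on either is caught.
        block = page.find_element_by_xpath("//div[@class='page_content']")
        name = block.find_element_by_xpath("//h2[@class='page_title']")
    except NoSuchElementException:
        print("this is NOT a good page - drop it")
        return 0
    print("product_name:" + name.text)
    return 1
```

With a real driver, the `page` argument would be the `browser` object after `browser.get(productlink)`.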
