1. Browsers often change HTML
Browsers often change the HTML served to them in order to make it "valid." For example, if you serve a browser this invalid HTML:
<table> <p>bad paragraph</p> <tr><td>Note that cells and rows can be unclosed (and valid) in HTML </table>
The browser tries to be helpful, repairs the markup into valid HTML, and may convert it to:
<p>bad paragraph</p> <table> <tbody> <tr> <td>Note that cells and rows can be unclosed (and valid) in HTML</td> </tr> </tbody> </table>
This happens because a <p> cannot sit directly inside a <table>, and the browser adds the implied <tbody>. Which changes get applied to the source can vary widely between browsers. Some put invalid elements before the table, some after, some inside cells, etc.
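To see what the raw source really contains before any browser repair, here is a minimal sketch using Python's standard `html.parser`, which reports tags as written and makes no attempt to fix them. The names `RAW_HTML` and `TagCollector` are made up for this illustration.

```python
from html.parser import HTMLParser

# The invalid markup from above, exactly as the server would send it.
RAW_HTML = (
    "<table> <p>bad paragraph</p> <tr><td>Note that cells and rows "
    "can be unclosed (and valid) in HTML </table>"
)


class TagCollector(HTMLParser):
    """Records start tags in the order they appear in the source."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed(RAW_HTML)
collector.close()

print(collector.tags)  # ['table', 'p', 'tr', 'td']
```

There is no `tbody` in that list, and the `<p>` is still inside the `<table>`. Any XPath that relies on the browser's repaired structure has nothing to match in this source.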
2. XPaths are not fixed; they are flexible in how they point to elements.
Using this “fixed” HTML:
<p>bad paragraph</p> <table> <tbody> <tr> <td>Note that cells and rows can be unclosed (and valid) in HTML</td> </tr> </tbody> </table>
If we try to target the text of the <td> cell, all of the following will give you approximately the right information:
//td
//tr/td
//tbody/tr/td
/table/tbody/tr/td
/table//*/text()
And the list goes on ...
However, a browser will generally give you the most precise (and least flexible) XPath, one that lists every element from the DOM. In this case:
/table[1]/tbody[1]/tr[1]/td[1]/text()
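As a rough check that loose and strict paths land on the same cell, here is a sketch using Python's standard `xml.etree.ElementTree`. It supports only a subset of XPath (relative paths, no `text()` step), so the expressions are adapted slightly and the text is read from `.text`. The `<root>` wrapper is an assumption added only so the snippet parses as one well-formed document.

```python
import xml.etree.ElementTree as ET

# The browser-"fixed" markup, wrapped in a hypothetical <root> element.
FIXED_HTML = (
    "<root><p>bad paragraph</p><table><tbody><tr>"
    "<td>Note that cells and rows can be unclosed (and valid) in HTML</td>"
    "</tr></tbody></table></root>"
)
root = ET.fromstring(FIXED_HTML)

# From loosest to strictest; positions in XPath start at 1.
paths = [
    ".//td",
    ".//tr/td",
    ".//tbody/tr/td",
    "table/tbody/tr/td",
    "table[1]/tbody[1]/tr[1]/td[1]",
]

# Map each path to the text of every cell it matches.
results = {path: [td.text for td in root.findall(path)] for path in paths}

for path, texts in results.items():
    print(f"{path!r} -> {texts}")
```

Every path returns the same single cell here. The strict one is simply the first to break once the structure differs, for example when the `<tbody>` is missing from the raw source.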
3. Conclusion: browser-provided XPaths are usually useless
This is why XPaths generated by developer tools often turn out to be wrong when you apply them to the raw HTML.
The solution is to always refer to the raw HTML and use a flexible but precise XPath.
Examine the actual HTML that contains the price:
<table border="0" cellspacing="0" cellpadding="0"> <tr> <td> <font class="pricecolor colors_productprice"> <div class="product_productprice"> <b> <font class="text colors_text">Price:</font> <span itemprop="price">$149.95</span> </b> </div> </font> <br/> <input type="image" src="/v/vspfiles/templates/MAKO/images/buttons/btn_updateprice.gif" name="btnupdateprice" alt="Update Price" border="0"/> </td> </tr> </table>
If you need the price, there is really only one place it can be!
//span[@itemprop="price"]/text()
And this will return:
$149.95
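To try this outside a browser, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It works only because this particular fragment happens to be well-formed XML. Real pages usually are not, so an HTML-aware library such as lxml would be the more typical choice. The `rigid_path` below is a hypothetical example of the kind of path a developer tool might produce. It is not taken from any real tool output.

```python
import xml.etree.ElementTree as ET

# The raw fragment that contains the price, as served (no <tbody>).
RAW_HTML = """
<table border="0" cellspacing="0" cellpadding="0"> <tr> <td>
<font class="pricecolor colors_productprice">
<div class="product_productprice"> <b>
<font class="text colors_text">Price:</font>
<span itemprop="price">$149.95</span>
</b> </div> </font> <br/>
<input type="image" src="/v/vspfiles/templates/MAKO/images/buttons/btn_updateprice.gif"
       name="btnupdateprice" alt="Update Price" border="0"/>
</td> </tr> </table>
"""
table = ET.fromstring(RAW_HTML)

# Hypothetical browser-style path: it assumes a <tbody> that the raw
# source does not contain, so it matches nothing.
rigid_path = "tbody/tr/td/font/div/b/span"
rigid_matches = table.findall(rigid_path)

# Flexible but precise: find the span by its itemprop attribute.
price = table.find(".//span[@itemprop='price']").text

print(rigid_matches)  # []
print(price)          # $149.95
```

The attribute-based lookup keeps working however the surrounding table, font, and div nesting is arranged, which is the point of preferring it over a copied absolute path.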