XPath to get the second URL with the matching text in the href tag

Question

The HTML page has two sets of pagination links: one at the top of the page and another at the bottom.

Using HtmlUnit, I am currently getting the HtmlAnchor on the page using getByAnchorText("1");

There is a problem with some of the top links, so I want to reference the bottom links using XPath instead.

 nextPageAnchor = (HtmlAnchor) page.getByXPath("");

How can I reference the second link on the page using XPath?

I need to reference the link by its anchor text, so the link looks like:

 <a href="....">33</a>

The href contains random text and is a JavaScript function call, so I have no idea what it will be.

Is this possible with XPath?

+4

java xpath htmlunit

Blankman Apr 12 '10 at 0:31

2 answers

It is pretty simple:

  (//a)[2]

The //a selects all the anchors on the page, and [2] picks the second one (XPath positions are one-based, not zero-based, so 2 really is the second element, not the third as you would expect with an array, for example).

If you want to get the link whose text is 33, you can use:

  //a[./text() = "33"]

See http://www.w3.org/TR/xpath/ for the full definition of XPath.

EDIT

To respond to Alexander's comment, you can use

  (//a[./text() = "33"])[2]

This first selects all the <a> tags with the text 33, and then selects the second one of those.
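As a rough illustration of the expression above, here is a minimal sketch that uses the JDK's own javax.xml.xpath instead of HtmlUnit so it runs without extra libraries. The sample markup, the class name SecondLinkByText, and the helper hrefOfSecondLink are all made up for this example; the pager is assumed to appear once at the top and once at the bottom, with different javascript hrefs.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class SecondLinkByText {
    // Hypothetical page: the same pager appears at the top and at the bottom.
    static final String SAMPLE =
        "<html><body>"
        + "<div><a href=\"javascript:top(32)\">32</a>"
        + "<a href=\"javascript:top(33)\">33</a></div>"
        + "<p>content</p>"
        + "<div><a href=\"javascript:bottom(32)\">32</a>"
        + "<a href=\"javascript:bottom(33)\">33</a></div>"
        + "</body></html>";

    // Returns the href of the second <a> whose text equals the given text,
    // or null if there is no second match.
    // Assumption: the text contains no double quotes, since it is pasted
    // straight into the expression.
    static String hrefOfSecondLink(String xml, String text) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            String expr = "(//a[./text() = \"" + text + "\"])[2]";
            Element a = (Element) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODE);
            return a == null ? null : a.getAttribute("href");
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints the href of the bottom "33" link.
        System.out.println(hrefOfSecondLink(SAMPLE, "33"));
    }
}
```

With HtmlUnit itself, I believe the same expression string can be passed to page.getFirstByXPath(...), which returns a single node, whereas getByXPath(...) returns a list and so cannot be cast directly to HtmlAnchor.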

EDIT 2

NOTE: The location path //para[1] does not mean the same as the location path /descendant::para[1]. The latter selects the first descendant para element; the former selects all descendant para elements that are the first para children of their parents.

Markus is quite right. The quote above is from the XPath definition linked earlier.

+4

Jonathan Fingland Apr 12 '10 at 0:38

markusk · Accepted Answer · 2010-04-12T05:55:51+0000

To select the second a element anywhere in the document:

 (//a)[2]

To select the second a element with specific text in the href attribute:

 (//a[@href='...'])[2]

Note that the parentheses are required and that the expression //a[2] will not do what you intend: it will select every a element that is the second a child of its parent. If your input is

 <p>Link <a href="one.html">One</a></p>
 <p>Link <a href="two.html">Two</a> and <a href="three.html">Three</a>.</p>
 <p>Link <a href="four.html">Four</a> and <a href="five.html">Five</a>.</p>

(//a)[2] will return the second link (two.html), and //a[2] will return the third and fifth links (three.html and five.html), since each of them is the second a child of its parent.
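The difference can be checked with a small sketch using the JDK's javax.xml.xpath (not HtmlUnit, so nothing extra is needed to run it). The class name PositionPredicateDemo and the helper hrefs are made up for this example, and the sample input is wrapped in a hypothetical <body> root element only so that it parses as well-formed XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PositionPredicateDemo {
    // The sample input from the answer, wrapped in an assumed <body> root.
    static final String SAMPLE =
        "<body>"
        + "<p>Link <a href=\"one.html\">One</a></p>"
        + "<p>Link <a href=\"two.html\">Two</a> and <a href=\"three.html\">Three</a>.</p>"
        + "<p>Link <a href=\"four.html\">Four</a> and <a href=\"five.html\">Five</a>.</p>"
        + "</body>";

    // Evaluates the expression against SAMPLE and returns the href of every
    // selected <a> element, in document order.
    static List<String> hrefs(String expr) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(SAMPLE)),
                          XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(((Element) nodes.item(i)).getAttribute("href"));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hrefs("(//a)[2]")); // [two.html]
        System.out.println(hrefs("//a[2]"));   // [three.html, five.html]
    }
}
```

The parentheses make the position predicate apply to the whole node set selected by //a, while without them it is applied per parent element.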
