Is HtmlUnit 2.8 getFirstByXPath different from HtmlUnit 1.14 getFirstByXPath?

I have a site structure that looks something like this:

 <div class='main_container'>
   <div class='item_container'>
     <div class='body'>
       <span class='item_name'>Item 1</span>
       <span class='item_desc'>Desc 1</span>
     </div>
   </div>
   <div class='item_container'>
     <div class='body'>
       <span class='item_name'>Item 2</span>
       <span class='item_desc'>Desc 2</span>
     </div>
   </div>
   ...
 </div><!--End of main_container-->

Note: some item_container divs might not contain <span class='item_name'>Item N</span> or the other elements.

In HtmlUnit 1.14, I get the name of each item like this:

 List<HtmlDivision> divs = (List<HtmlDivision>) page.getByXPath("//div[@class='item_container']");
 for (HtmlDivision div : divs) {
     String name = ((HtmlElement) div.getFirstByXPath("//span[@class='item_name']")).asText();
     System.out.println(name);
 }

Output:

 Item 1
 Item 2
 ...

But in HtmlUnit 2.8, the same code prints:

 Item 1
 Item 1
 ...

Is there a workaround for this in HtmlUnit 2.8?

1 answer

Maybe HtmlUnit 1.14 had a bug that you were relying on.

In the code you showed, the XPath inside the for loop should return the same element on every iteration (as it does in v2.8). It starts with //, which searches the entire document from the root node regardless of which element you call it on, and getFirstByXPath returns the first match it finds.

If you want it to be relative to the <div> in the loop, change the XPath to: .//span[@class='item_name']
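
The difference between the two expressions can be checked without HtmlUnit. The sketch below is only an illustration: it uses the JDK's built-in javax.xml.xpath API instead of HtmlUnit, and the class name RelativeXPathDemo, the names helper, and the third container without an item_name are assumptions made for the example. The XPath semantics of // versus .// are the same in both.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RelativeXPathDemo {
    // Same structure as in the question, plus a hypothetical third
    // container that has no item_name span.
    static final String MARKUP =
        "<div class='main_container'>"
        + "<div class='item_container'><div class='body'>"
        + "<span class='item_name'>Item 1</span><span class='item_desc'>Desc 1</span>"
        + "</div></div>"
        + "<div class='item_container'><div class='body'>"
        + "<span class='item_name'>Item 2</span><span class='item_desc'>Desc 2</span>"
        + "</div></div>"
        + "<div class='item_container'><div class='body'>"
        + "<span class='item_desc'>Desc 3</span>"
        + "</div></div>"
        + "</div>";

    // Evaluates spanXPath once per item_container, using that container
    // as the context node, and collects the text of the first match
    // (or null when nothing matches).
    public static List<String> names(String spanXPath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(MARKUP.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList divs = (NodeList) xpath.evaluate(
            "//div[@class='item_container']", doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < divs.getLength(); i++) {
            Node span = (Node) xpath.evaluate(spanXPath, divs.item(i), XPathConstants.NODE);
            result.add(span == null ? null : span.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Absolute path: ignores the context div and searches the whole document.
        System.out.println(names("//span[@class='item_name']"));  // prints [Item 1, Item 1, Item 1]
        // Relative path: searches only below the current div.
        System.out.println(names(".//span[@class='item_name']")); // prints [Item 1, Item 2, null]
    }
}
```

The null in the second result corresponds to a container without an item_name span. In the HtmlUnit loop from the question, getFirstByXPath would presumably return null in that case as well, so it is worth checking the result before calling asText() on it.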


Source: https://habr.com/ru/post/892736/

