XPath expression returns nothing for //element, but //* returns a count

I am using XOM with the following code and sample data:

 Element root = cleanDoc.getRootElement();
 // find all the bold elements, as those mark institution and clinic.
 Nodes nodes = root.query("//*");

 <html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">
   <head>
     <title>Patient Information</title>
   </head>
 </html>

The following expression returns many elements (from real data):

 //* 

but something like

 //head 

returns nothing. If I iterate over the children of the root, the counts seem to match, and if I print the name of each element, everything looks right.

I take the HTML, parse it with TagSoup, and then create an XOM document from the resulting string. How can this go so terribly wrong? I feel like there is some kind of weird encoding problem, but I just don't see it. Java strings are strings, right?

1 answer

Your document has a default namespace, which means that in the XPath model, all elements are in that namespace.
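This also explains why printing the element names looks right: the local name is still `head`, but the element carries the XHTML namespace URI as well. A minimal sketch of that check follows. It uses the JDK's built-in DOM parser rather than XOM so that it runs without extra libraries; the class name `DefaultNamespaceDemo` and the helper `parseRoot` are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DefaultNamespaceDemo {
    // Hypothetical helper: parses an XML string with a namespace-aware
    // parser and returns the root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(
                "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head/></html>");
        Element head = (Element) root.getFirstChild();

        // The local name looks unqualified...
        System.out.println(head.getLocalName());
        // ...but the element inherits the default namespace from <html>.
        System.out.println(head.getNamespaceURI());
    }
}
```

In XOM the equivalent accessors on `Element` are `getLocalName()` and `getNamespaceURI()`.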

The query should be //html:head. You will need to supply a namespace mapping along with the XPath query.

Note that although the XPath expression uses a namespace prefix, what has to match the document is the namespace URI bound to that prefix, not the prefix itself.

 XPathContext ctx = new XPathContext("html", "http://www.w3.org/1999/xhtml");
 Nodes nodes = root.query("//html:head", ctx);
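To see the effect without XOM on the classpath, here is a rough sketch of the same idea using the JDK's own `javax.xml.xpath` API, which is a different API from the XOM call above. The class name `NamespaceXPathDemo`, the helper `count`, and the simplified sample document are assumptions made for this example. It counts matches for an unprefixed query, a prefixed query, and a query with a different prefix bound to the same URI.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceXPathDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Simplified version of the sample document from the question.
    static final String SAMPLE =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<head><title>Patient Information</title></head>"
            + "</html>";

    // Hypothetical helper: counts the nodes matched by an XPath expression,
    // with the given prefix bound to the XHTML namespace URI.
    static int count(String expression, String prefix) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String p) {
                return prefix.equals(p) ? XHTML_NS : XMLConstants.NULL_NS_URI;
            }
            // Reverse lookups are not needed for evaluating these queries.
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });

        NodeList nodes = (NodeList) xpath.evaluate(
                expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("//*", "html"));         // 3: html, head, title
        System.out.println(count("//head", "html"));      // 0: no-namespace "head" does not exist
        System.out.println(count("//html:head", "html")); // 1
        System.out.println(count("//x:head", "x"));       // 1: only the URI has to match
    }
}
```

The last call illustrates the note above: the prefix in the query is arbitrary, as long as it is bound to the same namespace URI the document uses.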

Source: https://habr.com/ru/post/1302189/

