XPath: select nodes with explicit attribute "xmlns"

Can anyone specify an XPath expression that selects all nodes with an explicit "xmlns" attribute, for example <html xmlns="http://www.w3.org/1999/xhtml">? //*[@xmlns] does not work because, as it turns out, XPath does not treat xmlns as an attribute.

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     <meta http-equiv="X-UA-Compatible" content="IE=edge"/>
     <title>  , </title>
     <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
     <meta http-equiv="cache-control" content="no-cache"/>
     <meta http-equiv="pragma" content="no-cache"/>
     .......

I only need the "html" node here.

2 answers

The technically correct answer is that this is impossible. You need to distinguish between the abstract document that the source text represents and the source text itself. XPath operates on the abstraction, not on the source, and the presence of an xmlns pseudo-attribute is a property of the source text only.
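To see the distinction in practice, here is a small sketch using Python's standard xml.etree.ElementTree. This is only an illustration with a different tool, not an XPath engine, and the sample string is made up for the demo. It shows that once the text is parsed, the declaration has been folded into the element name and is no longer visible as an attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, shortened from the document in the question.
SOURCE = '<html xmlns="http://www.w3.org/1999/xhtml"><head/></html>'

root = ET.fromstring(SOURCE)

# The parser consumes the xmlns declaration: the namespace URI becomes part
# of the element's expanded name instead of being kept as an attribute.
root_tag = root.tag
root_attributes = dict(root.attrib)

print(root_tag)
print(root_attributes)
```

The attribute dictionary comes back empty, which is the same reason //*[@xmlns] finds nothing.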

However...

You can fake it with the following XPath 2.0 expression:

 //*[not(namespace-uri()=ancestor::*/namespace-uri())] 

This selects any element that has no ancestor in the same namespace, which in theory means it selects every element on which a namespace is declared. However, it will not catch namespaces that are re-declared on a descendant. For example, consider this document:

 <html xmlns="http://www.w3.org/1999/xhtml">
   <head/>
   <body>
     <p xmlns="http://something">
       <p xmlns="http://something"/>
     </p>
   </body>
 </html>

The above expression selects the html element and the first p. The second p has an ancestor in the same namespace, so it is not selected, even though it declares xmlns.
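If no XPath 2.0 processor is at hand, the same rule can be approximated in Python. The sketch below re-implements the logic of the expression by hand with the standard xml.etree.ElementTree (it does not evaluate the XPath itself); the helper names namespace_of and first_in_namespace are made up for this illustration.

```python
import xml.etree.ElementTree as ET

SOURCE = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <head/>
  <body>
    <p xmlns="http://something">
      <p xmlns="http://something"/>
    </p>
  </body>
</html>
""".strip()


def namespace_of(element):
    """Return the namespace URI from an ElementTree '{uri}local' tag, or ''."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def first_in_namespace(element, ancestor_namespaces=frozenset()):
    """Yield elements whose namespace URI matches no ancestor's namespace URI."""
    namespace = namespace_of(element)
    if namespace not in ancestor_namespaces:
        yield element
    # Children see this element's namespace as one of their ancestors' namespaces.
    for child in element:
        yield from first_in_namespace(child, ancestor_namespaces | {namespace})


root = ET.fromstring(SOURCE)
selected = [element.tag for element in first_in_namespace(root)]
print(selected)
```

As with the XPath version, the inner p is skipped because its parent is already in the same namespace.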


It should not be possible because

 <a xmlns="http://www.org/1"> <b/> </a> 

is equivalent to

 <a xmlns="http://www.org/1"> <b xmlns="http://www.org/1"/> </a> 
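A quick way to check this equivalence is to parse both forms and compare what the parser produces. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in for any namespace-aware data model; the helper name shape is made up for the illustration.

```python
import xml.etree.ElementTree as ET

INHERITED = '<a xmlns="http://www.org/1"> <b/> </a>'
REDECLARED = '<a xmlns="http://www.org/1"> <b xmlns="http://www.org/1"/> </a>'


def shape(source):
    """List (expanded name, attributes) for every element in document order."""
    return [(el.tag, dict(el.attrib)) for el in ET.fromstring(source).iter()]


inherited_shape = shape(INHERITED)
redeclared_shape = shape(REDECLARED)

# Both documents yield the same element names and the same (empty) attributes.
print(inherited_shape == redeclared_shape)
```

Since the two parsed trees are indistinguishable, nothing that works on the parsed model can tell which element carried the explicit declaration.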

Source: https://habr.com/ru/post/907045/

