How to query XML using namespaces in Java with XPath?

When my XML looks like this (no xmlns), I can easily query it with XPath, for example /workbook/sheets/sheet[1]:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <workbook>
      <sheets>
        <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
      </sheets>
    </workbook>

But when it looks like this, I can’t:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
              xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
      <sheets>
        <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
      </sheets>
    </workbook>

Any ideas?

+61
java xml xpath xml-namespaces
Jun 17 '11 at 18:45
8 answers

In the second example XML file, the elements are bound to a namespace. Your XPath addresses elements that are in no namespace (which is what an unprefixed name means in XPath), so it matches nothing.

The preferred method is to register the namespace with a namespace prefix. This simplifies the development, reading, and maintenance of XPath.

However, it is not necessary to register the namespace and use the namespace prefix in XPath.

You can formulate an XPath expression that uses a generic match for an element and a predicate filter that restricts the match to the desired local-name() and namespace-uri(). For example:

    /*[local-name()='workbook'
        and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
    /*[local-name()='sheets'
        and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
    /*[local-name()='sheet'
        and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main'][1]

As you can see, it creates an extremely long and verbose XPath statement that is very difficult to read (and maintain).

You can also match on just the local-name() of each element and ignore the namespace. For example:

 /*[local-name()='workbook']/*[local-name()='sheets']/*[local-name()='sheet'][1] 

However, you risk matching the wrong elements. If your XML mixes vocabularies that use the same local-name() (which might not be a problem in this instance), your XPath could match the wrong elements and select the wrong content.
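For what it's worth, here is a minimal sketch of running the long local-name()/namespace-uri() expression from Java without registering any prefix. The class name LocalNameQuery and the helper firstSheetName are made up for illustration, and the XML is the document from the question inlined as a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class LocalNameQuery {
    public static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // The namespaced document from the question, inlined for the example.
    public static final String XML =
        "<workbook xmlns=\"" + MAIN_NS + "\""
        + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /** Hypothetical helper: returns the name attribute of the first sheet, or null if nothing matches. */
    public static String firstSheetName(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so namespace-uri() sees the element namespaces
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));

            // No NamespaceContext is set: the namespace is tested inside each predicate.
            String expr =
                "/*[local-name()='workbook' and namespace-uri()='" + MAIN_NS + "']"
                + "/*[local-name()='sheets' and namespace-uri()='" + MAIN_NS + "']"
                + "/*[local-name()='sheet' and namespace-uri()='" + MAIN_NS + "'][1]";

            XPath xpath = XPathFactory.newInstance().newXPath();
            Element sheet = (Element) xpath.evaluate(expr, doc, XPathConstants.NODE);
            return sheet == null ? null : sheet.getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstSheetName(XML)); // prints "Sheet1"
    }
}
```

It works, but the expression string is exactly as unwieldy as described above, which is why registering a prefix is usually the nicer route.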

+65
Jun 18 2018-11-18T00:

Your problem is the default namespace. Check out this article to learn how to handle namespaces in your XPath: http://www.edankert.com/defaultnamespaces.html

One of their conclusions:

So, to use XPath expressions on XML content defined in a (default) namespace, we need to specify a namespace prefix mapping.

Note that this does not mean that you need to modify the source document in any way (although you can put namespace prefixes there if you want). Sounds weird, right? What you will do is create a namespace prefix mapping in your Java code and use the specified prefix in the XPath expression. Here we will create a mapping from spreadsheet to your default namespace.

    import java.util.Iterator;
    import javax.xml.XMLConstants;
    import javax.xml.namespace.NamespaceContext;
    import javax.xml.xpath.*;
    import org.w3c.dom.Node;

    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();

    // there's no default implementation for NamespaceContext...seems kind of silly, no?
    xpath.setNamespaceContext(new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            // the NamespaceContext contract asks for IllegalArgumentException on a null prefix
            if (prefix == null) throw new IllegalArgumentException("Null prefix");
            else if ("spreadsheet".equals(prefix)) return "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
            else if ("xml".equals(prefix)) return XMLConstants.XML_NS_URI;
            return XMLConstants.NULL_NS_URI;
        }

        // This method isn't necessary for XPath processing.
        public String getPrefix(String uri) {
            throw new UnsupportedOperationException();
        }

        // This method isn't necessary for XPath processing either.
        public Iterator<String> getPrefixes(String uri) {
            throw new UnsupportedOperationException();
        }
    });

    // note that all the elements in the expression are prefixed with our namespace mapping!
    XPathExpression expr = xpath.compile("/spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1]");

    // assuming you've got your XML document in a variable named doc...
    Node result = (Node) expr.evaluate(doc, XPathConstants.NODE);

And voilà: the matched element is now in the result variable.

Caution: if you are parsing XML as a DOM with standard JAXP classes, be sure to call setNamespaceAware(true) on the DocumentBuilderFactory. Otherwise, this code will not work!
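To make the caution concrete, here is a small self-contained sketch that goes from an XML string to a result, with the namespace-awareness flag as a parameter. The class name NamespaceAwareDemo and the helper firstSheetName are illustrative only; the prefix "spreadsheet" is the one chosen above. With the flag set to true the prefixed expression finds the sheet; with false it is expected not to match, per the caution above.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NamespaceAwareDemo {
    public static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    public static final String XML =
        "<workbook xmlns=\"" + MAIN_NS + "\""
        + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /** Hypothetical helper: evaluates the prefixed XPath and returns the sheet name, or null. */
    public static String firstSheetName(boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(namespaceAware); // false is the JAXP default
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    if (prefix == null) throw new IllegalArgumentException("Null prefix");
                    return "spreadsheet".equals(prefix) ? MAIN_NS : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    throw new UnsupportedOperationException();
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    throw new UnsupportedOperationException();
                }
            });

            Element sheet = (Element) xpath.evaluate(
                "/spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1]",
                doc, XPathConstants.NODE);
            return sheet == null ? null : sheet.getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("namespace-aware:     " + firstSheetName(true));
        System.out.println("not namespace-aware: " + firstSheetName(false));
    }
}
```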

+57
Jun 17 2018-11-18T00:

Every namespace that you intend to select from in the source XML must be associated with a prefix in the host language. In Java / JAXP, this is done by specifying the URI for each namespace prefix through an instance of javax.xml.namespace.NamespaceContext. Unfortunately, the SDK does not ship a NamespaceContext implementation.

Fortunately, it's very easy to write your own:

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;
    import javax.xml.XMLConstants;
    import javax.xml.namespace.NamespaceContext;

    public class SimpleNamespaceContext implements NamespaceContext {

        private final Map<String, String> PREF_MAP = new HashMap<String, String>();

        public SimpleNamespaceContext(final Map<String, String> prefMap) {
            PREF_MAP.putAll(prefMap);
        }

        public String getNamespaceURI(String prefix) {
            // per the NamespaceContext contract, an unbound prefix maps to "" rather than null
            String uri = PREF_MAP.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        public String getPrefix(String uri) {
            throw new UnsupportedOperationException();
        }

        public Iterator<String> getPrefixes(String uri) {
            throw new UnsupportedOperationException();
        }
    }

Use it as follows:

    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();

    HashMap<String, String> prefMap = new HashMap<String, String>() {{
        put("main", "http://schemas.openxmlformats.org/spreadsheetml/2006/main");
        put("r", "http://schemas.openxmlformats.org/officeDocument/2006/relationships");
    }};
    SimpleNamespaceContext namespaces = new SimpleNamespaceContext(prefMap);
    xpath.setNamespaceContext(namespaces);

    XPathExpression expr = xpath.compile("/main:workbook/main:sheets/main:sheet[1]");
    Object result = expr.evaluate(doc, XPathConstants.NODESET);

Note that although the first namespace does not specify a prefix in the source document (i.e. it is the default namespace), you must still associate it with a prefix. Your expression should then refer to the nodes in this namespace using the prefix you selected, for example:

 /main:workbook/main:sheets/main:sheet[1] 

The prefix names that you associate with each namespace are arbitrary; they don’t need to match what appears in the source XML. This mapping is just a way to tell the XPath engine that a given prefix in an expression corresponds to a specific namespace in the source document.
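A quick way to convince yourself that the prefixes really are arbitrary is to run the same query with two different sets of prefixes. The sketch below does that and also selects the r:id attribute, which lives in the relationships namespace. ArbitraryPrefixDemo and relationshipId are names invented for this example, and the caller is assumed to pass valid XML prefix names.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ArbitraryPrefixDemo {
    public static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    public static final String REL_NS =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    public static final String XML =
        "<workbook xmlns=\"" + MAIN_NS + "\" xmlns:r=\"" + REL_NS + "\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /**
     * Hypothetical helper: binds mainPrefix and relPrefix to the two namespaces
     * and returns the relationship id of the first sheet as a string.
     */
    public static String relationshipId(final String mainPrefix, final String relPrefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    if (mainPrefix.equals(prefix)) return MAIN_NS;
                    if (relPrefix.equals(prefix)) return REL_NS;
                    return XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    throw new UnsupportedOperationException();
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    throw new UnsupportedOperationException();
                }
            });

            // Build the expression from whatever prefixes the caller picked.
            String expr = "/" + mainPrefix + ":workbook/" + mainPrefix + ":sheets/"
                        + mainPrefix + ":sheet[1]/@" + relPrefix + ":id";
            return xpath.evaluate(expr, doc); // string value of the attribute
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(relationshipId("main", "r"));  // prints "rId1"
        System.out.println(relationshipId("x", "rel"));   // prints "rId1"
    }
}
```

Only the namespace URIs have to agree with the document; the prefix "rel" never appears in the XML, yet the attribute is still found.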

+36
Jun 17 '11 at 23:11

If you use Spring, it already contains org.springframework.util.xml.SimpleNamespaceContext.

    import org.springframework.util.xml.SimpleNamespaceContext;
    ...

    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    SimpleNamespaceContext nsc = new SimpleNamespaceContext();

    nsc.bindNamespaceUri("a", "http://some.namespace.com/nsContext");
    xpath.setNamespaceContext(nsc);

    XPathExpression xpathExpr = xpath.compile("//a:first/a:second");

    String result = (String) xpathExpr.evaluate(object, XPathConstants.STRING);
+3
Jan 30 '18 at 13:45

Make sure you reference the namespace in XSLT:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
        xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
+1
Jun 17 '11 at 19:20

I wrote a simple implementation of NamespaceContext ( here ) that takes a Map<String, String> as input, where the key is the prefix and the value is the namespace.

It follows the NamespaceContext specification, and you can see how it works in the unit tests.

    Map<String, String> mappings = new HashMap<>();
    mappings.put("foo", "http://foo");
    mappings.put("foo2", "http://foo");
    mappings.put("bar", "http://bar");

    NamespaceContext context = new SimpleNamespaceContext(mappings);

    context.getNamespaceURI("foo");    // "http://foo"
    context.getPrefix("http://foo");   // "foo" or "foo2"
    context.getPrefixes("http://foo"); // ["foo", "foo2"]

Please note that it has a dependency on Google Guava
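If the Guava dependency is a problem, a map-backed context can also be written with the JDK alone. The following is only a rough sketch under my own assumptions, not the linked implementation: the class name MapNamespaceContext is invented, and it follows the documented NamespaceContext contract (IllegalArgumentException for null arguments, "" for unbound prefixes, fixed bindings for xml and xmlns).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class MapNamespaceContext implements NamespaceContext {

    // LinkedHashMap keeps insertion order, so getPrefix is deterministic
    // when the caller passes an ordered map.
    private final Map<String, String> prefixToUri = new LinkedHashMap<>();

    public MapNamespaceContext(Map<String, String> mappings) {
        prefixToUri.putAll(mappings);
        // These two bindings are fixed by the NamespaceContext contract.
        prefixToUri.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
        prefixToUri.put(XMLConstants.XMLNS_ATTRIBUTE, XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) throw new IllegalArgumentException("prefix must not be null");
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        // First bound prefix, or null when the URI is not bound at all.
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) throw new IllegalArgumentException("namespaceURI must not be null");
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (namespaceURI.equals(entry.getValue())) {
                matches.add(entry.getKey());
            }
        }
        // Read-only view: the contract says Iterator.remove() is not supported.
        return Collections.unmodifiableList(matches).iterator();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("foo", "http://foo");
        mappings.put("foo2", "http://foo");
        mappings.put("bar", "http://bar");

        MapNamespaceContext context = new MapNamespaceContext(mappings);
        System.out.println(context.getNamespaceURI("foo")); // prints "http://foo"
        System.out.println(context.getPrefix("http://foo")); // prints "foo"
    }
}
```

For plain XPath evaluation only getNamespaceURI matters; the reverse lookups are there for completeness.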

+1
Sep 28 '15 at 10:43

Two things to add to existing answers:

  • I don’t know whether this was already the case when you asked the question, but in Java 10 your XPath actually works for the second document as long as you do not call setNamespaceAware(true) on the document builder factory (it is false by default).

  • If you want to use setNamespaceAware(true), the other answers have already shown how to do that with a namespace context. However, you do not need to map the prefixes to namespaces yourself, as those answers do: the declarations are already on the document element, and you can use them for the namespace context:

    import java.util.Iterator;
    import javax.xml.namespace.NamespaceContext;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;

    public class DocumentNamespaceContext implements NamespaceContext {
        Element documentElement;

        public DocumentNamespaceContext (Document document) {
            documentElement = document.getDocumentElement();
        }

        public String getNamespaceURI(String prefix) {
            return documentElement.getAttribute(prefix.isEmpty() ? "xmlns" : "xmlns:" + prefix);
        }

        public String getPrefix(String namespaceURI) {
            throw new UnsupportedOperationException();
        }

        public Iterator<String> getPrefixes(String namespaceURI) {
            throw new UnsupportedOperationException();
        }
    }

The rest of the code is the same as in the other answers. The XPath /:workbook/:sheets/:sheet[1] then returns the sheet element. (You can also use a non-empty prefix for the default namespace, as the other answers do, by replacing prefix.isEmpty() with, for example, prefix.equals("spreadsheet") and using the XPath /spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1].)

PS: I just found here that there actually is a Node.lookupNamespaceURI(String prefix) method, so you can use it instead of reading the attribute:

    public String getNamespaceURI(String prefix) {
        return documentElement.lookupNamespaceURI(prefix.isEmpty() ? null : prefix);
    }

Also note that namespaces can be declared on elements other than the document element, and those will not be recognized by either version.
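To illustrate that last limitation, here is a small sketch with a made-up document in which one prefix is declared on the root and another only on a child element. The class name LookupScopeDemo, the helper methods, and the sample XML are all invented for this example. Looking up from the document element only sees declarations on the root itself, while looking up from the child also sees what it inherits.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class LookupScopeDemo {
    // Prefix "a" is declared on the root, prefix "b" only on the child element.
    public static final String XML =
        "<root xmlns:a=\"http://a\">"
        + "<child xmlns:b=\"http://b\"><b:item/></child>"
        + "</root>";

    private static Document parse() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Resolves the prefix starting at the document element. */
    public static String lookupFromRoot(String prefix) {
        return parse().getDocumentElement().lookupNamespaceURI(prefix);
    }

    /** Resolves the prefix starting at the child element (the root's first child node). */
    public static String lookupFromChild(String prefix) {
        Element child = (Element) parse().getDocumentElement().getFirstChild();
        return child.lookupNamespaceURI(prefix);
    }

    public static void main(String[] args) {
        System.out.println(lookupFromRoot("a"));  // prints "http://a"
        System.out.println(lookupFromRoot("b"));  // prints "null": declared further down
        System.out.println(lookupFromChild("b")); // prints "http://b"
        System.out.println(lookupFromChild("a")); // prints "http://a": inherited from the root
    }
}
```

So a context built only on the document element is fine for documents like the workbook above, where everything is declared on the root, but it is not a general solution.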

0
Jul 20 '19 at 12:42

Surprisingly, if I do not call factory.setNamespaceAware(true); then the XPath you mentioned works both with and without a namespace in the document. You just cannot select things by namespace, only with generic XPaths. Go figure. So this could be an option:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(false);
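For comparison, here is a minimal sketch that runs the plain XPath from the question against the namespaced document with both settings of the flag. PlainXPathDemo and firstSheetName are illustrative names, not part of any library. With a parser that is not namespace-aware the elements carry no namespace at all, so the unprefixed path matches; with a namespace-aware parser it finds nothing, which is the original problem.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class PlainXPathDemo {
    public static final String XML =
        "<workbook xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\""
        + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /** Hypothetical helper: evaluates the unprefixed XPath and returns the sheet name, or null. */
    public static String firstSheetName(boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(XML)));

            // No NamespaceContext: the expression uses unprefixed names only.
            XPath xpath = XPathFactory.newInstance().newXPath();
            Element sheet = (Element) xpath.evaluate(
                "/workbook/sheets/sheet[1]", doc, XPathConstants.NODE);
            return sheet == null ? null : sheet.getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("setNamespaceAware(false): " + firstSheetName(false));
        System.out.println("setNamespaceAware(true):  " + firstSheetName(true));
    }
}
```

The trade-off is the one mentioned above: with namespace awareness switched off you can no longer tell apart elements that share a local name but belong to different namespaces.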
-1
Feb 06 '19 at 23:40


