How to get all xpaths from XSD?

Question

I have an XSD, and I need to show the XPath of every element declared in it in a user interface, so that users can use those XPaths to perform DOM-related operations.

Can I programmatically extract the XPaths of all elements from an XSD?

java xpath xsd

suraj bahl Jun 30 '15 at 13:31

3 answers

Michael Kay · Answer 1 · 2015-06-30T14:53:12+0000

This can be done, but be aware that the set of all permitted paths may be infinite (for example, because of recursion or wildcards), so you either need an intelligent representation of this infinite set, or your code needs to give up and return something like "anything" when it finds that the paths cannot be enumerated. The schema-aware Saxon product does something like this when it checks a path expression such as //para against the schema: if it knows the type of the context element, it can determine whether //para is capable of selecting anything, and it gives you a warning if not.

As a first step, you need to build (the relevant part of) the schema component model from the source schema documents. Do not try to do this yourself; it is too much work. A number of products have an API that gives you access to the schema component model. Saxon lets you export the schema component model built from the source schema documents as an XML representation, using the -scmout flag on the Validate command line.

Once you have the schema component model, you can find the permitted children of an element by going to its complex type (if it is a simple type, the answer is trivial) and recursively traversing the tree of particles, looking only at element particles and wildcard particles (you might decide that if there are wildcard particles, it is best to give up). You might want to consider not only the declared type of the element, but also other types derived from it by extension. You need to know the element declarations of the permitted children, not just their names, because when it comes to finding the permitted grandchildren you need to start from the element declaration: there may be local element declarations with the same name.

And, of course, once you know the relationship between element declarations and their permitted child elements, the set of paths is the transitive closure of this relationship.
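The particle-tree walk described above can be sketched roughly as follows. This is only an illustration on a hypothetical, heavily simplified in-memory model: the classes Particle, ElementDecl, ModelGroup, Wildcard and OpenContentException are invented for the example and are not the API of Saxon or any other product. Types derived by extension are not modelled, and the element names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParticleWalk {
    // Hypothetical stand-ins for schema components (not a real product API).
    static abstract class Particle {}

    static final class ElementDecl extends Particle {
        final String name;
        Particle content; // null models a simple or empty type: no children
        ElementDecl(String name) { this.name = name; }
    }

    static final class ModelGroup extends Particle { // sequence, choice or all
        final List<Particle> particles;
        ModelGroup(Particle... particles) { this.particles = Arrays.asList(particles); }
    }

    static final class Wildcard extends Particle {} // models xs:any

    // Signals that the children cannot be listed because of a wildcard.
    static final class OpenContentException extends RuntimeException {
        OpenContentException(String message) { super(message); }
    }

    // Returns the declarations (not just the names) of the permitted children.
    static List<ElementDecl> allowedChildren(ElementDecl decl) {
        List<ElementDecl> out = new ArrayList<>();
        if (decl.content != null) {
            walk(decl.content, out);
        }
        return out;
    }

    private static void walk(Particle p, List<ElementDecl> out) {
        if (p instanceof ElementDecl) {
            // Do not descend: grandchildren are found from this declaration later.
            out.add((ElementDecl) p);
        } else if (p instanceof ModelGroup) {
            for (Particle child : ((ModelGroup) p).particles) {
                walk(child, out);
            }
        } else if (p instanceof Wildcard) {
            throw new OpenContentException("wildcard found: children cannot be enumerated");
        }
    }

    public static void main(String[] args) {
        ElementDecl order = new ElementDecl("shiporder");
        order.content = new ModelGroup(
            new ElementDecl("orderperson"),
            new ModelGroup(new ElementDecl("item"), new ElementDecl("note")));
        for (ElementDecl child : allowedChildren(order)) {
            System.out.println(child.name);
        }
    }
}
```

Returning declarations rather than names keeps two local declarations with the same name distinct, which is the point made above about grandchildren.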
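As a rough sketch of that closure, assume the parent/child relationship has already been extracted into a hypothetical ElementDecl class (invented here for illustration, with made-up element names). Because the full closure is infinite for a recursive schema, this version records the first recursive step and then stops descending, which is one possible policy and not the only one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class PathClosure {
    // Hypothetical element declaration: a name plus its permitted child declarations.
    static final class ElementDecl {
        final String name;
        final List<ElementDecl> children = new ArrayList<>();
        ElementDecl(String name) { this.name = name; }
        ElementDecl add(ElementDecl... decls) {
            children.addAll(Arrays.asList(decls));
            return this;
        }
    }

    // Lists every path reachable from the root declaration, depth first.
    static List<String> paths(ElementDecl root) {
        List<String> out = new ArrayList<>();
        expand(root, "", new ArrayDeque<ElementDecl>(), out);
        return out;
    }

    private static void expand(ElementDecl decl, String prefix,
                               Deque<ElementDecl> onPath, List<String> out) {
        String path = prefix + "/" + decl.name;
        out.add(path);
        // Identity check (ElementDecl does not override equals): the same
        // declaration is already an ancestor, so this is recursion. Stop here.
        if (onPath.contains(decl)) {
            return;
        }
        onPath.push(decl);
        for (ElementDecl child : decl.children) {
            expand(child, path, onPath, out);
        }
        onPath.pop();
    }

    public static void main(String[] args) {
        // Two distinct declarations share the name "title" but allow different children.
        ElementDecl docTitle = new ElementDecl("title");
        ElementDecl sectionTitle = new ElementDecl("title").add(new ElementDecl("emph"));
        ElementDecl section = new ElementDecl("section");
        section.add(sectionTitle, section); // a section may contain a section
        ElementDecl doc = new ElementDecl("doc").add(docTitle, section);
        for (String p : paths(doc)) {
            System.out.println(p);
        }
    }
}
```

Tracking declarations by identity rather than by name is what lets /doc/title and /doc/section/title expand differently.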

arseniyandru · Answer 2 · 2015-06-30T13:45:38+0000

    Node n = doc.getFirstChild();
    NodeList nl = n.getChildNodes();

Then you can walk the list of nodes and build the XPath of each node:

    String getXPath(Node node) {
        Node parent = node.getParentNode();
        // The root element's parent is the Document node; stop there.
        if (parent == null || parent.getNodeType() == Node.DOCUMENT_NODE) {
            return "/" + node.getNodeName();
        }
        return getXPath(parent) + "/" + node.getNodeName();
    }
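A self-contained sketch of this approach, using only the JDK's DOM API, might look like the following. The class name DomPathLister, the helper methods and the sample XML are assumptions made for the example. Note that it lists the paths of elements present in one concrete XML document; it does not enumerate every path that an XSD permits.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DomPathLister {
    // Builds the path of element names from the root down to the given node.
    static String getXPath(Node node) {
        Node parent = node.getParentNode();
        if (parent == null || parent.getNodeType() == Node.DOCUMENT_NODE) {
            return "/" + node.getNodeName();
        }
        return getXPath(parent) + "/" + node.getNodeName();
    }

    // Depth-first walk that records the path of every element node.
    static void collect(Node node, List<String> out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text, comments and other non-element nodes
        }
        out.add(getXPath(node));
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
    }

    // Parses the XML string and returns the path of each element in document order.
    static List<String> listPaths(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> out = new ArrayList<>();
            collect(doc.getDocumentElement(), out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><item><title/></item><item><note/></item></order>";
        for (String path : listPaths(xml)) {
            System.out.println(path);
        }
    }
}
```

Repeated elements produce repeated paths (here /order/item appears twice); collecting into a LinkedHashSet instead of a list would deduplicate them while keeping document order.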

bbarker · Answer 3 · 2017-07-10T20:09:27+0000

I have been working on a project that has methods for 1) extracting all XPaths of elements that are present in an XML document itself (for example, a schema definition document) or 2) listing all possible XPaths that can occur in an XML document described by an XSD.

If you are only interested in 1), the problem and my solution are described (albeit in Scala) in the question "Scala: what is the easiest way to get all leaf nodes and their paths in XML?"

For 2) things are much more complicated, although I actually used 1) as a starting point, and both 1) (XpathXmlEnumerator) and 2) (XpathXsdEnumerator) share a common interface (XpathEnumerator) in any case. Although 2) is much longer, I believe that at ~500 LOC it is still a fairly lean implementation, all things considered (but it could probably use more comments; please add them!). @michael-kay did a great job of describing many of the difficulties and outlining a possible solution. Perhaps unfortunately, I did not follow his advice to use software that understands the schema component model; instead I used scala.xml to try to simplify working with XML nodes in general. However, I believe I have overcome all the known difficulties of generating XPaths, since a large share of the information/nodes in an XSD is not needed to generate the XPaths of documents described by that XSD, so one can simply ignore such nodes.

Filtering becomes important in order to avoid enumerating nodes that appear everywhere and that you are not interested in in practice, and perhaps also to avoid recursion. However, recursion should be detected automatically by the implementation in 2), and further traversal of that XPath is then skipped. For filters, use the custom NodeFilters class; see DdiCodebookSpec for an example of its use.

You can see some tests in the project in the same directory as ShipOrderXsdSpec, which contains some quick-running examples if you want to try it. Some of the other tests do not run quickly, and some of them have problems: this is "pre-alpha" software!

Although the solutions are in Scala, I would be happy to create a Java wrapper (if one is even necessary; it may work directly) and even publish it to Maven if anyone really wants that.
