According to the official lxml documentation, if you want to validate an XML document against an XML Schema document, you need to:
- build an XMLSchema object (basically, parse a schema document)
- build an XMLParser, passing the XMLSchema object as its schema argument
- parse the actual XML document (the instance document) using that parser
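For reference, the three steps above can be sketched roughly as follows. The schema and the instance documents here are made-up inline examples, and `is_valid` is a hypothetical helper name, not part of lxml.

```python
from lxml import etree

# Illustrative schema: a single <a> element holding an integer.
XSD = b"""<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a" type="xsd:integer"/>
</xsd:schema>"""

# 1. Build an XMLSchema object from the parsed schema document.
schema = etree.XMLSchema(etree.fromstring(XSD))

# 2. Build a parser that validates against that schema while parsing.
parser = etree.XMLParser(schema=schema)

# 3. Parse the instance document with the validating parser.
root = etree.fromstring(b"<a>5</a>", parser)


def is_valid(xml_bytes):
    """Hypothetical helper: True if xml_bytes parses and validates."""
    try:
        etree.fromstring(xml_bytes, parser)
        return True
    except etree.XMLSyntaxError:
        # A validating parser reports schema violations as XMLSyntaxError.
        return False
```

Note that the schema is chosen entirely by the calling code; nothing in the instance document influences it.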
There may be variations, but the essence is pretty much the same no matter how you do it — the schema is specified “externally” (as opposed to specifying it inside the actual XML document).
If you follow this procedure, validation does take place, but if I understand it correctly, it completely ignores the whole idea of the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes.
This introduces a number of limitations, starting with the fact that you have to manage the instance ↔ schema relation yourself (either store it externally, or write a hack that extracts the schema location from the root element of the instance document), you cannot validate a document against multiple schemas (for example, when each schema governs its own namespace), etc.
So the question is: am I missing something completely trivial, or am I doing it wrong? Or are my statements about lxml's limitations regarding schema validation correct?
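To make the "hack" concrete, here is a minimal sketch of pulling the schema location hints off the root element by hand. The helper name `schema_locations` and the example namespaces and file names are assumptions for illustration only. It reads the hints but does not load or apply the schemas.

```python
from lxml import etree

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(root):
    """Hypothetical helper: map namespace -> schema location hint.

    The no-namespace hint, if present, is stored under the key None.
    """
    locations = {}
    # xsi:schemaLocation is a whitespace-separated list of
    # "namespace location" pairs.
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    locations.update(zip(tokens[0::2], tokens[1::2]))
    no_ns = root.get("{%s}noNamespaceSchemaLocation" % XSI)
    if no_ns is not None:
        locations[None] = no_ns
    return locations


# Made-up instance document with two namespace/location pairs.
doc = etree.fromstring(
    b'<root xmlns="urn:example:a" '
    b'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    b'xsi:schemaLocation="urn:example:a a.xsd urn:example:b b.xsd"/>'
)
hints = schema_locations(doc)
```

This only looks at the root element, so it would miss hints declared deeper in the document, and it requires parsing the instance once without validation before the schema can even be located.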
To recap, I would like to be able to:
- have the parser use schema location declarations in the instance document during parsing/validation
- use multiple schemas to validate an XML document