I am working on a system that should be able to read any (or at least any well-formed) XML file, manipulate several nodes and write them back to the same file. I want my code to be as general as possible, and I don't want
- hard-coded links to Schema / Doctype information anywhere in my code. The doctype information is in the original document; I want to preserve exactly that information and not supply it again from my code. If the document does not have a DocType, I will not add one. I do not care about the form or content of these files at all, except for my few nodes.
- custom EntityResolvers or StreamFilters to omit or otherwise manipulate the source information (it is already a pity that the namespace information seems somehow inaccessible from the document file where it is declared, but I can work around that with more complex XPaths)
- DTD validation. I do not have the referenced DTDs, I do not want to include them, and Node manipulation is quite possible without knowing about them.
The goal is to keep the source file completely unchanged, with the exception of the modified nodes, which are retrieved through XPath. I would like to get by with the standard javax.xml packages.
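The "more complex XPaths" mentioned in the list could look like the following sketch, which matches on local-name() so that no NamespaceContext has to be registered. The class name LocalNameXPath, the helper methods, and the sample XML (including the appender element) are made up for illustration; only the javax.xml and org.w3c.dom calls are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocalNameXPath {

    // Selects every element with the given local name, whatever its namespace or prefix.
    // Assumption: localName contains no quote characters (it is pasted into the expression).
    static NodeList selectByLocalName(Document doc, String localName) throws Exception {
        String expr = "//*[local-name()='" + localName + "']";
        return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
    }

    // Hypothetical helper: parses an XML string with namespace awareness switched on.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Illustrative input only, not the real file.
        String xml = "<log4j:configuration xmlns:log4j=\"http://jakarta.apache.org/log4j/\">"
                   + "<appender name=\"console\"/></log4j:configuration>";
        NodeList hits = selectByLocalName(parse(xml), "configuration");
        System.out.println(hits.getLength()); // prints 1
    }
}
```

Matching on local-name() ignores the namespace URI entirely, so it can also pick up same-named elements from other namespaces; adding a namespace-uri() predicate would narrow that down.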
My progress:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setAttribute("http://xml.org/sax/features/namespaces", true);
factory.setAttribute("http://xml.org/sax/features/validation", false);
factory.setAttribute("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
factory.setAttribute("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
factory.setNamespaceAware(true);
factory.setIgnoringElementContentWhitespace(false);
factory.setIgnoringComments(false);
factory.setValidating(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(inStream));
This loads the XML source into org.w3c.dom.Document successfully, ignoring DTD validation. I can do my replacements and then I use
Source source = new DOMSource(document);
Result result = new StreamResult(getOutputStream(getPath()));
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(source, result);
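to write the document back to disk. The Doctype, however, is missing from the output, although I can see a DeferredDoctypeImpl [log4j:configuration: null] in the Document. For reference, the input file starts like this (excerpt):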
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="false">
[...]
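One workaround that stays within javax.xml, shown here only as a sketch: read the public and system identifiers from document.getDoctype() and hand them to the Transformer through OutputKeys.DOCTYPE_PUBLIC and OutputKeys.DOCTYPE_SYSTEM, so that nothing about the DTD is hard-coded. The class name DoctypePreservingWriter and its helper methods are hypothetical. The sketch uses setFeature rather than setAttribute for the load-external-dtd switch, and it assumes the JDK's built-in parser and transformer. As far as I know, an internal DTD subset would not be carried over this way.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

class DoctypePreservingWriter {

    // Serializes the document, copying the DOCTYPE identifiers found in the DOM (if any).
    static String serialize(Document document) throws Exception {
        Transformer xformer = TransformerFactory.newInstance().newTransformer();
        DocumentType doctype = document.getDoctype();
        if (doctype != null) {
            // Only identifiers present in the source are passed on; nothing is invented here.
            if (doctype.getPublicId() != null) {
                xformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, doctype.getPublicId());
            }
            if (doctype.getSystemId() != null) {
                xformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctype.getSystemId());
            }
        }
        StringWriter out = new StringWriter();
        xformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    // Hypothetical helper: parses without validating and without fetching the external DTD.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                   + "<!DOCTYPE log4j:configuration SYSTEM \"log4j.dtd\">"
                   + "<log4j:configuration xmlns:log4j=\"http://jakarta.apache.org/log4j/\" debug=\"false\"/>";
        System.out.println(serialize(parse(xml)));
    }
}
```

If the document has no DocType, getDoctype() returns null and no output property is set, so no DOCTYPE is added.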
I assume there is an (easy?) way to do this without additional JAR files. Any hints are appreciated.
Stephan