Can I tell XmlTextWriter to write <element/> instead of <element />?
I have a situation where XML data is processed by two different mechanisms. In one place, it is processed using the Python library xml.dom.minidom. In another, similar processing is done in .NET through an XmlTextWriter.
The Python code writes empty elements as <ElementName/> (with no space before the element is closed). The .NET code inserts a space (producing <ElementName />). This has no effect on the validity or meaning of the XML, but it causes the two outputs to be flagged as different when they are compared.
Is there a way to tell XmlTextWriter not to include the extra space? Failing that, is there a way to include the extra space in the Python-generated output (without modifying the library source, which I would rather avoid ;-))?
Update: Perhaps I should explain what I'm trying to do, and not just describe the problem. Perhaps I am making things more complicated / painful than I should.
I really need some mechanism to determine that the structure represented by the XML has not been changed. Initially, I normalized the XML (which fixed whitespace problems when everything was done in the .NET world) and then computed a salted hash of the data. Is there a better mechanism that I could / should use?
This is probably not the answer you want: don't compare the XML output as plain text. We do that in our unit tests (two applications that exchange XML messages), and it is fragile, easily broken, annoying, and requires a lot of maintenance. Instead, parse the XML output and compare the structure. Writing such a tool takes more work (one may already exist), but when the output changes slightly in the next version of either library, the comparison will still work.
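As a rough illustration of comparing structure rather than text, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The function name same_structure and the sample documents are made up for this example, and it assumes that leading and trailing whitespace in text nodes is not significant for your data, which may not hold everywhere.

```python
import xml.etree.ElementTree as ET


def same_structure(a, b):
    """Recursively compare two elements: tag, attributes, text, and children.

    Assumption: leading/trailing whitespace in text is insignificant,
    so indentation between elements does not count as a difference.
    """
    def norm(text):
        return (text or "").strip()

    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if norm(a.text) != norm(b.text) or norm(a.tail) != norm(b.tail):
        return False
    if len(a) != len(b):  # different number of child elements
        return False
    return all(same_structure(x, y) for x, y in zip(a, b))


# Hypothetical sample outputs in the two styles described in the question.
python_style = "<root><ElementName/><item id='1'>x</item></root>"
dotnet_style = '<root>\n  <ElementName />\n  <item id="1">x</item>\n</root>'

print(same_structure(ET.fromstring(python_style), ET.fromstring(dotnet_style)))
```

Because both documents are parsed first, the <ElementName/> versus <ElementName /> difference never reaches the comparison. Child order is treated as significant here; relax that if your data allows it.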
Edit: Well, now that you have explained your problem a little more, let me check that I understand it correctly: you have data for which you generate XML output, sometimes through .NET and sometimes through Python. Say you create the output through .NET, compute a hash of it, and save the hash. Later, you create output through Python, which should have the same content, and compute a hash of that as well. Now the two hashes are not equal because of the space problem.
If this is the case, you can walk the XML document and compute the hash from the visible nodes with their attributes and values. A simpler approach would be to remove all unnecessary whitespace from the output (no matter where the output comes from) and then do your hash calculation. You can do that in Python ;)
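One way to sketch the "normalize, then hash" idea in Python is to canonicalize the XML before hashing. The example below uses xml.etree.ElementTree.canonicalize (available from Python 3.8), which writes empty elements as start/end tag pairs, so both spellings of an empty element come out identical. The function name xml_fingerprint and the key value are hypothetical, and using HMAC-SHA256 as the "salted hash" is an assumption on my part, not necessarily the scheme the asker uses. Both the .NET and the Python output would need to go through this same function for the hashes to be comparable.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET


def xml_fingerprint(xml_text, key):
    """Return a keyed hash of the canonical form of an XML document.

    Assumptions: whitespace around text content is insignificant
    (strip_text=True), and HMAC-SHA256 stands in for the salted hash.
    """
    canonical = ET.canonicalize(xml_text, strip_text=True)
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical sample outputs and key, for illustration only.
key = b"example-secret"
python_style = "<root><ElementName/><item id='1'>x</item></root>"
dotnet_style = '<root>\n  <ElementName />\n  <item id="1">x</item>\n</root>'

print(xml_fingerprint(python_style, key) == xml_fingerprint(dotnet_style, key))
```

Canonicalization also normalizes attribute quoting and ordering, so those differences between the two writers would not affect the hash either.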
You will find that the problem only occurs if you set the Indent property in XmlWriterSettings to true. When Indent == false, no space is inserted. But if you want indentation, you have to live with this space.
So perhaps the solution to your problem is to disable indentation in both tools?
This is unfortunate, because it is practically impossible to change.
The XmlWriter implementation actually calls XmlWriterSettings.CreateWriter to create a writer based on the settings you pass. If Indent == true, it creates an XmlEncodedRawTextWriterIndent, which is an internal class derived from the abstract XmlWriter. It overrides WriteFullEndElement and inserts this space.
In theory, you could create your own class derived from XmlEncodedRawTextWriterIndent that overrides WriteFullEndElement. If you could do that, it would be easy to suppress the extra space. But you cannot, because it is an internal class (internal to System.Xml). Even if you could subclass XmlEncodedRawTextWriterIndent, you would still have a problem: XmlWriterSettings.CreateXmlWriter would not be able to instantiate your class, and XmlWriterSettings is sealed.
I assume there are good reasons for effectively preventing the creation of custom XmlWriter classes, although they currently elude me.