How to save XML nodes that are not bound to an object when using SAX for parsing

I am working on an Android application that interacts with a Bluetooth camera. For each clip stored on the camera, we save some fields about the clip (some of which the user can change) in an XML file.

This application is currently the only one that writes this XML data to the device, but in the future a desktop application or an iPhone application might write data here too. I do not want to assume that another application will not add fields of its own (especially a newer version of the application that adds fields this version does not yet support).

So what I want to prevent is a situation where another application adds new fields to this XML file, and then the user switches to the Android application and it wipes out those fields because it does not know about them.

So, let's take a hypothetical example:

    <data>
        <title>My Title</title>
        <date>12/24/2012</date>
        <category>Blah</category>
    </data>

When reading from the device, this will be translated into a Clip object, which looks like this: (simplified for brevity)

    public class Clip {
        public String title, category;
        public Date date;
    }

So I use SAX to parse the data and store it in a Clip. I simply collect the characters in a StringBuilder and write them out when I reach the end element for title, category and date.

I realized that when I write this data back to the device, any other tags that were in the original document will not be written, because I write only the fields that I know about.

This makes me think that maybe SAX is the wrong option, and maybe I should use DOM or something else where I could more easily write out any other elements that originally existed.

As an alternative, I thought that maybe my Clip class could contain an ArrayList of some general XML type (possibly DOM), and in startTag I would check whether the element is one of the predefined tags; if it is not, then until I reach the end of that tag I would store the whole structure (but in what?). Then, when writing back, I would just go through all the additional tags and write them out to the XML file (along with the fields that I know about, of course).
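One way this idea could look is sketched below, under some loud assumptions: the class name `PreservingClipParser`, the `extras` list, and keeping `date` as plain text are all made up for illustration, and the unknown elements are stored as raw XML strings rather than DOM nodes. The handler treats every direct child of the root that is not `title`, `category` or `date` as unknown and buffers its markup, including nested children and attributes, so it can be appended again on write.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PreservingClipParser {

    /** Simplified clip: known fields plus the raw XML of every unknown child of the root. */
    public static class Clip {
        public String title, category, date; // date kept as text for brevity
        public final List<String> extras = new ArrayList<>();
    }

    private static class Handler extends DefaultHandler {
        final Clip clip = new Clip();
        private final StringBuilder text = new StringBuilder(); // text of the current known field
        private final StringBuilder raw = new StringBuilder();  // markup of the current unknown element
        private int depth = 0;        // root element is depth 1, its children are depth 2
        private int unknownDepth = 0; // > 0 while inside an unknown element

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            depth++;
            if (unknownDepth > 0 || (depth == 2 && !isKnown(qName))) {
                unknownDepth++;
                raw.append('<').append(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    raw.append(' ').append(atts.getQName(i))
                       .append("=\"").append(escape(atts.getValue(i))).append('"');
                }
                raw.append('>');
            } else {
                text.setLength(0); // start collecting text for a known field
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (unknownDepth > 0) raw.append(escape(new String(ch, start, length)));
            else text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (unknownDepth > 0) {
                raw.append("</").append(qName).append('>');
                if (--unknownDepth == 0) { // the unknown top-level child is complete
                    clip.extras.add(raw.toString());
                    raw.setLength(0);
                }
            } else if (depth == 2) {
                String value = text.toString().trim();
                if (qName.equals("title")) clip.title = value;
                else if (qName.equals("category")) clip.category = value;
                else if (qName.equals("date")) clip.date = value;
            }
            depth--;
        }
    }

    private static boolean isKnown(String name) {
        return name.equals("title") || name.equals("category") || name.equals("date");
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static Clip parse(String xml) {
        try {
            Handler handler = new Handler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.clip;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse clip XML", e);
        }
    }

    /** Writes the known fields first, then every preserved unknown element unchanged. */
    public static String toXml(Clip clip) {
        StringBuilder out = new StringBuilder("<data>");
        appendField(out, "title", clip.title);
        appendField(out, "date", clip.date);
        appendField(out, "category", clip.category);
        for (String extra : clip.extras) out.append(extra);
        return out.append("</data>").toString();
    }

    private static void appendField(StringBuilder out, String name, String value) {
        if (value != null) out.append('<').append(name).append('>').append(escape(value))
                              .append("</").append(name).append('>');
    }
}
```

Note that this sketch does not keep the original element order, comments, or whitespace between elements, and it ignores namespaces; whether that matters depends on what the other applications expect.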

Is this a common problem with a well-known solution?

- Update 5/22/12 -

I did not mention that the actual XML root node (actually called annotation) carries a version number, which is currently set to 1. What I am going to do in the short term is require that the version number supported by my application is >= the version number in the XML data. If the XML has a higher version number, I will still try to parse it for reading, but will refuse to save the model. I'm still interested in a working example of how to do this properly, though.
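The short-term version check could be sketched roughly as follows. This assumes the version is stored as a `version` attribute on the root element, which the post does not actually state; the class name `VersionGate` and the rule that a missing version also blocks saving are likewise assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class VersionGate {

    /** Highest document version this build of the application understands. */
    public static final int SUPPORTED_VERSION = 1;

    /** Returns the root element's version attribute, or -1 if it is absent (assumed layout). */
    public static int readVersion(String xml) {
        final int[] version = {-1};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        private boolean seenRoot;

                        @Override
                        public void startElement(String uri, String local, String qName, Attributes atts) {
                            if (seenRoot) return; // only the root element is inspected
                            seenRoot = true;
                            String value = atts.getValue("version");
                            if (value != null) version[0] = Integer.parseInt(value.trim());
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("Could not read document version", e);
        }
        return version[0];
    }

    /** Reading is always attempted; saving is allowed only for versions we fully understand. */
    public static boolean canSave(int documentVersion) {
        return documentVersion >= 0 && documentVersion <= SUPPORTED_VERSION;
    }
}
```

A caller would parse the clip as usual and simply disable the save action when `canSave(readVersion(xml))` is false.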

By the way, I thought of another solution that should be fairly simple. I figured I could use XPath to find the nodes that I know about and replace their contents when the data is updated. However, I ran some tests, and the overhead of parsing the XML into memory is absurd. Just the parsing operation, without even doing any lookups, was 20 times slower than SAX. Using XPath was 30-50 times slower overall, which is very bad considering that I am parsing these in a list. So my idea is to use SAX to parse the nodes into Clips, but to store the entire XML in a variable of the Clip class (remember, this XML is short, less than 2 KB). Then, when I am about to write the data back, I can use XPath to replace the nodes that I know about in the original XML.
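A minimal sketch of that write path, assuming the original XML string has been kept alongside the Clip and that the known fields are direct children of a `data` root as in the example above. The class name `XPathClipUpdater` and the map of field names to new values are hypothetical; only standard `javax.xml` classes are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XPathClipUpdater {

    /**
     * Re-parses the stored XML, replaces the text of each known field,
     * and serializes the whole document so unknown elements survive.
     */
    public static String update(String originalXml, Map<String, String> changedFields) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(originalXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            for (Map.Entry<String, String> field : changedFields.entrySet()) {
                Node node = (Node) xpath.evaluate("/data/" + field.getKey(), doc, XPathConstants.NODE);
                if (node == null) {
                    // Field was missing in the original document: append it to the root.
                    Element created = doc.createElement(field.getKey());
                    doc.getDocumentElement().appendChild(created);
                    node = created;
                }
                node.setTextContent(field.getValue());
            }

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not update clip XML", e);
        }
    }
}
```

The DOM and XPath cost is then paid only when a clip is saved, not while the list of clips is being read with SAX.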

However, I am still interested in any other solutions. I probably will not accept an answer, though, unless it includes some code examples.
+6
4 answers

Here is how you can do it with SAX filters:

  • When you read the document with SAX, you record all events. You record them and pass them on to the next SAX reader in the chain. You basically stack two layers of SAX readers (with XMLFilter): one records and relays, and the other is your current SAX handler that creates the objects.
  • When you are ready to write your changes to disk, you replay the recorded SAX events with your writer layered on top, which overwrites the values/nodes that you changed.

I spent some time on this idea and it worked. It basically came down to chaining the XMLFilters correctly. Here is what the unit test looks like; your code would do something similar:

    final SAXParserFactory factory = SAXParserFactory.newInstance();
    final SAXParser parser = factory.newSAXParser();

    final RecorderProxy recorder = new RecorderProxy(parser.getXMLReader());
    final ClipHolder clipHolder = new ClipHolder(recorder);

    clipHolder.parse(new InputSource(new StringReader(srcXml)));

    assertTrue(recorder.hasRecordingToReplay());

    final Clip clip = clipHolder.getClip();
    assertNotNull(clip);
    assertEquals(clip.title, "My Title");
    assertEquals(clip.category, "Blah!");
    assertEquals(clip.date, Clip.DATE_FORMAT.parse("12/24/2012"));

    clip.title = "My Title Updated";
    clip.category = "Something else";

    final ClipSerializer serializer = new ClipSerializer(recorder);
    serializer.setClip(clip);

    final TransformerFactory xsltFactory = TransformerFactory.newInstance();
    final Transformer t = xsltFactory.newTransformer();
    final StringWriter outXmlBuffer = new StringWriter();

    t.transform(new SAXSource(serializer, new InputSource()),
            new StreamResult(outXmlBuffer));

    assertEquals(targetXml, outXmlBuffer.getBuffer().toString());

The important points:

  • your SAX event recorder is wrapped around the SAX parser
  • your Clip parser ( ClipHolder ) is wrapped around the recorder
  • when the XML is parsed, the recorder records everything, and your ClipHolder only looks at what it knows about
  • you do whatever you need to do with the Clip object
  • the serializer is then wrapped around the recorder (basically re-mapping it onto itself)
  • you then work with the serializer, and it takes care of feeding the recorded events (delegating to the parent and registering itself as a ContentHandler ) overlaid with what it has to say about the Clip object.

The recorder code and the Clip test are on GitHub. Hope this helps.

P.S. This is not a general solution; the whole record → replay + overlay concept is very rudimentary in the implementation provided. It is mostly an illustration. If your XML is more complex and gets "hairy" (for example, the same element names at different levels, etc.), then the logic will need to be extended. However, the concept will remain the same.
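For readers without access to that code, here is an independent, much-simplified sketch of the record → replay + overlay idea. It is not the RecorderProxy or ClipSerializer from the answer: `RecordingFilter`, `replay` and `roundTrip` are hypothetical names, the overlay is just a map from element name to new text, and namespaces, prefix mappings, and same-named elements at different levels are not handled.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical recorder: relays SAX events downstream and keeps a copy for later replay. */
public class RecordingFilter extends XMLFilterImpl {

    private static final class Event {
        final char kind; // 'S' = start element, 'E' = end element, 'T' = text
        final String uri, local, qName, text;
        final Attributes atts;

        Event(char kind, String uri, String local, String qName, Attributes atts, String text) {
            this.kind = kind; this.uri = uri; this.local = local;
            this.qName = qName; this.atts = atts; this.text = text;
        }
    }

    private final List<Event> events = new ArrayList<>();

    public RecordingFilter(XMLReader parent) { super(parent); }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
        // Copy the attributes: parsers are allowed to reuse the Attributes object.
        events.add(new Event('S', uri, local, qName, new AttributesImpl(atts), null));
        super.startElement(uri, local, qName, atts); // relay to the object-building handler
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        events.add(new Event('E', uri, local, qName, null, null));
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        events.add(new Event('T', null, null, null, null, new String(ch, start, length)));
        super.characters(ch, start, length);
    }

    /** Replays the recording, substituting new text for elements named in overrides. */
    public void replay(ContentHandler out, Map<String, String> overrides) throws SAXException {
        boolean skipText = false; // true while inside an element whose text was replaced
        out.startDocument();
        for (Event e : events) {
            if (e.kind == 'S') {
                out.startElement(e.uri, e.local, e.qName, e.atts);
                String value = overrides.get(e.qName);
                skipText = value != null;
                if (skipText) out.characters(value.toCharArray(), 0, value.length());
            } else if (e.kind == 'E') {
                skipText = false;
                out.endElement(e.uri, e.local, e.qName);
            } else if (!skipText) {
                out.characters(e.text.toCharArray(), 0, e.text.length());
            }
        }
        out.endDocument();
    }

    /** Parses xml through the recorder, then writes it back with the overrides applied. */
    public static String roundTrip(String xml, Map<String, String> overrides) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            RecordingFilter recorder = new RecordingFilter(factory.newSAXParser().getXMLReader());
            recorder.setContentHandler(new DefaultHandler()); // stand-in for the Clip-building handler
            recorder.parse(new InputSource(new StringReader(xml)));

            TransformerHandler writer =
                    ((SAXTransformerFactory) TransformerFactory.newInstance()).newTransformerHandler();
            writer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter buffer = new StringWriter();
            writer.setResult(new StreamResult(buffer));
            recorder.replay(writer, overrides);
            return buffer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Record/replay failed", e);
        }
    }
}
```

In a real application the `DefaultHandler` placeholder would be the existing SAX handler that builds the Clip, and the overrides would be derived from the modified Clip object before replaying.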

+1

You are correct that SAX is probably not the best option if you want to keep nodes that you did not "consume". You could still do it using some kind of "SAX store" that keeps the SAX events and replays them (there are several implementations of such a thing), but an API based on an object model will be much easier to use: you simply keep the complete object model and just update your own nodes.

Of course, you can use DOM, which is the standard, but you may also want to consider alternatives that provide easier access to the specific nodes you will use in an arbitrary data model. Among them, JDOM ( http://www.jdom.org/ ) and XOM ( http://www.xom.nu/ ) are interesting candidates.

+1

If you are not tied to a particular XML schema, you should consider doing something like this:

    <data>
        <element id="title"> myTitle </element>
        <element id="date"> 18/05/2012 </element>
        ...
    </data>

and then store all these elements in a single ArrayList. This way you won't lose any information, and you can still pick out the elements you want to show, edit, and so on.

0

Your assumption that XPath is 20 times slower than SAX parsing is flawed. SAX parsing is just a low-level tokenizer on top of which your processing logic is built, and that processing logic will require additional work of its own. XPath performance has a lot to do with the implementation. As far as I know, vtd-xml's XPath is at least an order of magnitude faster than DOM in general, and much better suited to heavy-duty XML processing. Below are some reference links:

http://sdiwc.us/digitlib/journal_paper.php?paper=00000582.pdf

Android - XPath evaluate very slow

0

Source: https://habr.com/ru/post/916068/

