Parsing a large XML file with SAX parser, the class becomes bloated and unreadable. How to fix it?

This is purely a matter of reading code; class performance is not a problem.

This is how I create this XMLHandler:

For every element that is related to the application, I have a boolean in'ElementName 'that I set to true or false depending on my location during parsing: problem: now I have a 10+ boolean declaration at the beginning of my class and he is getting bigger and bigger.

In my startElement element and in my endElement method, I have hundreds of lines

if (qName = "elementName") {
   ...
} else if (qName = "anotherElementName") {
   ...
}

with various parsing rules in them (if I am in this position in the XML file, do it, otherwise do it, etc.)

The coding of new parsing and debugging rules is becoming increasingly painful.

What are the best methods for encoding a sax parser and what can I do to make my code more readable?

+3
source share
3 answers

What are you using booleans for? To track nesting?

I recently implemented this using an enumeration for each element. The code works, but this is a rough approximation of this from the top of the head:

enum Element {
   // special markers:
   ROOT,
   DONT_CARE,

   // Element               tag                  parents
   RootElement(             "root"               ROOT),
   AnElement(               "anelement"),     // DONT_CARE
   AnotherElement(          "anotherelement"),// DONT_CARE
   AChild(                  "child",             AnElement),
   AnotherChild(            "child",             AnotherElement);

   Element() {...}
   Element(String tag, Element ... parents) {...}
}

class MySaxParser extends DefaultHandler {
    Map<Pair<Element, String>, Element> elementMap = buildElementMap();
    LinkedList<Element> nestingStack = new LinkedList<Element>();

    public void startElement(String namespaceURI, String sName, String qName, Attributes attrs) {
        Element parent = nestingStack.isEmpty() ? ROOT : nestingStack.lastElement();
        Element element = elementMap.get(pair(parent, sName));
        if (element == null)
            element = elementMap.get(DONT_CARE, sName);
        if (element == null)
            throw new IllegalStateException("I did not expect <" + sName + "> in this context");

        nestingStack.addLast(element);

        switch (element) {
        case RootElement: ... // Probably don't need cases for many elements at start unless we have attributes
        case AnElement: ...
        case AnotherElement: ...
        case AChild: ...
        case AnotherChild: ...
        default: // Most cases here. Generally nothing to do on startElement
        }
    }
    public void endElement(String namespaceURI, String sName, String qName) {
        // Similar to startElement() but most switch cases do something with the data.
        Element element = nestingStack.removeLast();
        if (!element.tag.equals(sName)) throw IllegalStateException();
        switch (element) {
           ...
        }
    }

    // Construct the structure map from the parent information.
    private Map<Pair<Element, String>, Element> buildElementMap() {
        Map<Pair<Element, String>, Element> result = new LinkedHashMap<Pair<Element, String>, Element>();
        for (Element element: Element.values()) {
            if (element.tag == null) continue;
            if (element.parents.length == 0)
                result.put(pair(DONT_CARE, element.tag), element);
            else for (Element parent: element.parents) {
                result.put(pair(parent, element.tag), element);
            }
        }
        return result;
    }
    // Convenience method to avoid the need for using "new Pair()" with verbose Type parameters 
    private <A,B> Pair<A,B> pair(A a, B b) {
        return new Pair<A, B>(a, b);
    }
    // A simple Pair class, just for completeness.  Better to use an existing implementation.
    private static class Pair<A,B> {
        final A a;
        final B b;
        Pair(A a, B b){ this.a = a; this.b = b;}
        public boolean equals(Object o) {...};
        public int hashCode() {...};
    }
}

Edit:
XML . startElement, Element 1) 2) , sName, , , Element. Pair 2 .

-, XML , Element. :

<root>
  <anelement>
    <child>Data pertaining to child of anelement</child>
  </anelement>      
  <anotherelement>
    <child>Data pertaining to child of anotherelement</child>
  </anotherelement>
</root>

, , , <child> . Element , .

+2

XML. ( ) "", :

interface Command {
   public void assemble(Attributes attr, MyStructure myStructure);
}
...

Map<String, Command> commands= new HashMap<String, Command>();
...
if(commands.contains(qName)) {
   commands.get(qname).assemble(attr, myStructur);
} else {
   //unknown qName
}
0

I would give up on JAXB or something similar and let the framework do the work.

0
source

Source: https://habr.com/ru/post/1761697/


All Articles