Problems getting XML node text in a StAX XMLStreamConstants.CHARACTERS event

When reading an XML file using StAX and XMLStreamReader, I ran into a strange problem. I'm not sure whether this is a bug or whether I am doing something wrong; I'm still learning StAX.

The problem is this:

  • In the XMLStreamConstants.CHARACTERS event, I read the text node with XMLStreamReader.getText().
  • If the text node contains an &, < or > character, or perhaps some hidden character, getText() returns only the first part of the text. For example, ABC & XYZ returns only ABC.
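The symptom can be isolated with a minimal sketch like the one below. It assumes the JDK's built-in StAX implementation with a default factory; the class name CharactersChunks and the sample XML are made up for illustration. It records the text reported by each CHARACTERS event separately instead of keeping only the first one:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of the original code.
class CharactersChunks {
    // Returns the text reported by each CHARACTERS event, in document order.
    static List<String> readChunks(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> chunks = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    chunks.add(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return chunks;
    }

    public static void main(String[] args) throws XMLStreamException {
        List<String> chunks = readChunks("<source>ABC &amp; XYZ</source>");
        System.out.println(chunks.size() + " chunk(s): " + chunks);
        System.out.println("joined: " + String.join("", chunks));
    }
}
```

How many chunks are reported may depend on the parser, but joining them should give the full ABC & XYZ, which suggests the text is not lost, only spread over more than one event.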

Simplified Java source:

    // Start StaX reader
    XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
    try {
        XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(inStream);
        int event = xmlStreamReader.getEventType();
        while (true) {
            switch (event) {
                case XMLStreamConstants.START_ELEMENT:
                    switch (xmlStreamReader.getLocalName()) {
                        case "group":
                            // Do something
                            break;
                        case "source":
                            isSource = true;
                            break;
                        case "target":
                            isTrans = true;
                            break;
                        default:
                            isSource = false;
                            isTrans = false;
                            break;
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                    if (srcData != null) {
                        String srcTrns = xmlStreamReader.getText();
                        if (srcTrns != null) {
                            if (isSource) {
                                // Set source text
                                isSource = false;
                            } else if (isTrans) {
                                // Set target text
                                isTrans = false;
                            }
                        }
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if (xmlStreamReader.getLocalName().equals("group")) {
                        // Add to return list
                    }
                    break;
            }
            if (!xmlStreamReader.hasNext()) {
                break;
            }
            event = xmlStreamReader.next();
        }
    } catch (XMLStreamException ex) {
        LOG.log(Level.WARNING, ex.getMessage(), MessageFormat.format("{0} {1}", ex.getCause(), ex.getLocation()));
    }

I'm not quite sure what exactly I am doing wrong or how to assemble the full node text.

Any suggestions or tips would be a great help for further study of StAX. :-)


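A StAX parser is allowed to report the text of a single element as several CHARACTERS events, and it commonly does so around entity references such as &amp;. getText() then returns only the chunk that belongs to the current event.

To get the whole text in one event, set the XMLInputFactory property IS_COALESCING to true before creating the reader:

    xmlInputFactory.setProperty(XMLInputFactory.IS_COALESCING, true);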
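As a rough illustration of the effect of IS_COALESCING, the sketch below parses the same snippet with the property off and on. The class name CoalescingDemo and the sample XML are hypothetical; only the standard javax.xml.stream API is used:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the original code.
class CoalescingDemo {
    // Collects the text of every CHARACTERS event; 'coalesce' toggles IS_COALESCING.
    static List<String> readChunks(String xml, boolean coalesce) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // The property has to be set before the reader is created.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalesce);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> chunks = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    chunks.add(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return chunks;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<source>ABC &amp; XYZ</source>";
        System.out.println("not coalescing: " + readChunks(xml, false));
        System.out.println("coalescing:     " + readChunks(xml, true));
    }
}
```

With coalescing enabled the element text should arrive as a single CHARACTERS event. If changing the factory is not an option, appending each getText() result to a StringBuilder until the matching END_ELEMENT is reached is another way to assemble the full text.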


Source: https://habr.com/ru/post/1534393/

