How to get the tag name for a TEXT_NODE in Java's org.w3c.dom.Node

The documentation for this interface states that text nodes return "#text" as their name instead of the actual tag name. But for what I am doing, I need the tag name.

// I'm using the following imports
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// In the .xml input file
<country>US</country>

// This is a "text node"
.getTextContent() // returns "US", I need "country"
.getNodeName()    // only returns "#text"

How do I access the tag name? It should be possible somehow; I'm not opposed to a hacky solution.

Docs:

http://www.w3schools.com/dom/dom_nodetype.asp

http://www.w3.org/2003/01/dom2-javadoc/org/w3c/dom/Node.html

Thanks.

1 answer

I think you misunderstood which nodes are involved. This XML:

 <country>US</country> 

... contains two nodes:

  • An element node named country
  • A text node whose content is US

The element is not a text node, and a text node has no element name, because it is not an element. It is important to understand that these are separate nodes; I believe that is the source of your confusion.

If you are holding the text node, you can call node.getParentNode().getNodeName() to get the name of the enclosing element. Conversely, from the element node you can call getTextContent() to get its text.
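For illustration, here is a minimal sketch of both directions. It assumes the XML comes from an in-memory string rather than your input file. The class name TextNodeParentDemo and the helpers parse and enclosingElementName are made up for this example; only the org.w3c.dom and javax.xml.parsers calls are standard API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextNodeParentDemo {

    // Hypothetical helper: parses an XML string into a DOM Document.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: for a text node, returns the name of the
    // element that contains it; for any other node, returns its own name.
    public static String enclosingElementName(Node node) {
        Node parent = node.getParentNode();
        if (node.getNodeType() == Node.TEXT_NODE && parent != null) {
            return parent.getNodeName();
        }
        return node.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<country>US</country>");

        Node element = doc.getDocumentElement(); // the <country> element
        Node text = element.getFirstChild();     // the text node inside it

        System.out.println(text.getNodeName());          // #text
        System.out.println(enclosingElementName(text));  // country
        System.out.println(element.getTextContent());    // US
    }
}
```

If you already have the element node, there is no need to go through the text node at all: getNodeName() on the element gives "country" and getTextContent() gives "US".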


Source: https://habr.com/ru/post/950760/

