Java DOM: how to get the number of children

I have an XML document:

    <entities xmlns="urn:yahoo:cap">
      <entity score="0.988">
        <text end="4" endchar="4" start="0" startchar="0">Messi</text>
        <wiki_url>http://en.wikipedia.com/wiki/Lionel_Messi</wiki_url>
        <types>
          <type region="us">/person</type>
        </types>
      </entity>
    </entities>

I have a TreeMap<String,String> data that stores the getTextContent() of the "text" and "wiki_url" elements. Some "entity" elements have only a "text" child (no "wiki_url"), so I need to tell when an entity has only a text element and when it also has a "wiki_url". I could use document.getElementsByTagName("text") and document.getElementsByTagName("wiki_url"), but then I would lose the connection between the text and the URL.

I am trying to get the number of elements in an "entity" element using:

    NodeList entities = document.getElementsByTagName("entity"); // List of all the entity nodes
    int nchild; // Number of children
    System.out.println("Number of entities: " + entities.getLength()); // Prints 1 as expected
    nchild = entities.item(0).getChildNodes().getLength(); // Returns 7

However, as shown above, this returns 7, which I don't understand: I would expect 3, or 4 if you include the grandchild. I was then going to use the number of children to loop through them all, check whether getNodeName().equals("wiki_url"), and save it to data if so.

Why do I get 7 as the number of children when I can only count 3 children and 1 grandchild?

1 answer

The whitespace following the > of <entity score="0.988"> also counts as a node, and the line breaks between the other tags are likewise parsed as text nodes. In your document that makes three element nodes plus four whitespace-only text nodes, hence 7. If you are interested in a specific child node by name, add a helper method like the one below and call it wherever you need it.

    // Returns the first node in the list with the given name, or null if there is none.
    Node getChild(final NodeList list, final String name) {
        for (int i = 0; i < list.getLength(); i++) {
            final Node node = list.item(i);
            if (name.equals(node.getNodeName())) {
                return node;
            }
        }
        return null;
    }

and call it like this:

    final NodeList childNodes = entities.item(0).getChildNodes();
    final Node textNode = getChild(childNodes, "text");
    final Node wikiUrlNode = getChild(childNodes, "wiki_url");

When working with the DOM, small helper methods like the one above are a common way to simplify the basic processing logic.
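To make the count concrete, here is a minimal sketch that parses the XML from the question (assuming it is laid out with line breaks between the tags), prints the raw child count next to the element-only count, and then fills the map per entity so the text and the URL stay connected. The class name EntityChildren and the methods parse, countElementChildren, childText and collect are hypothetical names chosen for this illustration, and mapping text to URL (with null for a missing URL) is only an assumption about how data is meant to be used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class EntityChildren {

    // The document from the question, assumed to have line breaks between tags.
    static final String XML =
        "<entities xmlns=\"urn:yahoo:cap\">\n"
        + "  <entity score=\"0.988\">\n"
        + "    <text end=\"4\" endchar=\"4\" start=\"0\" startchar=\"0\">Messi</text>\n"
        + "    <wiki_url>http://en.wikipedia.com/wiki/Lionel_Messi</wiki_url>\n"
        + "    <types>\n"
        + "      <type region=\"us\">/person</type>\n"
        + "    </types>\n"
        + "  </entity>\n"
        + "</entities>";

    // Parses an XML string with the default (non-validating) DOM parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Counts only the children that are elements, skipping whitespace text nodes.
    static int countElementChildren(Node parent) {
        int count = 0;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    // Returns the text content of the first child element with the given name, or null.
    static String childText(Node parent, String name) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE && name.equals(c.getNodeName())) {
                return c.getTextContent();
            }
        }
        return null;
    }

    // Assumption: data maps each entity's text to its wiki_url (null when absent).
    static Map<String, String> collect(Document document) {
        Map<String, String> data = new TreeMap<>();
        NodeList entities = document.getElementsByTagName("entity");
        for (int i = 0; i < entities.getLength(); i++) {
            Node entity = entities.item(i);
            String text = childText(entity, "text");
            if (text != null) {
                data.put(text, childText(entity, "wiki_url"));
            }
        }
        return data;
    }

    public static void main(String[] args) {
        Document document = parse(XML);
        Node entity = document.getElementsByTagName("entity").item(0);
        System.out.println("All child nodes: " + entity.getChildNodes().getLength()); // 7
        System.out.println("Element children: " + countElementChildren(entity));      // 3
        System.out.println(collect(document));
    }
}
```

Checking getNodeType() against Node.ELEMENT_NODE is what filters out the whitespace; looking up both children from the same entity node is what keeps each text paired with its own URL.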


Source: https://habr.com/ru/post/1487905/

