XML parsing issue

I read text from an XML file:

URL url_Twitter = new URL("http://twitter.com/statuses/user_timelineID_PROVA.rss");
HttpURLConnection conn_Twitter = (HttpURLConnection) url_Twitter.openConnection();

DocumentBuilderFactory documentBF_Twitter = DocumentBuilderFactory.newInstance();
DocumentBuilder documentB_Twitter = documentBF_Twitter.newDocumentBuilder();
Document document_Twitter = documentB_Twitter.parse(conn_Twitter.getInputStream());

The XML contains character references such as &#8217;, so when I call

document_Twitter.getElementsByTagName("title").item(2).getFirstChild().getNodeValue()

the returned string is truncated just before the first such character.

The whole text sits inside a single tag:

  <item>
    <title>SMWRME: Internet per &#8220;Collaborare senza confini&#8221;. Soprattutto alla SMW di Roma, dal 7 all'11 febbraio. Ecco il terzo percorso. http://cot.ag/ewnJ4F</title>
    <description>SMWRME: Internet per &#8220;Collaborare senza confini&#8221;. Soprattutto alla SMW di Roma, dal 7 all'11 febbraio. Ecco il terzo percorso. http://cot.ag/ewnJ4F</description>
    <pubDate>Mon, 27 Dec 2010 20:05:01 +0000</pubDate>
    <guid>http://twitter.com/SMWRME/statuses/19483914259140609</guid>
    <link>http://twitter.com/SMWRME/statuses/19483914259140609</link>
    <twitter:source>&lt;a href=&quot;http://cotweet.com/?utm_source=sp1&quot; rel=&quot;nofollow&quot;&gt;CoTweet&lt;/a&gt;</twitter:source>
    <twitter:place/>
  </item>

I noticed that this only happens in an Android app. The same code works fine in a desktop Java application. Can anybody help me?

1 answer

Can you try document_Twitter.getElementsByTagName("title").item(2).getTextContent() instead? There may in fact be several text nodes below this node, for example:

- "item" element
  - "title" element
    - text node "SMWRME: Internet per "
    - text node "&#8220;"
    - text node "Collaborare senza confini"
    - text node "&#8221;"

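With SAX parsing it is normal to receive several text events for a single piece of text, but it is a little unusual for a DOM parser to split text this way. getTextContent returns the text of all descendant nodes, concatenated.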
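The difference between reading only the first child and reading the whole text content can be sketched as below. This is a minimal illustration, not the asker's code: the class name TitleTextDemo and the shortened sample XML are made up, and a CDATA section is used only to force several child nodes on a desktop JVM, imitating the split that was observed on Android.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class TitleTextDemo {
    // Hypothetical sample: the CDATA section makes the parser create
    // several child nodes under <title> (text, CDATA, text).
    static final String XML =
        "<item><title>Internet per <![CDATA[\u201CCollaborare senza confini\u201D]]>. Ecco</title></item>";

    // Parses the string and returns the first <title> element.
    static Node firstTitle(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("title").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    // Reads only the first child node, so later text nodes are ignored.
    static String firstChildText(String xml) {
        return firstTitle(xml).getFirstChild().getNodeValue();
    }

    // Concatenates the text of all descendant nodes.
    static String fullText(String xml) {
        return firstTitle(xml).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("first child only: " + firstChildText(XML));
        System.out.println("getTextContent:   " + fullText(XML));
    }
}
```

On Android, getTextContent() requires API level 8 or later; on older versions the child nodes would have to be concatenated by hand.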

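You could also try calling setCoalescing(true) on the DocumentBuilderFactory before creating the DocumentBuilder. It is meant to convert CDATA sections to text and merge them with adjacent text nodes, but it may solve this problem as well.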
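A small sketch of the coalescing option follows. The class name CoalescingDemo and the sample XML are hypothetical, and the example only demonstrates the documented CDATA merging on a desktop JVM. Whether the same flag also merges the nodes split around character references on Android is an assumption that would need to be checked on a device.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CoalescingDemo {
    // Hypothetical sample with a CDATA section between two pieces of text.
    static final String XML =
        "<item><title>Internet per <![CDATA[\u201CCollaborare senza confini\u201D]]>. Ecco</title></item>";

    // Returns the value of the first child node of <title>,
    // with the coalescing flag set as requested.
    static String firstChildText(String xml, boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Must be set before newDocumentBuilder() is called.
            factory.setCoalescing(coalescing);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("title").item(0)
                .getFirstChild().getNodeValue();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("coalescing off: " + firstChildText(XML, false));
        System.out.println("coalescing on:  " + firstChildText(XML, true));
    }
}
```

With coalescing enabled the original getFirstChild().getNodeValue() call can stay unchanged, which may be convenient if getTextContent() is not available on the target platform.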


Source: https://habr.com/ru/post/1786625/

