I am parsing XML documents and calling getTextContent() to get the text from the specific section that I want. The text that I get back contains tags like
<italic> </italic> <sub> </sub>
and a few more. I want to strip these tags and keep only the text, no matter what the tags are.
My document is as follows:

<article>
  <sec>Section 1</sec>
  <sec>Section 2
    <title>Title1</title>
    <sec>
      <title>Subtitle1</title>
      <p>........<italic> </italic>...</p>
    </sec>
    <sec>
      <title>Subtitle2</title>
      <p>........<sub> </sub>...</p>
    </sec>
  </sec>
</article>
I need all the text in <p>...</p> without any tags in it. How can I do this? I was thinking about identifying all the tags and replacing them with "", but there must be a better way.
thanks
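
For reference, here is a minimal sketch of the kind of thing I am after, using only the standard DOM API. It assumes the inline tags are real child elements of <p> (not escaped text or CDATA), in which case getTextContent() on the <p> element should already return the concatenated text of all its descendants. The class name ParagraphText, the helper paragraphTexts, and the sample XML string are hypothetical and only there for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ParagraphText {

    // Hypothetical helper: returns the plain text of every <p> element.
    // Assumes inline tags such as <italic> and <sub> are real child elements.
    public static List<String> paragraphTexts(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Finds <p> elements at any depth, including nested <sec> blocks.
        NodeList paragraphs = doc.getElementsByTagName("p");

        List<String> texts = new ArrayList<>();
        for (int i = 0; i < paragraphs.getLength(); i++) {
            // getTextContent() concatenates the text of all descendant nodes,
            // so the element markup itself is not part of the result.
            texts.add(paragraphs.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input, shaped like the document above.
        String xml = "<article><sec><title>Subtitle1</title>"
                + "<p>H<sub>2</sub>O is <italic>water</italic>.</p></sec></article>";
        for (String text : paragraphTexts(xml)) {
            System.out.println(text); // prints "H2O is water."
        }
    }
}
```

If the tags still appear in the output of something like this, they are probably stored as escaped text (&lt;italic&gt;) or inside CDATA rather than as elements, and would need different handling.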