You can use jsoup to clean the HTML.
String cleaned = Jsoup.clean(html, Whitelist.relaxed());
You can use one of the predefined whitelists or build your own, specifying which HTML elements and attributes the cleaner should let through. Every other tag is stripped, but the text inside a stripped tag is kept.
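If you want to build your own list instead of starting from relaxed(), it could look roughly like the sketch below. It assumes jsoup is on the classpath and is recent enough that Whitelist has been renamed to Safelist (1.14 onward, as far as I know); on older versions the same calls should work on Whitelist. The class name, the helper cleanWithCustomList, and the sample input are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class CustomSafelistDemo {

    // Hypothetical helper: allows only <b>, <i>, and <div class="...">.
    // Any other tag is removed; its text content is kept.
    static String cleanWithCustomList(String html) {
        Safelist custom = new Safelist()
                .addTags("b", "i", "div")        // elements allowed through
                .addAttributes("div", "class");  // attributes allowed on <div>
        return Jsoup.clean(html, custom);
    }

    public static void main(String[] args) {
        // Sample input (an assumption, not from the question)
        String html = "<p>Hello <b>world</b><script>alert(1)</script></p>";
        System.out.println(cleanWithCustomList(html));
    }
}
```

Starting from an empty list and adding only what you need is the stricter option; starting from relaxed() and adding attributes, as in the example for your input, is more convenient when you want to keep most ordinary formatting.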
Your specific example:
String html = "one two three <blabla> four <text> five <div class=\"bold\">six</div>";
String cleaned = Jsoup.clean(html, Whitelist.relaxed().addAttributes("div", "class"));
System.out.println(cleaned);
Output:
one two three four five <div class="bold"> six </div>