Regex for HTML with java.util.regex
I need a regex for the following HTML:
<div xmlns="http://www.w3.org/1999/xhtml"> <p/>
<p/><p/> <p/>
</div>
This comes from a rich-text field and is obviously not meaningful content; in other words, it is empty. In Java I cannot simply write if (richTextContent == null || richTextContent.length() == 0), because the rich-text field does contain something. Semantically, though, the content above is empty, so I thought about using a regex. I need a java.util.regex pattern that matches this snippet.
If the fragment has something meaningful:
<div xmlns="http://www.w3.org/1999/xhtml"> text<p/>
<p/><p/>text <p/>
</div>
then the regex should not match.
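Use an HTML parser such as Jsoup rather than a regex. Its text() method returns only the normalized, trimmed text content, so a fragment made up of nothing but tags and whitespace yields an empty string.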
import org.jsoup.Jsoup;

String html1 = "<div xmlns=\"http://www.w3.org/1999/xhtml\"> <p/> <p/><p/> <p/></div>";
String html2 = "<div xmlns=\"http://www.w3.org/1999/xhtml\"> text<p/> <p/><p/>text <p/> </div>";
System.out.println(Jsoup.parse(html1).text().isEmpty()); // true
System.out.println(Jsoup.parse(html2).text().isEmpty()); // false
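If adding a dependency really is not an option, here is a rough java.util.regex sketch for the narrow case in the question. It assumes the fragment is simple markup with no '>' inside attribute values, no comments and no CDATA sections, and it treats entities such as &nbsp; as content. The class name RichTextCheck and the method isSemanticallyEmpty are made up for illustration. A parser is still the more robust choice.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class RichTextCheck {

    // Whitespace, then any number of "<...>" tags each followed by whitespace.
    // Possessive quantifiers (*+, ++) prevent backtracking on long inputs.
    // Assumption: tags never contain a literal '>' inside an attribute value.
    private static final Pattern ONLY_TAGS_AND_WHITESPACE =
            Pattern.compile("\\s*+(?:<[^>]++>\\s*+)*+");

    // Returns true when the fragment is null or holds nothing but tags and whitespace.
    static boolean isSemanticallyEmpty(String richTextContent) {
        if (richTextContent == null) {
            return true;
        }
        // matches() requires the whole input to fit the pattern.
        return ONLY_TAGS_AND_WHITESPACE.matcher(richTextContent).matches();
    }

    public static void main(String[] args) {
        String html1 = "<div xmlns=\"http://www.w3.org/1999/xhtml\"> <p/>\n<p/><p/> <p/>\n</div>";
        String html2 = "<div xmlns=\"http://www.w3.org/1999/xhtml\"> text<p/>\n<p/><p/>text <p/>\n</div>";
        System.out.println(isSemanticallyEmpty(html1)); // true
        System.out.println(isSemanticallyEmpty(html2)); // false
    }
}
```

Any character outside a tag that is not whitespace makes matches() fail, which is why the second fragment is reported as non-empty.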