
How to remove non-breaking spaces (&nbsp;) from a JSoup document?

How can I remove cells like these:

<td>&nbsp;</td> 

or

 <td width="7%">&nbsp;</td> 

from my JSoup document? I have tried many approaches, but these non-breaking space characters do not match any of the usual JSoup expressions or selectors.

+6
1 answer

The HTML entity &nbsp; (Unicode character NO-BREAK SPACE, U+00A0) can be represented in Java by the character \u00a0. Assuming you want to remove every element that contains this character in its own text (and therefore not every line, as you said in a comment), the following should work:

 document.select(":containsOwn(\u00a0)").remove(); 

If you really want to remove the whole line, your best bet is to scan the HTML yourself, line by line.
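
A line-by-line scan might look roughly like the sketch below. It uses only the Java standard library and assumes that each empty cell sits on a line of its own, as in the question. The class name NbspLineFilter and the method removeEmptyCellLines are made up for illustration and are not part of JSoup.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class NbspLineFilter {
    // A line that holds a single <td> cell (with or without attributes) whose
    // only content is one or more non-breaking spaces, written either as the
    // &nbsp; entity or as the raw U+00A0 character.
    private static final Pattern EMPTY_CELL_LINE = Pattern.compile(
            "^\\s*<td\\b[^>]*>\\s*(?:(?:&nbsp;|\\u00a0)\\s*)+</td>\\s*$",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: drops every line that matches the pattern above
    // and joins the remaining lines with "\n".
    static String removeEmptyCellLines(String html) {
        return Arrays.stream(html.split("\\R"))
                .filter(line -> !EMPTY_CELL_LINE.matcher(line).matches())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String html = String.join("\n",
                "<tr>",
                "<td>&nbsp;</td>",
                " <td width=\"7%\">&nbsp;</td>",
                "<td>kept</td>",
                "</tr>");
        // Only the <tr>, <td>kept</td> and </tr> lines remain.
        System.out.println(removeEmptyCellLines(html));
    }
}
```

This works on the raw HTML string before it is parsed, so it only helps if the markup is formatted one cell per line. For markup that is not laid out that way, the selector-based removal above is likely the more robust choice.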

+12

Source: https://habr.com/ru/post/894892/

