Removing script elements with jsoup when loading from a URL

I want to remove the script elements when reading a page from a URL rather than from a file. Here is what I tried:

    Document connect = Jsoup.connect("http://www.tutorialspoint.com/ant/ant_deploying_applications.htm");
    Elements selects = connect.select("div.middle-col");
    System.out.println(selects.removeAttr("script").html());
2 answers

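removeAttr("script") removes an attribute named script, not the script elements themselves. Call get() to fetch and parse the page, then select the script elements inside the container and remove them: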

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class TestJsoup {
    public static void main(String[] args) throws IOException {
        // connect() only builds the request; get() fetches and parses the page
        Document doc = Jsoup.connect("http://www.tutorialspoint.com/ant/ant_deploying_applications.htm").get();

        Elements selects = doc.select("div.middle-col");
        for (Element column : selects) {
            // remove every script element nested inside this container
            column.select("script").remove();
        }
        System.out.println(selects.html());
    }
}
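
The same removal can probably be written with a single descendant selector instead of a loop. The sketch below is only an illustration: it parses an inline HTML string (made up for this example) with Jsoup.parse so that it runs without network access, and the class and method names are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class RemoveScriptsSketch {
    // Returns the inner HTML of every div.middle-col with its script elements removed.
    static String withoutScripts(String html) {
        Document doc = Jsoup.parse(html);
        // Descendant selector: every script element nested anywhere inside div.middle-col
        doc.select("div.middle-col script").remove();
        return doc.select("div.middle-col").html();
    }

    public static void main(String[] args) {
        // Sample markup invented for the example, not taken from the real page
        String html = "<div class=\"middle-col\"><p>Hello</p><script>alert(1);</script></div>";
        System.out.println(withoutScripts(html));
    }
}
```

With a live page, the only change should be to build the Document with Jsoup.connect(url).get() instead of Jsoup.parse(html).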

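Alternatively, you can use Jsoup.clean(html, whitelist), which keeps only the tags and attributes on the whitelist and drops everything else, including script elements. In newer jsoup releases the Whitelist class is named Safelist.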
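
A minimal sketch of that approach, assuming a jsoup version that ships org.jsoup.safety.Safelist (on older versions, Whitelist.basic() should be the equivalent call). The class name, method name, and sample markup are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

class CleanSketch {
    // Keeps only the simple text-formatting tags allowed by Safelist.basic();
    // script elements are not on that list, so they are dropped.
    static String sanitize(String html) {
        return Jsoup.clean(html, Safelist.basic());
    }

    public static void main(String[] args) {
        // Sample markup invented for the example
        String html = "<div class=\"middle-col\"><p>Hello</p><script>alert(1);</script></div>";
        System.out.println(sanitize(html));
    }
}
```

Note that clean() is much broader than select("script").remove(): it also strips every other tag and attribute that the chosen list does not allow (the div wrapper here, for instance), so it fits best when the goal is sanitized output rather than the original markup minus scripts.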


Source: https://habr.com/ru/post/1611223/

