How to search an HTML file for certain tags?

I have a problem in Java. I want to find the href and src attributes in an HTML file, and then get the URLs associated with them.

What is the best way to do this?

Thanks for the help. Regards.

+3
7 answers

This is the code that I used to do exactly what you would like to do, but first let me give you some tips.

Java Swing includes an HTML parser in the javax.swing.text.html and javax.swing.text.html.parser packages. It is the parser behind JEditorPane, so it needs no extra libraries.

The Java 6 API documents the HTML.Tag class, nested in HTML, which lists the tags the parser recognizes: http://java.sun.com/javase/6/docs/api/javax/swing/text/html/HTML.Tag.html

To process a document, override these 3 methods of HTMLEditorKit.ParserCallback:

public void handleStartTag(HTML.Tag t, MutableAttributeSet atts, int pos)
public void handleEndTag(HTML.Tag t, int pos)
public void handleText(char[] text, int pos)

In these callbacks you can check each tag's attributes for a URL and collect the URLs you find.
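The callback approach described above could look roughly like the sketch below. It is not the answerer's original code: the class name LinkExtractor and the method extractUrls are made up for illustration, and the extra handleSimpleTag override is an addition, included on the assumption that you also want src values from tags such as <img> that have no closing tag.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named for this example only.
public class LinkExtractor {

    /** Returns every href and src attribute value found in the given HTML. */
    public static List<String> extractUrls(String html) {
        final List<String> urls = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet atts, int pos) {
                collect(atts);
            }

            // Tags without a closing tag, such as <img>, are reported here.
            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet atts, int pos) {
                collect(atts);
            }

            private void collect(MutableAttributeSet atts) {
                Object href = atts.getAttribute(HTML.Attribute.HREF);
                if (href != null) {
                    urls.add(href.toString());
                }
                Object src = atts.getAttribute(HTML.Attribute.SRC);
                if (src != null) {
                    urls.add(src.toString());
                }
            }
        };

        try (Reader reader = new StringReader(html)) {
            // true = ignore any charset declared inside the document
            new ParserDelegator().parse(reader, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"http://example.com/\">x</a>"
                + "<img src=\"pic.png\"></body></html>";
        System.out.println(extractUrls(html));
    }
}
```

To read from a file instead of a string, pass a FileReader (or any other Reader) to parse in place of the StringReader.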

If you want to follow URLs, use a JEditorPane. Implement javax.swing.event.HyperlinkListener and, in hyperlinkUpdate(HyperlinkEvent e), load the URL by calling .setPage(e.getURL()) on the JEditorPane.

+1

If the file is XHTML, you can treat it as XML, and the best tool for that is JDOM.

For plain HTML files, use htmlparser, which has a LinkTag class.

0

Another option is JTidy.

0

You could use Rhino to load the HTML page, then use getElementBy lookups to pick out the nodes you need.

0

Use TagSoup to obtain a DOM of the HTML, even if the markup is not well-formed.

Then use XPath to get a NodeList, for example:

//

//IMG
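As a rough illustration of the XPath step, the sketch below evaluates //IMG with the JDK's javax.xml.xpath API and reads each src value. The class and method names are invented for the example. The TagSoup step is replaced by the JDK's own DocumentBuilder, which only accepts well-formed markup, so the sample input is an assumption chosen to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, named for this example only.
public class XPathSrcDemo {

    /** Evaluates //IMG on the document and returns each element's src value. */
    public static List<String> imageSources(Document doc) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate("//IMG", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("src"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** Stand-in for the TagSoup step: parses markup that is already well-formed. */
    public static Document parseWellFormed(String markup) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseWellFormed(
                "<HTML><BODY><IMG src=\"a.png\"/><P><IMG src=\"b.png\"/></P></BODY></HTML>");
        System.out.println(imageSources(doc));
    }
}
```

Element names in XPath are case-sensitive, so //IMG only matches if the DOM you get from your parser uses upper-case element names.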

0

Try the Neko HTML Parser:

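import org.cyberneko.html.parsers.DOMParser;
import org.w3c.dom.Node;

public class TestParser {

     public static void main(String[] argv) throws Exception {
          DOMParser parser = new DOMParser();
          // The original snippet is cut off after "i"; the rest of the loop
          // is an assumed completion that parses each command-line argument.
          for (int i = 0; i < argv.length; i++) {
               parser.parse(argv[i]);
               Node root = parser.getDocument();
               System.out.println(root.getNodeName());
          }
     }
}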
0

Source: https://habr.com/ru/post/1705168/

