Creating a DOM document using TagSoup

I can't get TagSoup to work. I use the following code, but when I print the Node returned by the parser (the line with System.err.println(doc);), I always get "[#document: null]".

I don't know how to find the error in this code or, if the code is not at fault, where the problem comes from. Please help!

public final Document parseDOM(final File fileToParse) {
  Parser p = new Parser();
  SAX2DOM sax2dom = null;
  org.w3c.dom.Node doc  = null;

  try { 

        URL url = new URL("http://stackoverflow.com/");
        p.setFeature(Parser.namespacesFeature, false);
        p.setFeature(Parser.namespacePrefixesFeature, false);
        sax2dom = new SAX2DOM();
        p.setContentHandler(sax2dom);
        p.parse(new InputSource(new InputStreamReader(url.openStream())));
        doc = sax2dom.getDOM();
        System.err.println(doc);
  } catch (Exception e) {
     // TODO handle exception
     e.printStackTrace();
  }


  return doc.getOwnerDocument();
 }
+3
2 answers

From the documentation for getOwnerDocument:

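When this node is a Document or a DocumentType which is not used with any Document yet, this is null.

getDOM presumably returns the Document itself, so doc is already the Document. Cast doc to Document and return it instead of calling getOwnerDocument().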
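The behaviour quoted above can be checked without TagSoup. The following is a minimal sketch using only the JDK's built-in DOM implementation; the class name OwnerDocumentDemo and the helper asDocument are made up for illustration and are not part of TagSoup or SAX2DOM.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class OwnerDocumentDemo {

    // Hypothetical helper: returns the node itself when it already is a
    // Document, otherwise falls back to the node's owner document.
    static Document asDocument(Node node) {
        return (node instanceof Document) ? (Document) node : node.getOwnerDocument();
    }

    public static void main(String[] args) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element html = document.createElement("html");
        document.appendChild(html);

        Node doc = document; // same static type as the variable in the question

        // A Document node has no owner document, so this is null.
        System.out.println(doc.getOwnerDocument() == null);

        // An element inside the document does report its owner.
        System.out.println(html.getOwnerDocument() == document);

        // The helper returns a usable Document in both cases.
        System.out.println(asDocument(doc) == document);
        System.out.println(asDocument(html) == document);
    }
}
```

In the method from the question, the equivalent change would be to return (Document) doc, assuming getDOM does return a Document node as the printed "[#document: null]" suggests.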

+3

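Also, printing a node directly only shows its name and value, which is why you see "[#document: null]". To see the node as XML, serialize it: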

          Writer out = new StringWriter();
          XMLSerializer serializer = new XMLSerializer(out, new OutputFormat());
          serializer.serialize(doc);
          System.out.println(out.toString());
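XMLSerializer comes from Xerces (org.apache.xml.serialize) and may not be on every classpath. As an alternative, here is a minimal sketch that serializes a node with the JDK's standard javax.xml.transform API; the class name NodeToXml and the method toXml are made up for illustration, and the small document built in main only stands in for the one returned by the parser.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class NodeToXml {

    // Hypothetical helper: serializes any DOM node to a string using an
    // identity transform (a Transformer created without a stylesheet).
    static String toXml(Node node) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in document; in the question this would be the parsed page.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element p = doc.createElement("p");
        p.setTextContent("hello");
        doc.appendChild(p);

        System.out.println(toXml(doc));
    }
}
```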
+1

Source: https://habr.com/ru/post/1772693/
