HTML string reader

I need to load HTML and parse it. I think it should be something simple: I pass in a string containing HTML, and it reads the string into a DOM object so that I can search and parse the HTML content, which makes tasks like crawling easy.

Does anyone know of something like that?

Thanks

+4
2 answers

HTML Agility Pack

It has an API similar to XmlDocument. For example (from the examples page):

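  HtmlDocument doc = new HtmlDocument();
  doc.Load("file.htm");
  // Select every <a> element that has an href attribute.
  // DocumentNode is the root in current releases (older samples show DocumentElement).
  foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
  {
      HtmlAttribute att = link.Attributes["href"];
      att.Value = FixLink(att); // FixLink is a user-supplied helper, not part of the library
  }
  doc.Save("file.htm");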

(You can also use LoadHtml to load the HTML from a string rather than from a file path.)
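
Since the question asks about parsing a string rather than a file, here is a minimal sketch of the LoadHtml route. It assumes the HtmlAgilityPack NuGet package is referenced; the sample markup and the `Program` class are made up for illustration.

```csharp
using System;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

class Program
{
    static void Main()
    {
        // Hypothetical input; in practice this would be the page you downloaded.
        string html = "<html><body><a href=\"/a\">A</a><a href=\"/b\">B</a></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file path

        // SelectNodes returns null when nothing matches, so guard against it.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                // The second argument is the fallback used when the attribute is missing.
                Console.WriteLine(link.GetAttributeValue("href", string.Empty));
            }
        }
    }
}
```

The null check matters when crawling arbitrary pages, because a page with no matching elements would otherwise throw inside the foreach.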

+13

If you are working in a browser, you can use the HTML DOM Bridge: load the HTML into it and walk the resulting DOM tree.

+2

Source: https://habr.com/ru/post/1307620/