Get HTML table data

I have an HTML table (I did not create it myself; I only work with it) with many rows and several columns.

I want to get some of its data into a string for use as a tooltip. Right now I read the contents of the HTML file into a string and use string manipulation to extract the data I need.

This is probably a very bad idea, so I was wondering whether there is an API I could use to read the text from a specific row and column of a table in an HTML file (for example, column 2 of row 2). I would prefer not to depend on an external DLL, but I will use one if there is no other way.

Any ideas?

+4
3 answers

HTML Agility Pack

There are some good examples of using the Agility Pack.

Links courtesy of rtpHarry in this answer

An example from the CodePlex site showing how you could fix all hrefs in an HTML file using the HTML Agility Pack:

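  using HtmlAgilityPack;

  HtmlDocument doc = new HtmlDocument();
  doc.Load("file.htm");
  // SelectNodes returns null when nothing matches, so guard against
  // that if the file may contain no links.
  foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
  {
      // FixLink is a placeholder from the original example; supply your own.
      HtmlAttribute att = link.Attributes["href"];
      att.Value = FixLink(att);
  }
  doc.Save("file.htm");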
+6

One way is to use a library such as the Html Agility Pack to load the HTML document, and then use its DOM API or XPath to navigate to the desired node and read its content. This may get you started with the Agility Pack: How to use HTML Agility Pack
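
For the "column 2 of row 2" case from the question, a minimal sketch with the Html Agility Pack might look like the following. It assumes the HtmlAgilityPack NuGet package is referenced, that the table of interest is the first one in the document, and that it contains no nested tables. The class name, the GetCell helper, and the sample markup are made up for illustration.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET

static class HapTableReader
{
    // Hypothetical helper: returns the text of the cell at the given
    // 1-based row and column of the first table, or null if it is missing.
    public static string GetCell(string html, int row, int column)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Take the row-th <tr> of the first <table>, then the column-th
        // cell in that row, counting both <td> and <th> cells.
        string xpath = string.Format(
            "((//table)[1]//tr)[{0}]/*[self::td or self::th][{1}]",
            row, column);

        HtmlNode cell = doc.DocumentNode.SelectSingleNode(xpath);
        if (cell == null)
        {
            return null;
        }

        // InnerText keeps entities such as &amp; encoded, so decode them.
        return HtmlEntity.DeEntitize(cell.InnerText).Trim();
    }

    static void Main()
    {
        // Sample markup invented for this example.
        string html =
            "<table>" +
            "<tr><td>a</td><td>b</td></tr>" +
            "<tr><td>c</td><td>d &amp; e</td></tr>" +
            "</table>";

        Console.WriteLine(GetCell(html, 2, 2));
    }
}
```

To read from a file instead of a string, doc.Load("file.htm") can replace the LoadHtml call.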

Finally, if your HTML is XHTML (that is, well-formed XML), you can use the XML libraries that ship with .NET to do the manipulation.
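
As a rough sketch of the XML route, the following uses only System.Xml.Linq from the framework, so no external DLL is needed. It assumes the input is well-formed XML with no DOCTYPE and no HTML-only named entities such as &nbsp;, which a plain XML parser would reject. The class name, the GetCell helper, and the sample markup are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class XhtmlTableReader
{
    // Hypothetical helper: returns the text of the cell at the given
    // 1-based row and column of the first table, or null if it is missing.
    public static string GetCell(string xhtml, int row, int column)
    {
        XDocument doc = XDocument.Parse(xhtml);

        // Compare local names so the code works with or without the
        // XHTML namespace declared on the root element.
        XElement table = doc.Descendants()
            .FirstOrDefault(e => e.Name.LocalName == "table");
        if (table == null)
        {
            return null;
        }

        XElement tr = table.Descendants()
            .Where(e => e.Name.LocalName == "tr")
            .ElementAtOrDefault(row - 1);
        if (tr == null)
        {
            return null;
        }

        XElement cell = tr.Elements()
            .Where(e => e.Name.LocalName == "td" || e.Name.LocalName == "th")
            .ElementAtOrDefault(column - 1);

        return cell == null ? null : cell.Value.Trim();
    }

    static void Main()
    {
        // Sample markup invented for this example.
        string xhtml =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><table>" +
            "<tr><td>a</td><td>b</td></tr>" +
            "<tr><td>c</td><td>d</td></tr>" +
            "</table></body></html>";

        Console.WriteLine(GetCell(xhtml, 2, 2));
    }
}
```

If the file is ordinary HTML with unclosed tags, XDocument.Parse will throw, and an HTML-tolerant parser is the safer choice.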

+2

Actually, I think the approach you took is a fine idea.

It is most likely what I would do. There may be libraries for this, but they will do the same thing internally.

It would be better to get the data from the original source rather than parse it out of an HTML page. But if the page is all you have, then that is what you need to do.

Why do you think this is a bad idea?

0

Source: https://habr.com/ru/post/1332191/

