I am parsing HTML files with the Html Agility Pack to extract table data. Some of these files omit optional end tags or optional start tags, and the Html Agility Pack does not parse those pages correctly. If I open such a file in Notepad++ and run TextFX --> TextFX HTML Tidy --> Tidy: Clean document, the contents are tidied up. If I then parse the tidied file with the Html Agility Pack, it is parsed correctly.
Tidying the HTML page with Notepad++ is the best option I have found.
However, I cannot expect the user to tidy each page in Notepad++ first and only then continue. So what should I do?
EDIT: I tried the HTML Tidy package, but in some cases a file that it has processed is still not parsed correctly, whereas if I tidy the same page in Notepad++, it is parsed.