How to use the HTML Agility Pack to validate HTML

I am using the HTML Agility Pack to validate my HTML. This is the code I use:

    using System.Collections.Generic;

    public class MarkupErrors
    {
        public string ErrorCode { get; set; }
        public string ErrorReason { get; set; }
    }

    public static List<MarkupErrors> IsMarkupValid(string html)
    {
        var document = new HtmlAgilityPack.HtmlDocument();
        document.OptionFixNestedTags = true;
        document.LoadHtml(html);

        // Copy every parser error into a simple DTO.
        var parserErrors = new List<MarkupErrors>();
        foreach (var error in document.ParseErrors)
        {
            parserErrors.Add(new MarkupErrors
            {
                ErrorCode = error.Code.ToString(),
                ErrorReason = error.Reason
            });
        }

        return parserErrors;
    }

So my input is something like the one below:

    <h1>Test</h1>
    Hello World</h2>
    <h3>Missing close h3 tag

For this input, my function returns the following errors:

 - Start tag <h2> was not found
 - End tag </h3> was not found

which is great ...

My problem is that I also want the HTML to be a complete document, i.e. with <head> and <body> tags, because it will later be uploaded as .html files and made available for preview.

So, I was wondering if I can verify this with the Agility Pack?

Any ideas or other options will be appreciated. Thanks

1 answer

You can check if there is a HEAD element or a BODY element under an HTML element like this, for example:

    bool hasHead = doc.DocumentNode.SelectSingleNode("html/head") != null;
    bool hasBody = doc.DocumentNode.SelectSingleNode("html/body") != null;

Each check evaluates to false if there is no HTML element at all, or if the HEAD or BODY element is not directly under the HTML element.
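
To tie this back to the question, here is a minimal sketch that reports the parser errors and the structural checks in one list. The class name MarkupChecks, the method name FindProblems and the message texts are made up for illustration; only HtmlDocument, LoadHtml, ParseErrors and SelectSingleNode come from the Agility Pack.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class MarkupChecks
{
    // Hypothetical helper: collects parser errors plus missing-structure messages.
    public static List<string> FindProblems(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var problems = new List<string>();

        // Syntax-level problems reported by the parser itself.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            problems.Add(error.Code + ": " + error.Reason);
        }

        // Structure-level checks: SelectSingleNode returns null when nothing matches.
        if (doc.DocumentNode.SelectSingleNode("html") == null)
        {
            problems.Add("Missing <html> element");
        }
        else
        {
            if (doc.DocumentNode.SelectSingleNode("html/head") == null)
                problems.Add("Missing <head> element under <html>");
            if (doc.DocumentNode.SelectSingleNode("html/body") == null)
                problems.Add("Missing <body> element under <html>");
        }

        return problems;
    }
}
```

An empty list would then mean that the parser found nothing to complain about and that the document has the expected html/head/body skeleton. It does not prove the markup is valid against the HTML specification.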

Note: I deliberately do not use the XPath expression "//head", because it matches a HEAD element anywhere in the document, even one that is not directly under the HTML element.
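
The difference between the two expressions can be illustrated with a deliberately malformed document. This is only a sketch: the class XPathDemo and the method Matches are hypothetical names, and it assumes the parser leaves the misplaced HEAD element where it was written (inside the DIV) rather than relocating it.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathDemo
{
    // Hypothetical helper: true if the XPath expression matches at least one node.
    public static bool Matches(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectSingleNode(xpath) != null;
    }

    public static void Main()
    {
        // A <head> that exists, but in the wrong place.
        string html = "<html><body><div><head></head></div></body></html>";

        // "//head" searches the whole tree; "html/head" only looks directly under <html>.
        Console.WriteLine("//head matched:    " + Matches(html, "//head"));
        Console.WriteLine("html/head matched: " + Matches(html, "html/head"));
    }
}
```

Under that assumption, "//head" should report a match while "html/head" should not, which is why the stricter path is the safer test for a well-structured document.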


Source: https://habr.com/ru/post/1481718/

