Html Agility Pack null reference

I'm having trouble with the Html Agility Pack.

I get a null reference exception when I run this method on HTML that does not contain a specific node. It worked at first, but then it stopped working. This is just a snippet; there are about 10 more foreach loops that select different nodes.

What am I doing wrong?

    public string Export(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // exception gets thrown on below line
        foreach (var repeater in doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']"))
        {
            if (repeater != null)
            {
                repeater.Name = "editor:repeater";
                repeater.Attributes.RemoveAll();
            }
        }

        var sw = new StringWriter();
        doc.Save(sw);
        sw.Flush();
        return sw.ToString();
    }
+6
5 answers

AFAIK, DocumentNode.SelectNodes can return null if no nodes are found.

This is the default behavior; see the CodePlex discussion "Why DocumentNode.SelectNodes returns null".

Thus, a workaround might be to rewrite the foreach block:

    var repeaters = doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']");
    if (repeaters != null)
    {
        foreach (var repeater in repeaters)
        {
            if (repeater != null)
            {
                repeater.Name = "editor:repeater";
                repeater.Attributes.RemoveAll();
            }
        }
    }
+21

Same as Alex's answer, but I solved it with an extension method:

    public static class HtmlAgilityPackExtensions
    {
        public static HtmlAgilityPack.HtmlNodeCollection SafeSelectNodes(this HtmlAgilityPack.HtmlNode node, string selector)
        {
            return (node.SelectNodes(selector) ?? new HtmlAgilityPack.HtmlNodeCollection(node));
        }
    }
+1

Simply add a ? before each . (the null-conditional operator), as in this example:

 var titleTag = htdoc?.DocumentNode?.Descendants("title")?.FirstOrDefault()?.InnerText; 
+1

I created a generic extension method that works with any IEnumerable<T>:

    public static List<TSource> ToListOrEmpty<TSource>(this IEnumerable<TSource> source)
    {
        return source == null ? new List<TSource>() : source.ToList();
    }

Usage:

    var opnodes = bodyNode.Descendants("o:p").ToListOrEmpty();
    opnodes.ForEach(x => x.Remove());
0

The library has since been updated, and you can now prevent SelectNodes from returning null by setting doc.OptionEmptyCollection = true, as described in this GitHub issue.

This forces it to return an empty collection instead of null when no nodes match the query (I'm not sure why this was not the default behavior to start with).
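
A minimal sketch of that option applied to the loop from the question, assuming a Html Agility Pack version recent enough to expose OptionEmptyCollection. The class name EmptyCollectionDemo and the method CountRepeaters are hypothetical names used only for illustration:

```csharp
using System;
using HtmlAgilityPack;

public static class EmptyCollectionDemo
{
    // Hypothetical helper: renames the matching tables and reports how many were found.
    public static int CountRepeaters(string html)
    {
        var doc = new HtmlDocument();

        // Ask SelectNodes for an empty collection instead of null when nothing matches.
        doc.OptionEmptyCollection = true;
        doc.LoadHtml(html);

        var repeaters = doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']");

        // No null check needed: with the option set, the loop body is simply skipped
        // when the HTML contains no matching table.
        foreach (var repeater in repeaters)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }

        return repeaters.Count;
    }
}
```

The option belongs to the document, so it has to be set on every HtmlDocument instance whose queries should behave this way.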

0

Source: https://habr.com/ru/post/886779/

