HtmlAgilityPack: "You need to set UseIdAttribute property to true to enable this feature"

I am trying to use HtmlAgilityPack with VS2008 / .NET 3.5. I get this error even when I set OptionUseIdAttribute to true, although it should be true by default.

Error Message: You need to set UseIdAttribute property to true to enable this feature
Stack Trace: at HtmlAgilityPack.HtmlDocument.GetElementbyId(String id)

I tried versions 1.4.6 and 1.4.0; neither worked.

Version 1.4.6 - Net20 / HtmlAgilityPack.dll

Version 1.4.0 - Net20 / HtmlAgilityPack.dll

This is the code:

    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = web.Load(url);
    HtmlNode table = doc.GetElementbyId("tblThreads");

This didn't work either:

    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = new HtmlDocument { OptionUseIdAttribute = true };
    doc = web.Load(url);
    HtmlNode table = doc.GetElementbyId("tblThreads");

How can I fix this problem? Thanks.

1 answer

First, I opened the HAP 1.4.0 DLL in ILSpy. I went to the HtmlDocument class and saw that the GetElementbyId method looks like this:

    // HtmlAgilityPack.HtmlDocument
    /// <summary>
    /// Gets the HTML node with the specified 'id' attribute value.
    /// </summary>
    /// <param name="id">The attribute id to match. May not be null.</param>
    /// <returns>The HTML node with the matching id or null if not found.</returns>
    public HtmlNode GetElementbyId(string id)
    {
        if (id == null)
        {
            throw new ArgumentNullException("id");
        }
        if (this._nodesid == null)
        {
            throw new Exception(HtmlDocument.HtmlExceptionUseIdAttributeFalse);
        }
        return this._nodesid[id.ToLower()] as HtmlNode;
    }

Then I had ILSpy analyze "_nodesid", because in your case, for some reason, it is never set (it is still null when GetElementbyId runs). "HtmlDocument.DetectEncoding(TextReader)" and "HtmlDocument.Load(TextReader)" are the methods that assign a value to "_nodesid".

Therefore, you can try a different way of reading the content from the URL, one in which "_nodesid" is certain to be assigned. For example:

    var doc = new HtmlDocument();
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";
    using (var response = (HttpWebResponse)request.GetResponse())
    {
        using (var stream = response.GetResponseStream())
        {
            doc.Load(stream);
        }
    }
    var table = doc.GetElementbyId("tblThreads");

This approach ensures that "HtmlDocument.Load(TextReader)" is called, and from the decompiled code I can see that it assigns "_nodesid". I have not compiled the code I proposed, so treat it as something that should work rather than something I have tested.
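
To check the "_nodesid" reasoning without touching the network, a minimal sketch like the one below may help. It assumes a project that references HtmlAgilityPack.dll, and the inline markup is a hypothetical stand-in for the real page. As far as I know, LoadHtml goes through the same Load(TextReader) path, so if this finds the table, the remaining problem is in how the document is fetched rather than in GetElementbyId itself.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical inline markup standing in for the downloaded page.
        string html = "<html><body><table id=\"tblThreads\"><tr><td>x</td></tr></table></body></html>";

        var doc = new HtmlDocument();
        // Set the option on the same instance that parses the markup, before loading.
        doc.OptionUseIdAttribute = true;
        doc.LoadHtml(html);

        HtmlNode table = doc.GetElementbyId("tblThreads");
        Console.WriteLine(table == null ? "not found" : table.Name);
    }
}
```

Note that in the second snippet from the question, "doc = web.Load(url)" replaces the instance on which OptionUseIdAttribute was set, so that assignment cannot affect the document that is actually queried.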

Source: https://habr.com/ru/post/956267/

