Using HtmlAgilityPack to parse webpage information in C#

Question

I am trying to use HtmlAgilityPack to parse webpage information. This is my code:

using System;
using HtmlAgilityPack;

namespace htmparsing
{
    class MainClass
    {
        public static void Main (string[] args)
        {
            string url = "https://bugs.eclipse.org";
            HtmlWeb web = new HtmlWeb();
            HtmlDocument doc = web.Load(url);
            foreach(HtmlNode node in doc){
                //do something here with "node"
            }               
        }
    }
}

But when I try to access doc.DocumentElement.SelectNodes, DocumentElement does not appear in the member list. I added HtmlAgilityPack.dll to the project references, but I don't know what the problem is.

+2

html c# html-agility-pack

star2014 Nov 08 '13 at 23:05

2 answers

See the HtmlNode source: http://htmlagilitypack.codeplex.com/SourceControl/latest#Release/1_4_0/HtmlAgilityPack/HtmlNode.cs.

Use XPath to select nodes. An XPath expression that begins with // matches nodes at any depth in the document.
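As a rough illustration of the XPath point, here is a minimal sketch that parses an inline HTML string instead of a live page. The class name XPathSketch, the helper CountFromFirstDiv, and the sample markup are hypothetical and not part of the answer. It assumes the usual HtmlAgilityPack behaviour that an expression starting with // is evaluated from the document root even when called on a child node, while .// stays within the current node.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathSketch
{
    // Hypothetical helper: evaluates an XPath expression from the first <div>
    // in the document and returns how many nodes it matched.
    public static int CountFromFirstDiv(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode div = doc.DocumentNode.SelectSingleNode("//div");
        if (div == null)
        {
            return 0;
        }

        // SelectNodes returns null rather than an empty collection when nothing matches.
        HtmlNodeCollection matches = div.SelectNodes(xpath);
        return matches == null ? 0 : matches.Count;
    }

    public static void Main()
    {
        // Sample markup (an assumption for this sketch): one link inside the div, one outside.
        string html = "<html><body><div><a href='/in'>in</a></div><a href='/out'>out</a></body></html>";

        // "//a" searches the whole document; ".//a" searches only below the div.
        Console.WriteLine(CountFromFirstDiv(html, "//a"));
        Console.WriteLine(CountFromFirstDiv(html, ".//a"));
    }
}
```

With this sample markup, the first call should count both links and the second only the link inside the div, which is why a leading dot matters when you want to search within a node you have already selected.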

+1

smartcaveman Nov 08 '13 at 23:14

Ashad Shanto · Accepted Answer · 2013-11-09T02:10:09+0000

I have an article that demonstrates how to scrape DOM elements using HAP (HTML Agility Pack) in ASP.NET. It walks through the whole process step by step. You can take a look and try it.

Scrape HTML DOM elements with HtmlAgilityPack (HAP) in ASP.NET

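string url = "https://www.google.com";
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(url);
// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a");
if (links != null)
{
    foreach (HtmlNode node in links)
    {
        // outputLabel is assumed to be an ASP.NET Label control on the page.
        outputLabel.Text += node.InnerHtml;
    }
}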

HtmlDocument has no DocumentElement property; use DocumentNode instead.
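To make the DocumentNode point concrete without depending on a live site or an ASP.NET page, here is a small console sketch. The class name LinkSketch, the helper ExtractHrefs, and the sample markup are hypothetical; the HtmlAgilityPack calls used are LoadHtml, DocumentNode, SelectNodes, and GetAttributeValue.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkSketch
{
    // Hypothetical helper: returns the href value of every <a> element that has one.
    public static List<string> ExtractHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var hrefs = new List<string>();

        // DocumentNode is the root of the parsed document (there is no DocumentElement).
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            // Nothing matched: SelectNodes returned null, so return the empty list.
            return hrefs;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // The second argument is the fallback used when the attribute is missing.
            hrefs.Add(anchor.GetAttributeValue("href", string.Empty));
        }

        return hrefs;
    }

    public static void Main()
    {
        // Sample markup (an assumption for this sketch); the last anchor has no href.
        string html = "<p><a href='/one'>One</a> <a href='/two'>Two</a> <a>No link</a></p>";
        foreach (string href in ExtractHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

To run the same helper against a real page, the string passed in could come from an HtmlWeb.Load call as in the answer above, using doc.DocumentNode directly rather than reparsing.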

HTMLDocument.DocumentElement
