I am writing a simple screen-scraping program in C#, for which I need to select all the inputs inside the form with the id "aspnetForm" (there are two forms on the page, and I don't want the inputs from the other one). The inputs in this form sit inside various tables and divs, or directly at the first level of the form.
So, I wrote a very simple XPath query:
//form[@id='aspnetForm']//input
It works as expected in all the browsers I tested (Chrome, IE, Firefox) - it returns what I want.
But in HtmlAgilityPack it doesn't work at all: SelectNodes always returns null.
The following queries, which I wrote as tests, do work, but they don't return what I want. The first selects only the inputs that are direct children of my form, and the second returns only the form itself:
//form[@id='aspnetForm']/input
//form[@id='aspnetForm']
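To make the expected difference between the child axis (/) and the descendant axis (//) concrete, here is a minimal sketch that evaluates all three queries with the XPath support built into .NET (System.Xml.XPath) rather than HtmlAgilityPack. It assumes the markup is well-formed XML, which real-world HTML often is not, so it only illustrates XPath semantics and is not a drop-in replacement. The class name XPathAxesDemo and the helper CountMatches are hypothetical names made up for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathAxesDemo
{
    // A small fragment that is valid XML as well as HTML (assumption:
    // real pages are usually not this clean).
    private const string Markup =
        "<form id=\"aspnetForm\">" +
        "<input name=\"first\" value=\"first\" />" +
        "<div>" +
        "<input name=\"second\" value=\"second\" />" +
        "</div>" +
        "</form>";

    // Hypothetical helper: counts the elements matched by an XPath 1.0 query.
    public static int CountMatches(string xpath)
    {
        XDocument document = XDocument.Parse(Markup);
        return document.XPathSelectElements(xpath).Count();
    }

    public static void Main()
    {
        // Descendant axis: inputs at any depth below the form.
        Console.WriteLine(CountMatches("//form[@id='aspnetForm']//input")); // 2
        // Child axis: only inputs directly under the form.
        Console.WriteLine(CountMatches("//form[@id='aspnetForm']/input"));  // 1
        // The form element itself.
        Console.WriteLine(CountMatches("//form[@id='aspnetForm']"));        // 1
    }
}
```

With a standards-conforming XPath 1.0 engine, the descendant query matches both inputs while the child query matches only the first one, which is the behavior I see in the browsers.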
Yes, I know that I can just walk the nodes returned by the last query, or run another SelectNodes on its result, but I really don't want to do this. I want to use the same query as in the browsers.
Is XPath currently broken in HtmlAgilityPack? Are there any XPath alternatives for C#?
UPDATE: test code:
    using HtmlAgilityPack;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    namespace HtmlAGPTests
    {
        [TestClass]
        public class XPathTests
        {
            private const string html =
                "<form id=\"aspnetForm\">" +
                "<input name=\"first\" value=\"first\" />" +
                "<div>" +
                "<input name=\"second\" value=\"second\" />" +
                "</div>" +
                "</form>";

            private static HtmlNode GetHtmlDocumentNode()
            {
                var document = new HtmlDocument();
                document.LoadHtml(html);
                return document.DocumentNode;
            }

            [TestMethod]
            public void TwoLevelXpathTest()
c# xpath screen-scraping
rufanov Apr 23 '14 at 0:25