Creating a Web Crawler - Using WebKit Packages

I am trying to create a web crawler. I need two things:

  • Convert HTML into a DOM object.
  • Run the page's existing JavaScript on demand.

The result I expect is a DOM object in which the JavaScript that runs at page load has already been executed. In addition, I need an option to execute further JavaScript on request (in cases such as onMouseOver, onMouseClick, etc.).

First of all, I could not find a good source of documentation. I looked at the WebKit homepage, but could not find much information aimed at users of the package, nor useful examples. Also, in some forums I saw advice not to use the WebKit interface for crawlers, but to work directly with the internal DOM and JavaScript packages.

I am looking for documentation and code examples, as well as any recommendations on proper usage.

Workspace:
  • OS: Windows
  • Lang: C++
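
For concreteness, here is a minimal sketch of the "run JavaScript on demand" half using JavaScriptCore's C API (WebKit's JavaScript engine, presumably the "JavaScript package" the forum advice refers to). It evaluates a script in a standalone context; the missing piece is a context with the page's DOM behind it, which is exactly what I am asking about:

    // Minimal sketch: evaluate a script with JavaScriptCore's C API.
    // The context created here is standalone -- it has no 'document' or
    // 'window', because no page has been loaded into it.
    #include <JavaScriptCore/JavaScriptCore.h>
    #include <cstdio>

    int main() {
        JSGlobalContextRef ctx = JSGlobalContextCreate(nullptr);

        JSStringRef script = JSStringCreateWithUTF8CString("6 * 7");
        JSValueRef exception = nullptr;
        JSValueRef result = JSEvaluateScript(ctx, script, nullptr, nullptr, 1, &exception);

        if (result && !exception)
            std::printf("result = %g\n", JSValueToNumber(ctx, result, nullptr));

        JSStringRelease(script);
        JSGlobalContextRelease(ctx);
        return 0;
    }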
1 answer

Check out the testing tools that ship with the WebKit source tree. Most ports (as far as I know) include DumpRenderTree, which creates a WebView instance, loads the specified file, and then dumps the resulting render tree. In theory, this is one of the simplest complete examples of embedding WebKit.
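
Once a page is loaded that way, the on-demand part of the question can be handled by evaluating scripts in the page's own JavaScript context, and events such as onMouseOver can be triggered by dispatching synthetic DOM events from such a script. Below is a sketch, assuming the embedding exposes the loaded frame's JSGlobalContextRef; how you obtain it is port-specific, so pageContext here is a placeholder for whatever your port hands you:

    #include <JavaScriptCore/JavaScriptCore.h>
    #include <cstdio>

    // Fire a synthetic 'mouseover' on the first element matching a CSS
    // selector inside an already-loaded page. 'pageContext' must be the
    // page's own JS context, obtained from the (port-specific) embedding API.
    void fireMouseOver(JSGlobalContextRef pageContext, const char* cssSelector) {
        char script[512];
        std::snprintf(script, sizeof(script),
            "var el = document.querySelector('%s');"
            "if (el) {"
            "  var ev = document.createEvent('MouseEvents');"
            "  ev.initMouseEvent('mouseover', true, true, window,"
            "                    0, 0, 0, 0, 0, false, false, false, false, 0, null);"
            "  el.dispatchEvent(ev);"
            "}", cssSelector);

        JSStringRef js = JSStringCreateWithUTF8CString(script);
        JSEvaluateScript(pageContext, js, nullptr, nullptr, 1, nullptr);
        JSStringRelease(js);
    }

The same pattern covers clicks and reading the post-JavaScript DOM back out: evaluate something like document.documentElement.outerHTML and convert the returned JSValueRef to a string with JSValueToStringCopy.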

