C# library for a web and FTP crawler

Question

I need a library (hopefully in C#!) that works as a web crawler for fetching files over both HTTP and FTP. Reading HTML is enough to start with, but I want to extend it to PDF, Word, etc.

Open source software would be fine, or at least any pointers to documentation.

+3

c# web-crawler

David Conde Oct 18 '10 at 18:38

2 answers

I developed the Crawler Engine for the Crawler-Lib platform. It is a full-featured crawler that can easily be extended to handle whatever requests, or even whatever processing, you need.

Here is the engine: http://www.crawler-lib.net/crawler-lib-engine

Here is a YouTube channel showing how the Crawler-Lib engine works: http://www.youtube.com/user/CrawlerLib

I know that this project is not open source, but there is a free version.

+1

Thomas Maierhofer 28 . '13 8:50

Nick Martyshchenko · Accepted Answer · 2010-10-18T18:43:03+0000

Check out the NCrawler project:

A simple and very efficient multithreaded web crawler with pipeline-based processing, written in C#. It contains HTML, text, PDF and IFilter document processors, plus language detection (Google). It is easy to add pipeline steps that extract, consume, and alter information.
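To give a rough idea of what "pipeline-based processing" means, here is a minimal sketch in plain C#. It does not use NCrawler's actual API: `PageContext`, `IPageStep`, `TitleStep`, `LinkStep` and `Pipeline` are hypothetical names made up for illustration, and the example works on an in-memory HTML string instead of downloading anything. Each step reads the page and adds what it found, so supporting another format (PDF, Word) would mean adding another step.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical types for illustration only; this is not NCrawler's API.
public sealed class PageContext
{
    public PageContext(Uri url, string html) { Url = url; Html = html; }
    public Uri Url { get; }
    public string Html { get; }
    public string Title { get; set; } = "";
    public List<Uri> Links { get; } = new List<Uri>();
}

// One unit of work in the pipeline.
public interface IPageStep
{
    void Process(PageContext page);
}

// Step 1: pull the <title> text out of the markup.
public sealed class TitleStep : IPageStep
{
    private static readonly Regex TitleRx = new Regex(
        @"<title[^>]*>(.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public void Process(PageContext page)
    {
        Match m = TitleRx.Match(page.Html);
        if (m.Success) page.Title = m.Groups[1].Value.Trim();
    }
}

// Step 2: collect http, https and ftp links, resolved against the page URL.
public sealed class LinkStep : IPageStep
{
    private static readonly Regex HrefRx = new Regex(
        @"href\s*=\s*[""']([^""']+)[""']", RegexOptions.IgnoreCase);

    public void Process(PageContext page)
    {
        foreach (Match m in HrefRx.Matches(page.Html))
        {
            if (!Uri.TryCreate(page.Url, m.Groups[1].Value, out var link)) continue;
            bool wanted = link.Scheme == Uri.UriSchemeHttp
                       || link.Scheme == Uri.UriSchemeHttps
                       || link.Scheme == Uri.UriSchemeFtp;
            if (wanted) page.Links.Add(link);
        }
    }
}

public static class Pipeline
{
    // Runs every step in order over the same page context.
    public static PageContext Run(Uri url, string html, params IPageStep[] steps)
    {
        var page = new PageContext(url, html);
        foreach (IPageStep step in steps) step.Process(page);
        return page;
    }
}

public static class Demo
{
    public static void Main()
    {
        string html =
            "<html><head><title> Docs </title></head><body>" +
            "<a href=\"/guide.pdf\">Guide</a>" +
            "<a href='ftp://files.example.com/spec.doc'>Spec</a>" +
            "<a href=\"mailto:someone@example.com\">Mail</a>" +
            "</body></html>";

        PageContext page = Pipeline.Run(
            new Uri("http://example.com/index.html"), html,
            new TitleStep(), new LinkStep());

        Console.WriteLine(page.Title);
        foreach (Uri link in page.Links) Console.WriteLine(link.AbsoluteUri);
    }
}
```

A real crawler would feed the collected links back into a download queue and use a proper HTML parser instead of regular expressions; the regex here only keeps the sketch short.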
