Solr and .Net Filters

Question


I am relatively new to the wonderful world of Solr and have the following question: what is the best way to process documents, extracting each document's structure and passing it to Solr for indexing?

I would like to be able to extract text from Word documents, PDFs, spreadsheets, HTML pages, and so on. In fact, almost any document containing text.

I took a look at Windows Filters, and at first glance they seem to provide the required functionality.

How do you do this?


# c# .net solr solrnet

Alan simes Sep 22 '10 at 13:17


2 answers

Philip Rieck · Answer 1 · 2010-09-22T13:32:50+0000

For this, take a look at Solr Cell. It is not C#, but since you are using Solr you will be running/hosting Java anyway.

Solr Cell uses Apache Tika, which (among other things) can extract text from documents such as Word and PDF files.

Mauricio Scheffer · Answer 2 · 2010-09-22T14:54:08+0000

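Yes, SolrCell is the way to go. However, it is not yet supported in SolrNet, so you can either:

implement that support in SolrNet yourself, or
send the HTTP requests to Solr yourself, bypassing SolrNet.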
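
Sending a document straight to Solr Cell over HTTP, without SolrNet, could look roughly like the sketch below. It is written in Python only to keep it short; the same request can be issued from .NET with any HTTP client. The URL assumes a local single-core Solr with the extracting handler mapped to /update/extract, and the helper names (build_extract_url, build_extract_request, index_file) are made up for illustration.

```python
import mimetypes
import urllib.parse
import urllib.request

# Assumption: a local Solr instance with the ExtractingRequestHandler
# (Solr Cell) mapped to /update/extract. Adjust the URL for your setup.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(doc_id, base_url=SOLR_EXTRACT_URL, commit=True):
    """Return the Solr Cell URL for indexing one document under doc_id."""
    # literal.<field>=<value> sets a field on the indexed document directly.
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return base_url + "?" + urllib.parse.urlencode(params)


def build_extract_request(path, doc_id, base_url=SOLR_EXTRACT_URL):
    """Build (but do not send) a POST carrying the raw file bytes."""
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as f:
        body = f.read()
    return urllib.request.Request(
        build_extract_url(doc_id, base_url),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def index_file(path, doc_id):
    """Send the file to Solr; Tika extracts the text on the server side."""
    with urllib.request.urlopen(build_extract_request(path, doc_id)) as resp:
        return resp.status
```

Calling index_file("report.pdf", "doc1") against a running Solr would upload the file and let Tika do the text extraction on the server, so no document parsing is needed on the client.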

Alternatively, you could use something like iTextSharp/Aspose instead of SolrCell to extract the text client-side.
