Nutch integration + MySQL

When Nutch finishes its cycle (i.e. crawling, fetching, indexing), I don't want the index phase to build a Lucene index. Instead, I want Nutch to hand all the crawled data (which I believe it holds as NutchDocument objects) to my own code, which stores it in MySQL.

Is there any way to do this?

Thanks.

1 answer

Create your own Java class that drives the Nutch loop. It should look like org.apache.nutch.crawl.Crawl, but with the indexer call replaced by a call to your MySQL connector. Alternatively, call your MySQL connector during each cycle. Which one you choose depends on whether you want to update MySQL at the end of the crawl or while it is running.
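A minimal sketch of that structure, in plain Java with JDBC only. It does not use any Nutch classes: `Page`, `PageSink`, `JdbcPageSink`, `InMemorySink`, the `fetchRound` callback and the `pages` table are all hypothetical names made up for illustration. In a real driver, `fetchRound` would be whatever generate/fetch/parse steps your copy of org.apache.nutch.crawl.Crawl runs, and `Page` would be filled from the parsed data. The `flushEachRound` flag shows the two options: write to MySQL during each cycle, or once at the end of the crawl.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

final class CrawlToMysqlSketch {

    /** Hypothetical stand-in for the per-page data the indexer would receive. */
    static final class Page {
        final String url, title, content;
        Page(String url, String title, String content) {
            this.url = url; this.title = title; this.content = content;
        }
    }

    /** Hypothetical hook that takes the place of the indexer call. */
    interface PageSink {
        void store(List<Page> pages);
    }

    /** Writes pages through JDBC. Assumes a table pages(url, title, content) with a unique key on url. */
    static final class JdbcPageSink implements PageSink {
        static final String UPSERT =
            "INSERT INTO pages (url, title, content) VALUES (?, ?, ?) "
          + "ON DUPLICATE KEY UPDATE title = VALUES(title), content = VALUES(content)";
        private final Connection connection;

        JdbcPageSink(Connection connection) { this.connection = connection; }

        @Override
        public void store(List<Page> pages) {
            try (PreparedStatement ps = connection.prepareStatement(UPSERT)) {
                for (Page p : pages) {
                    ps.setString(1, p.url);
                    ps.setString(2, p.title);
                    ps.setString(3, p.content);
                    ps.addBatch();          // queue one row per page
                }
                ps.executeBatch();          // send the queued rows together
            } catch (SQLException e) {
                throw new IllegalStateException("could not store pages", e);
            }
        }
    }

    /** In-memory sink, useful for trying the loop without a database. */
    static final class InMemorySink implements PageSink {
        final List<Page> stored = new ArrayList<>();
        int storeCalls = 0;

        @Override
        public void store(List<Page> pages) { stored.addAll(pages); storeCalls++; }
    }

    /**
     * Runs up to `depth` rounds. fetchRound is a placeholder for one crawl cycle.
     * Pages go to the sink after every round, or all at once after the last round.
     * Returns the number of pages fetched.
     */
    static int crawl(int depth, boolean flushEachRound,
                     IntFunction<List<Page>> fetchRound, PageSink sink) {
        List<Page> pending = new ArrayList<>();
        int total = 0;
        for (int round = 0; round < depth; round++) {
            List<Page> pages = fetchRound.apply(round);
            if (pages.isEmpty()) break;     // nothing left to fetch
            total += pages.size();
            if (flushEachRound) sink.store(pages);
            else pending.addAll(pages);
        }
        if (!pending.isEmpty()) sink.store(pending);
        return total;
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        // Fake crawl: two rounds with one page each, then nothing.
        IntFunction<List<Page>> fakeRounds = round -> round < 2
            ? List.of(new Page("http://example.com/" + round, "title " + round, "body " + round))
            : List.<Page>of();
        int fetched = crawl(5, false, fakeRounds, sink);
        System.out.println(fetched + " pages fetched, " + sink.storeCalls + " write(s)");
    }
}
```

To write to a real database, pass `new JdbcPageSink(DriverManager.getConnection(...))` instead of the in-memory sink, with the MySQL JDBC driver on the classpath. Keeping the sink behind an interface means the loop itself does not change when you switch between the two update strategies.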


Source: https://habr.com/ru/post/1785066/

