How to automatically index data in Solr from a database

I have a MySQL database for my application. I implemented Solr search and used DataImportHandler (DIH) to index the data from the database into Solr. My question is: when the database is updated, is there a way for my Solr index to pick up the new data automatically, so that I don't need to start the indexing process manually every time the database tables change? If so, how can I achieve this?

+6
4 answers

I don't think Solr has a built-in feature that reindexes data whenever the database is updated.

But there are workarounds such as triggers: it is possible to launch an external application from a trigger (in MySQL this generally requires a user-defined function).

Write a cron job that runs a PHP script which reads from the database and indexes the data into Solr. Then write a trigger for the insert, update, and delete operations that calls this script, and install it in the database. Whenever the data changes, the trigger invokes the script and indexing happens.
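The script that cron or the trigger calls could look roughly like the following. This is a minimal sketch in Python rather than PHP. The table `products`, its columns, the core name `collection1`, the Solr URL, and the database credentials are all assumptions for illustration. The helper functions accept any DB-API connection; for MySQL a third-party driver such as PyMySQL would have to be installed.

```python
import json
import urllib.request

# Assumed values for illustration; adjust them to your own setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update?commit=true"
QUERY = "SELECT id, name, description FROM products"  # hypothetical table


def fetch_docs(conn, query=QUERY):
    """Run the query on a DB-API connection and return one dict per row."""
    cur = conn.cursor()
    cur.execute(query)
    columns = [col[0] for col in cur.description]
    docs = [dict(zip(columns, row)) for row in cur.fetchall()]
    cur.close()
    return docs


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build a POST request that sends the documents to Solr as a JSON array."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def main():
    """Entry point for the scheduled job: read all rows and push them to Solr."""
    # PyMySQL is a third-party driver (pip install pymysql).
    # The credentials below are placeholders.
    import pymysql

    conn = pymysql.connect(
        host="localhost", user="app", password="secret", database="appdb"
    )
    try:
        docs = fetch_docs(conn)
    finally:
        conn.close()
    with urllib.request.urlopen(build_update_request(docs), timeout=60) as resp:
        print(resp.status, len(docs), "documents sent")
```

The scheduled job would then call `main()`. Sending every row on each run is the simplest variant; for large tables, restricting the query to recently changed rows would be the obvious refinement.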

Please see:

PHP script call from MySQL trigger

Automatic scheduling:

Please see the post "How can I schedule data imports in Solr" for more information on scheduling. The second answer there explains how to run the import using cron.

+4

Since you used DataImportHandler to load your data into Solr initially, you can set up a DIH delta import that is executed with curl from a cron job to periodically add database changes to the index. If you need more real-time updates, as @Rakesh suggested, you can use a trigger in your database and have it kick off the same call to the delta DIH.
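The call that cron issues is just an HTTP request to the handler with `command=delta-import`. As a minimal sketch in Python instead of curl, assuming the handler is registered at `/dataimport` on a core named `collection1` (both assumptions), and assuming the DIH configuration already defines a delta query:

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL and core name; adjust them to your installation.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "collection1"


def dataimport_url(command, base=SOLR_BASE, core=CORE, **params):
    """Return the DataImportHandler URL for the given command."""
    query = urllib.parse.urlencode({"command": command, "wt": "json", **params})
    return f"{base}/{core}/dataimport?{query}"


def run_delta_import():
    """Ask DIH to index only the rows changed since the last import."""
    # clean=false keeps the existing index; commit=true makes changes visible.
    url = dataimport_url("delta-import", clean="false", commit="true")
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)
```

A cron entry that calls `run_delta_import()` every few minutes would keep the index close to the database without a full reimport.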

+1

You can import the data using your browser and the Windows Task Scheduler. Follow these steps on the Windows server.

Go to Administrative Tools => Task Scheduler and click Create Task. The Create Task screen opens with the tabs General, Triggers, Actions, and Settings.

On the General tab, enter the task name "Solrdataimport" and the description "Import mysql data".

Now go to the "Triggers" tab. Click "New" and, under Settings, select "Daily". Under "Advanced settings", check "Repeat task every ..." and set the interval you want. Click OK.

Now go to the "Actions" tab and click "New". In Program/script, enter "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe", which is the path where the Chrome browser is installed. In "Add arguments", enter http://localhost:8983/solr/#/collection1/dataimport//dataimport?command=full-import&clean=true and click OK.

With the steps above, the data will be imported automatically. To stop the import process, create another task in the same way, but in the "Actions" section set Program/script to "taskkill" instead of "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe", and in the arguments enter "/f /im chrome.exe".

Set the trigger times according to your requirements.
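As a variation on the steps above (not what this answer does), the scheduled task could run a small script that calls the import handler directly instead of opening Chrome. The sketch below assumes the handler answers at `/solr/collection1/dataimport` and that its JSON status response carries a `status` field reading `busy` while an import is running; the polling interval is arbitrary.

```python
import json
import time
import urllib.request

# Assumed handler URL; the core name "collection1" is taken from the steps above.
DATAIMPORT_URL = "http://localhost:8983/solr/collection1/dataimport"


def is_busy(status_payload):
    """True if a DIH status response reports an import still in progress."""
    return status_payload.get("status") == "busy"


def call(command, extra=""):
    """Send one command to the handler and return the parsed JSON response."""
    url = f"{DATAIMPORT_URL}?command={command}&wt=json{extra}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def full_import(poll_seconds=5):
    """Start a full import unless one is running, then wait until it finishes."""
    if is_busy(call("status")):
        return "skipped: an import is already running"
    call("full-import", "&clean=true")
    while is_busy(call("status")):
        time.sleep(poll_seconds)
    return "done"
```

In the task's "Actions" tab, Program/script would then point at the Python interpreter and the arguments at a script that calls `full_import()`. Checking the status first avoids starting a second import while the previous one is still running.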

+1

What you are looking for is a delta import, and many other posts cover it. I created a Windows WPF application and service that issues Solr commands on a recurring schedule, since cron jobs and the Task Scheduler become difficult to maintain when you have many cores or environments.

https://github.com/systemidx/SolrScheduler

Basically, you just drop a JSON file into the specified folder, and the service uses a REST client to issue the Solr commands.

0

Source: https://habr.com/ru/post/891264/

