Solr and MySQL, How do I keep an updated index, and is DB required if it's simple?

Question

I'm a complete newbie with Solr, so bear with me. :)

In my current project, I have a very simple database - there is only one table containing 4 fields: id, name, subject, msg.

As I understand it, every time a new record is added (or deleted), I need to add this record to the index, performing basically two operations: inserting the record into the database and adding it to the index.

Is this the standard procedure, or is there a way to tell Solr to automatically reindex the database table, either at a certain interval or whenever there are updates?

In addition, since the table is so simple, does it make sense to store this information in the database? Why not just keep it in the Solr index, given that I want the entries to be searchable by name, subject, and msg?

My setup is Java, Hibernate, MySQL, and Solrj.

+6

database mysql indexing solr solrj

Valera Apr 13 '11 at 14:35

2 answers

The only reason I can see for keeping the database is that it has better transaction support. Even so, Lucene (the engine underlying Solr) allows only one writer on an index at a time, so concurrent modifications cannot easily corrupt a record.

As far as I know, you do not need a database. Solr will handle everything you need.

-1

uncaught_exceptions Apr 13 '11 at 2:43

Bart · Accepted Answer · 2011-04-18T16:59:00+0000

Whether to use a database really comes down to how long you want to keep and evolve this data. It is much easier to wipe out an entire Solr index (and lose all your data) than to wipe out an entire database. In addition, Solr does not have much support for schema changes without starting over with a new index. For example, you can add another field just fine, but you cannot change the name or type of a field without rebuilding your index.

If you do go with a database, you can configure Solr to index directly from it using the DataImportHandler. For your schema this should be pretty simple, but it can become very painful as your database grows more complex. I think there is an advantage to using the Hibernate objects you already have set up and simply indexing them with Solrj. Another pain point with the DataImportHandler is that it is controlled entirely via HTTP, so you need to manage separate cron jobs (or some other code) to handle scheduling using wget or curl.
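If you would rather keep the scheduling inside your Java application than in cron, the "some other code" option could look roughly like the sketch below. It uses only the JDK (Java 11+ for `java.net.http`). Several things here are assumptions and not part of the original setup: the base URL `http://localhost:8983/solr`, the core name `messages`, the `/dataimport` handler path (it has to match whatever name the handler is registered under in `solrconfig.xml`), and the class and method names. Note also that `delta-import` only works if your DataImportHandler configuration defines a delta query, which usually needs a last-modified column that the four-field table in the question does not have; without one, `full-import` is the command to use.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: periodically asks the DataImportHandler to run a delta import.
class DeltaImportScheduler {

    // Builds the request URL. "/dataimport" is an assumed handler path;
    // use the name configured in your solrconfig.xml.
    static URI deltaImportUri(String solrBaseUrl, String core) {
        return URI.create(solrBaseUrl + "/" + core
                + "/dataimport?command=delta-import&commit=true");
    }

    // Sends a single import request and returns the HTTP status code.
    static int triggerDeltaImport(HttpClient client, URI uri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    // Starts a background task that fires the import at a fixed interval.
    // The caller owns the returned scheduler and should shut it down on exit.
    static ScheduledExecutorService start(String solrBaseUrl, String core, long everyMinutes) {
        HttpClient client = HttpClient.newHttpClient();
        URI uri = deltaImportUri(solrBaseUrl, core);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                int status = triggerDeltaImport(client, uri);
                System.out.println("delta-import returned HTTP " + status);
            } catch (Exception e) {
                // Catch everything: an uncaught exception would cancel all future runs.
                System.err.println("delta-import failed: " + e.getMessage());
            }
        }, 0, everyMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Calling `DeltaImportScheduler.start("http://localhost:8983/solr", "messages", 5)` from your application startup would then request an import every five minutes. This is the same thing a cron entry running curl against that URL would do; the only difference is that the schedule lives and dies with your application.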
