Solr: 100% Write Capability During Optimization

Question

So here is my dilemma ...

I run a real-time search index with Solr, indexing about 6M documents per day. Documents expire in 7 days. Therefore, every day I add 6M documents and delete 6M documents. Unfortunately, I need to run “optimize” every so often, otherwise I will run out of disk space.

During “optimization,” Solr continues to serve read requests, but write requests are blocked. All my writes go through a queue, so operationally everything is fine. However, since my index is so large, “optimization” takes about an hour, and during this hour new updates are not available for reading. Thus, my index is real-time except for the one hour per day in which I optimize. During that time, the index appears to be up to an hour behind. This is not optimal.

My current solution is this: write all data to two Solr indexes, both behind queues. Alternate “optimize” between the two indexes every 12 hours. During the “optimization” of index 1, direct all read traffic to index 2, and vice versa. This time-based routing seems rather fragile and sloppy, though.
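For illustration only, the time-based routing described above might look roughly like the sketch below. The index names, the start hours, and the one-hour window are assumptions made for the example, not details of the actual setup.

```python
from datetime import datetime, timezone

# Assumed duration of one optimize run, in whole hours.
OPTIMIZE_HOURS = 1

# Hypothetical schedule: each index optimizes once a day, 12 hours apart.
OPTIMIZE_START_HOUR = {"index1": 0, "index2": 12}


def is_optimizing(index: str, now: datetime) -> bool:
    """True if `index` is inside its scheduled optimize window at `now`."""
    start = OPTIMIZE_START_HOUR[index]
    return start <= now.hour < start + OPTIMIZE_HOURS


def read_target(now: datetime) -> str:
    """Pick the index that should serve reads at `now`."""
    # Prefer index1 unless it is currently optimizing.
    if is_optimizing("index1", now):
        return "index2"
    return "index1"


if __name__ == "__main__":
    print(read_target(datetime.now(timezone.utc)))
```

The sketch also shows where the fragility comes from: the router trusts the clock and a fixed duration, so an optimize run that takes longer than expected still gets read traffic routed back to it.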

Is there a better way?

+3

lucene solr

devinfoley Feb 24 '11 at 7:48

4 answers

nikhil500 · Answer 1 · 2011-02-24T10:19:19+0000

According to the comments here and the FAQ here, optimization should not be necessary. Skipping optimization may increase the size of the index at first, but the index should not keep growing indefinitely. I suggest you turn off optimization for a few days and monitor the size of the index.
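One simple way to monitor the index size over those few days is to sum the file sizes in the index directory on a schedule. This is a minimal sketch; the path in the usage example is hypothetical and depends on where your Solr data directory lives.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Return the total size in bytes of all files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Files can disappear between listing and stat
                # while the index is being modified; skip them.
                pass
    return total


if __name__ == "__main__":
    # Hypothetical location of the index directory.
    index_dir = "/var/solr/data/index"
    print(f"{dir_size_bytes(index_dir) / 1e9:.2f} GB")
```

Logging this value once or twice a day should show whether the size levels off or keeps climbing.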

Brendan Hannemann · Answer 2 · 2012-12-17T19:14:48+0000

Another time-based option is to maintain a separate index for each day and write to all of the indexes every day. In this case you do not need to perform deletes; instead, you rotate the indexes in first-in-first-out (FIFO) order.

Index 1 = Day 1 + Day 2 + Day 3 + Day 4 + Day 5 + Day 6 + (no longer used)
Index 2 = empty + Day 2 + Day 3 + Day 4 + Day 5 + Day 6 + Day 7 + (no longer used)
Index 3 = empty + empty + Day 3 + Day 4 + Day 5 + Day 6 + Day 7 + Day 8
...
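The rotation in the diagram can be sketched as a small bookkeeping helper. This assumes, following the diagram, that index k is created on day k and receives writes for six consecutive days; the function names and the `window` parameter are made up for the example.

```python
from typing import List, Optional


def active_indexes(day: int, window: int = 6) -> List[int]:
    """Indexes that receive writes on `day`.

    Index k covers days k through k + window - 1, so on a given day
    the active indexes are the last `window` ones that were created.
    """
    first = max(1, day - window + 1)
    return list(range(first, day + 1))


def retired_index(day: int, window: int = 6) -> Optional[int]:
    """Index that drops out of the rotation on `day`, or None."""
    k = day - window
    return k if k >= 1 else None


if __name__ == "__main__":
    for day in range(1, 9):
        print(day, active_indexes(day), retired_index(day))
```

A retired index can be dropped wholesale, which is what removes the need for per-document deletes in this scheme.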

Avi · Answer 3 · 2011-02-24T08:05:27+0000

Have you tried using a different mergeFactor or a different merge policy? If you are writing constantly, this may be a better approach than optimization.

developresource · Answer 4 · 2011-02-24T08:14:57+0000

Use replication.

Write to your master and replicate to your slave. Run the optimization on the master, and send all queries to the slave.
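On the client side, this split comes down to sending updates and queries to different hosts. The sketch below only shows that routing decision; the host names are hypothetical, and replication itself has to be configured in Solr separately.

```python
# Hypothetical base URLs for the two Solr instances.
MASTER = "http://solr-master.example.com:8983/solr"
SLAVE = "http://solr-slave.example.com:8983/solr"

# Operations that modify the index and therefore go to the master.
WRITE_OPERATIONS = {"add", "delete", "commit", "optimize"}


def endpoint(operation: str) -> str:
    """Return the URL that should receive the given kind of request."""
    if operation in WRITE_OPERATIONS:
        return MASTER + "/update"
    if operation == "query":
        return SLAVE + "/select"
    raise ValueError(f"unknown operation: {operation!r}")


if __name__ == "__main__":
    print(endpoint("optimize"))
    print(endpoint("query"))
```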
