This is an older question with plenty of useful answers and recommendations already, but I will try to summarize them and describe a solution for paging through large result sets with a cursor, because I recently ran into this problem myself.
As Yonik mentioned, the problem with the regular start/rows parameters is that with a large result set, requesting a page far from the beginning carries significant overhead in CPU and memory. Retrieving 20 documents from somewhere around the 500,000th position in sorted order requires Solr to collect and sort everything up to that point (tracking the sort values and unique keys of all preceding documents). If the search is distributed across shards, it is even more expensive, because each shard has to return its top 500,020 rows to the aggregating node, which then merges them just to find the 20 rows that are actually needed.
In other words, Solr cannot work out which matching document is the 999,001st result in sorted order without first determining what the first 999,000 matching sorted results are.
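For example, a deep-paging request like the following (the price field is just an illustration) forces each shard to collect and sort its top 999,020 results only to hand back 20 of them:

?q=*:*&sort=price desc,id asc&start=999000&rows=20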
The solution here is to use Solr's cursorMark.
In the first query, you pass &cursorMark=*. This means the following:
You might think that this is like start=0 as a way to tell Solr to “start at the beginning of my sorted results”, except that it also tells Solr that you want to use the cursor.
Note that your sort clause must include the uniqueKey field (usually id) as a tie-breaker; otherwise Solr will reject the cursor request.
Part of the first request will look like this:
?sort=price desc,id asc&start=0&cursorMark=* ...
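For context, a complete first request might look like this (the products collection name, the q value, and the fl fields are just placeholders); start is either left at 0 or omitted, because the cursor itself determines where the next page begins:

/solr/products/select?q=*:*&rows=20&fl=id,price&sort=price desc,id asc&cursorMark=*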
As a result, you get a response with the following structure:
{ "response":{"numFound":20,"start":0,"docs":[ /* docs here */ ]}, "nextCursorMark":"AoIIRPoAAFBX" // Here is cursor mark for next "page" }
To get the next page, the query would look like this:
?sort=price desc,id asc&start=0&cursorMark=AoIIRPoAAFBX ...
Note that the cursorMark value comes from the previous response. The result has the same structure as the first response, but with a different value for nextCursorMark. And so on.
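Putting it all together, here is a minimal sketch of walking through an entire result set with a cursor. It assumes a local Solr instance, a hypothetical products collection, and price/id sort fields (adjust the URL, collection name, query, and fields to your setup). Solr signals the end of the results by returning a nextCursorMark equal to the cursorMark you sent, which is what the loop checks for:

```python
import requests

# Placeholder URL and collection name -- adjust to your setup.
SOLR_URL = "http://localhost:8983/solr/products/select"

params = {
    "q": "*:*",
    "rows": 20,
    "sort": "price desc,id asc",  # must include the uniqueKey field as a tie-breaker
    "cursorMark": "*",            # "*" means "start at the beginning"
    "wt": "json",
}

while True:
    resp = requests.get(SOLR_URL, params=params).json()

    for doc in resp["response"]["docs"]:
        # Do something with each document; printing the id is just a placeholder.
        print(doc.get("id"))

    next_cursor = resp["nextCursorMark"]
    # When Solr returns the same mark that was sent, there are no more results.
    if next_cursor == params["cursorMark"]:
        break
    params["cursorMark"] = next_cursor
```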
This approach is ideal for infinite scrolling, but there are a few things to think about before using it for classic page-number pagination, since a cursor only moves forward and cannot jump straight to an arbitrary page :).
Here are some reference materials I found while solving this problem; I hope they help someone.