Solr / Solrj pagination

I use Solr and SolrJ to index and search content in the web application that I am creating. My request handler is configured in the solrconfig.xml file as follows:

    <requestHandler name="/select" class="solr.SearchHandler">
      <lst name="defaults">
        <str name="echoParams">explicit</str>
        <str name="start">0</str>
        <int name="rows">10</int>
        <str name="defType">edismax</str>
        <str name="qf">
          title^10.0 subtitle^7.0 abstract^5.0 content^1.0 text^1.0
        </str>
        <str name="pf">
          title^10.0 subtitle^7.0 abstract^5.0 content^1.0 text^1.0
        </str>
        <str name="df">text</str>
      </lst>
    </requestHandler>

Indexing and searching both work fine. However, I want to implement pagination. The configuration file contains the "start" and "rows" values. However, in SolrJ, when I run:

    SolrQuery query = new SolrQuery(searchTerm);
    System.out.println(query.getRequestHandler());
    System.out.println(query.getRows());
    System.out.println(query.getStart());

All three print statements show null. I understand that each of these getters has a corresponding setter, but I assumed the values would already be set from the request handler defaults in the solrconfig.xml file. Can someone explain?

2 answers

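Until the request is executed on the server, the client has no way of knowing what you configured on the server side, right? The defaults in solrconfig.xml are applied by the server when it handles the request; they are never copied into the client-side SolrQuery object. So it is not surprising that all three are null.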

To implement pagination, you need two parameters on the client side: the page number and the number of items per page. Once you have these two, you can build your client-side SolrQuery as follows:

    SolrQuery query = new SolrQuery(searchTerm);
    // pageNum is 1-based; Solr's start offset is 0-based
    query.setStart((pageNum - 1) * numItemsPerPage);
    query.setRows(numItemsPerPage);
    // execute the query on the server and get results
    QueryResponse res = solrServer.query(query);
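The offset arithmetic above is easy to get wrong by one page, so it may help to isolate it. The following is a minimal sketch in plain Java; the class name `PageMath` and both method names are hypothetical helpers for illustration and are not part of SolrJ.

```java
// Hypothetical helper for illustration only; not part of SolrJ.
final class PageMath {
    private PageMath() {}

    /** Zero-based "start" offset for a 1-based page number. */
    static int startOffset(int pageNum, int numItemsPerPage) {
        if (pageNum < 1 || numItemsPerPage < 1) {
            throw new IllegalArgumentException("pageNum and numItemsPerPage must be >= 1");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(pageNum - 1, numItemsPerPage);
    }

    /** Number of pages needed to show numFound hits (0 when there are no hits). */
    static long totalPages(long numFound, int numItemsPerPage) {
        if (numFound < 0 || numItemsPerPage < 1) {
            throw new IllegalArgumentException("numFound must be >= 0 and numItemsPerPage >= 1");
        }
        // Ceiling division: a partially filled last page still counts as a page
        return (numFound + numItemsPerPage - 1) / numItemsPerPage;
    }
}
```

With this, `query.setStart(PageMath.startOffset(pageNum, numItemsPerPage))` would replace the inline expression, and the total hit count reported by the response (for example via `res.getResults().getNumFound()`) could be passed to `totalPages` to render page links.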

As @arun said in his answer, the client does not know what you configured on the server side, so do not be surprised that the values are null. Beyond that, I would warn you about pagination problems that can occur in some situations.

Pagination is simple when you have few documents to read: all you have to do is adjust the start and rows parameters.

So, for a client that needs 50 results per page, page 1 is requested with start=0&rows=50, page 2 with start=50&rows=50, page 3 with start=100&rows=50, and so on. But for Solr to know which 50 documents to return starting from an arbitrary offset N, it has to build an internal queue of the first N + 50 sorted documents that match the query, so that it can throw away the first N documents and return the remaining 50. This means that the amount of memory needed to return paginated results grows linearly with the value of the start parameter.
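To make the cost concrete, here is a toy model in plain Java of the "collect the first N + rows, then discard the first N" behaviour described above. It is only a sketch under simplifying assumptions (documents are bare integer ids sorted ascending); it is not Solr's actual collector code, and `DeepPagingModel` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: this is not how Solr is implemented internally.
final class DeepPagingModel {
    private DeepPagingModel() {}

    /** Queue entries the model needs to serve one page: grows linearly with start. */
    static int queueEntriesNeeded(int numMatches, int start, int rows) {
        return Math.min(numMatches, start + rows);
    }

    /** Returns the ids on the requested page, with matches sorted ascending by id. */
    static List<Integer> page(List<Integer> matchingIds, int start, int rows) {
        int capacity = start + rows;
        // Max-heap bounded to `capacity`: it keeps the smallest start+rows ids seen so far.
        PriorityQueue<Integer> queue = new PriorityQueue<>(Collections.reverseOrder());
        for (int id : matchingIds) {
            queue.add(id);
            if (queue.size() > capacity) {
                queue.poll(); // evict the largest id; it cannot be among the first start+rows
            }
        }
        List<Integer> sorted = new ArrayList<>(queue);
        Collections.sort(sorted);
        // Throw away the first `start` entries and return what remains (at most `rows`).
        int from = Math.min(start, sorted.size());
        return new ArrayList<>(sorted.subList(from, sorted.size()));
    }
}
```

In this model, serving page 3 of 50 results (start=100) holds 150 entries in the queue even though only 50 are returned; at start=1,000,000 it would hold 1,000,050.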

Therefore, if you have many documents (hundreds of thousands or even millions), paging deep into the result set this way is not feasible.
It is the kind of thing that can bring your Solr server to its knees.

For typical applications that display search results to a human user, this tends not to be a big problem, since most users do not care about drilling down past the first handful of result pages. But for automated systems that want to crunch data for all documents matching a query, it can be seriously prohibitive.

This means that if you have a website with paginated search results, a real user will not go that far. On the other hand, think about what might happen if a spider or scraper tries to read every page of the website. Now we are talking about deep paging.

I suggest reading this amazing post:

https://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/

And take a look at this page of the documentation:

https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results

And here is an example that shows how to page through results with cursors.

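    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrQuery.ORDER;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.params.CursorMarkParams;

    // solrClient is assumed to be an already configured SolrJ client
    SolrQuery solrQuery = new SolrQuery();
    solrQuery.setRows(500);
    solrQuery.setQuery("*:*");
    // Pay attention to this line: cursor paging requires a sort that includes the uniqueKey field
    solrQuery.addSort("id", ORDER.asc);
    String cursorMark = CursorMarkParams.CURSOR_MARK_START;
    boolean done = false;
    while (!done) {
        solrQuery.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
        QueryResponse rsp = solrClient.query(solrQuery);
        String nextCursorMark = rsp.getNextCursorMark();
        for (SolrDocument d : rsp.getResults()) {
            // ...
        }
        // An unchanged cursor mark means there are no more results
        if (cursorMark.equals(nextCursorMark)) {
            done = true;
        }
        cursorMark = nextCursorMark;
    }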
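To see why the loop above terminates and why it avoids the deep-paging cost, it can help to mimic it without a server. The following is a toy in-memory sketch in plain Java: the cursor here is simply the last id seen, which is an assumption made for illustration and not how Solr encodes its cursorMark values. `CursorModel` and its methods are hypothetical names; only the `"*"` start value mirrors Solr's `CursorMarkParams.CURSOR_MARK_START`.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of cursor paging; not SolrJ and not Solr's cursorMark encoding.
final class CursorModel {
    static final String CURSOR_START = "*";

    private CursorModel() {}

    /** Returns up to `rows` ids that come after the cursor, from a list sorted ascending by id. */
    static List<Integer> fetch(List<Integer> sortedIds, String cursorMark, int rows) {
        boolean fromStart = CURSOR_START.equals(cursorMark);
        int lastSeen = fromStart ? 0 : Integer.parseInt(cursorMark);
        List<Integer> page = new ArrayList<>();
        // Linear scan for simplicity; a real index would seek directly to the cursor position.
        for (int id : sortedIds) {
            if (page.size() == rows) {
                break;
            }
            if (fromStart || id > lastSeen) {
                page.add(id);
            }
        }
        return page;
    }

    /** The next cursor is the last id on the page, or the same cursor when the page is empty. */
    static String nextCursorMark(String cursorMark, List<Integer> page) {
        return page.isEmpty() ? cursorMark : String.valueOf(page.get(page.size() - 1));
    }

    /** Same loop shape as the SolrJ example: stop when the cursor stops changing. */
    static List<Integer> readAll(List<Integer> sortedIds, int rows) {
        List<Integer> all = new ArrayList<>();
        String cursorMark = CURSOR_START;
        boolean done = false;
        while (!done) {
            List<Integer> page = fetch(sortedIds, cursorMark, rows);
            String next = nextCursorMark(cursorMark, page);
            all.addAll(page);
            if (cursorMark.equals(next)) {
                done = true;
            }
            cursorMark = next;
        }
        return all;
    }
}
```

Each request in this model only needs to know where the previous page ended, so the work per page stays bounded by `rows` rather than by how far into the result set the reader already is.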

Source: https://habr.com/ru/post/1485112/