Solr single index vs Solr multi core

I need help choosing between a single index in one Solr instance and multiple cores in one Solr instance, with each core serving its own index. I understand that one index in Solr is usually used for one type of document. What is the best practice when you have different types of documents? For example, to index invoice transaction information, you could create a schema with the following fields for an invoice transaction document:

  • Invoicedate
  • DueDate
  • invoiceSummary
  • billingContact
  • invoiceLineItems
  • notes

Suppose you also want to index product details. Would you create a new document type with a schema as follows:

  • Productcode
  • productDescription
  • sellingPrice
  • purchase price
  • Onhand
  • avgCost
  • notes

and create a new core in Solr to index product documents? Or would you combine the invoice transaction and the product into one schema as follows:

  • Invoicedate
  • DueDate
  • invoiceSummary
  • billingContact
  • invoiceLineItems
  • Productcode
  • productDescription
  • sellingPrice
  • purchase price
  • Onhand
  • avgCost
  • notes

and have a single core index the combined schema above, instead of having separate "Invoice" and "Product" cores that index two different document types?

I think a single flat index, as suggested in the Solr wiki, makes sense when the fields are similar. In the example above, however, the data is not even remotely related, since invoices and products are separate objects. I have seen people suggest adding an extra field to distinguish between the different objects, such as a table name field, and filtering each query on that field, which I think works. I'm not sure how well this scales for a use case like the one described below:

"Search invoices for the keyword "John"; the search fields are "billingContact", "invoiceSummary" and "notes". Boost the "billingContact" field at query time. Also search products for "John"; the search fields are "productDescription", "Vendor" and "notes". Boost "Vendor" at query time. Return only 100 invoices and 100 products."
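
To make the single-index variant of this use case concrete, here is a minimal sketch of the two requests it might translate into. It assumes a combined core named `combined`, a discriminator field named `doc_type`, a local Solr at `localhost:8983`, and a boost factor of 2; all of these are hypothetical and not part of the setup described here. It only builds the request URLs with standard Solr parameters (`defType=edismax`, `qf`, `fq`, `rows`) and sends nothing.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # assumed local Solr instance


def select_url(core, keyword, boosted, other_fields, doc_type=None, rows=100, boost=2):
    """Build an edismax /select URL. Nothing is sent over the network."""
    params = [
        ("q", keyword),
        ("defType", "edismax"),
        # qf lists the searched fields; "^boost" raises the weight of one of them
        ("qf", " ".join([f"{boosted}^{boost}"] + other_fields)),
        ("rows", rows),
    ]
    if doc_type is not None:
        # filter query on the hypothetical discriminator field
        params.append(("fq", f"doc_type:{doc_type}"))
    return f"{SOLR}/{core}/select?{urlencode(params)}"


# One request per document type, both against the same (hypothetical) core.
invoice_url = select_url("combined", "John", "billingContact",
                         ["invoiceSummary", "notes"], doc_type="invoice")
product_url = select_url("combined", "John", "Vendor",
                         ["productDescription", "notes"], doc_type="product")

print(invoice_url)
print(product_url)
```

Even with a single index, the two result sets need different boosts and separate 100-row limits, so in this sketch they are still two separate requests.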

The application I'm working on requires searching invoices and products from a single form. There are no separate parts of the application that search for different things.

My concerns about putting everything in one index are:

1) Large index size, for example: 50 million invoices + 50 million products in a single index

2) Re-indexing an index of this size.

3) Index tuning: wouldn't it be easier to tune each individual index to serve its expected search results, rather than trying to do it all in one index?

4) We also plan to index billing contact information in the future, which will add more fields to index and make the problems in points 1) and 2) worse.

1 answer

"Return only 100 invoices and 100 products."

and

"Boost the "billingContact" field at query time. [...] Boost "Vendor" at query time."

This suggests that even if you are looking for the same terms, you are looking for them as separate concepts.

Based on this and the lack of common fields, I would recommend starting with separate collections.
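
As a rough sketch of what separate collections could look like from the client side, the snippet below builds one edismax request per core, each with its own field weights and its own 100-row limit. The core names `invoices` and `products`, the host, and the boost values are assumptions made for illustration; the field names come from the question. It only constructs URLs and does not contact Solr.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # assumed local Solr instance

# Hypothetical core names mapped to per-field query-time weights.
CORES = {
    "invoices": {"billingContact": 2.0, "invoiceSummary": 1.0, "notes": 1.0},
    "products": {"Vendor": 2.0, "productDescription": 1.0, "notes": 1.0},
}


def build_requests(keyword, rows=100):
    """Return one /select URL per core, each with its own qf weights."""
    urls = {}
    for core, weights in CORES.items():
        qf = " ".join(f"{field}^{weight}" for field, weight in weights.items())
        params = {"q": keyword, "defType": "edismax", "qf": qf, "rows": rows}
        urls[core] = f"{SOLR}/{core}/select?{urlencode(params)}"
    return urls


urls = build_requests("John")
for core, url in urls.items():
    print(core, url)
```

Because each core holds only one document type, no discriminator filter is needed, and each core's schema and relevance settings can be tuned independently.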


Source: https://habr.com/ru/post/956835/
