What are the size limits of a Lucene index?

I am adding billions of rows to a Lucene index; each row is almost 6,000 bytes. Is there a limit on the maximum number of rows that can be added to a Lucene index? How much space do a billion rows of 6,000 bytes each take in the index, and is there a limit on that size?

1 answer

See the limitations section of the Lucene documentation. An index cannot contain more than:

  • ~274 billion unique terms (per index segment),
  • ~2.1 billion documents.
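
To see what the document limit means for "billions of rows", here is a minimal sketch that computes how many separate indexes (shards) a dataset would need. It assumes the "~2.1 billion" figure is the largest signed 32-bit integer, 2^31 - 1; the exact cap enforced by a particular Lucene version may be slightly lower. The name `min_shards` is hypothetical and not part of any Lucene API.

```python
# Approximation of the "~2.1 billion documents" limit (largest signed 32-bit int).
# Assumption: the exact cap in a given Lucene version may be slightly lower.
MAX_DOCS_PER_INDEX = 2**31 - 1


def min_shards(total_docs: int, max_docs: int = MAX_DOCS_PER_INDEX) -> int:
    """Return the smallest number of separate indexes that can hold total_docs."""
    if total_docs < 0:
        raise ValueError("total_docs must be non-negative")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return max(1, (total_docs + max_docs - 1) // max_docs)


if __name__ == "__main__":
    for docs in (1_000_000_000, 5_000_000_000):
        print(f"{docs:,} documents -> at least {min_shards(docs)} index(es)")
```

Under these assumptions, one billion rows fit in a single index, while five billion rows need at least three.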

For datasets this large, it is generally recommended to use Lucene only for the inverted index and to store the actual document contents elsewhere. You can expect the index to be roughly 30% of the size of the original documents, provided these are ordinary text documents; documents containing code, with many unique terms, will produce a much larger index.
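
Applied to the numbers in the question, the ~30% rule of thumb gives a back-of-the-envelope estimate. The sketch below is only that arithmetic: the function name `estimate_sizes` and the `index_ratio` parameter are hypothetical, and the real ratio depends on the analyzer, stored fields, and how many unique terms the data contains.

```python
def estimate_sizes(num_docs: int, bytes_per_doc: int, index_ratio: float = 0.30):
    """Return (raw_bytes, estimated_index_bytes) for a dataset.

    index_ratio is the rule-of-thumb index size relative to the raw data;
    0.30 is an assumption taken from the estimate above, not a guarantee.
    """
    raw_bytes = num_docs * bytes_per_doc
    index_bytes = raw_bytes * index_ratio
    return raw_bytes, index_bytes


if __name__ == "__main__":
    raw, index = estimate_sizes(1_000_000_000, 6_000)
    # Decimal terabytes (1 TB = 10**12 bytes).
    print(f"raw: {raw / 1e12:.1f} TB, index: {index / 1e12:.1f} TB")
    # prints "raw: 6.0 TB, index: 1.8 TB"
```

So a billion rows of 6,000 bytes amount to about 6 TB of raw data, and an inverted index over ordinary text of that size would be on the order of 1.8 TB.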


Source: https://habr.com/ru/post/919750/

