How to make inverted index search faster?

I am designing a full-text search architecture. One of the requirements is processing queries over large data sets with a short response time. The only approach I have found so far is to partition the inverted index. There are two strategies for this: term-based partitioning and document-based partitioning. Is there any other way to speed up inverted index search over large data sets?

1 answer

This video is a talk by Shay Banon, the developer of ElasticSearch, a distributed full-text search engine. In the video, he discusses the pros and cons of term-based partitioning and document-based partitioning.

Basically, term-based partitioning generates too much network traffic between processes/nodes, and it is harder to implement cleanly. Document-based partitioning is much simpler to implement and still produces results.
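To make the document-based approach concrete, here is a minimal in-memory sketch. It is not how ElasticSearch or any other engine implements it; the class names `Shard` and `DocPartitionedIndex`, the whitespace tokenizer, the integer document ids, and the modulo routing are all assumptions made for illustration. Each shard keeps a complete inverted index for its own documents only, so a query is sent to every shard and the partial results are merged.

```python
from collections import defaultdict


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


class Shard:
    """Hypothetical partition holding a full inverted index for its own documents."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """AND query answered entirely from this shard's local postings."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


class DocPartitionedIndex:
    """Hypothetical coordinator: routes documents to shards and merges query results."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, doc_id, text):
        # Assumed routing rule: integer doc ids, assigned by modulo.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, query):
        hits = set()
        # Scatter-gather: a real system would query the shards in parallel.
        for shard in self.shards:
            hits |= shard.search(query)
        return sorted(hits)


index = DocPartitionedIndex(num_shards=3)
index.add(0, "inverted index search")
index.add(1, "distributed search engine")
index.add(2, "inverted index partition")
index.add(3, "search inverted index fast")
print(index.search("inverted index"))
```

Because every shard can intersect its postings lists locally, only the final hits cross the network. With term-based partitioning, each node would instead hold the full postings list for a subset of terms, so a multi-term query would have to move postings lists between nodes before they could be intersected, which is where the extra traffic comes from.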

In addition, in this lecture, Jeffrey Dean also explains the differences and says that Google uses document-based partitioning.

These are the two main ways to distribute a search engine; I am not aware of any others. In any case, you may want to look through the information retrieval literature for recent work on this subject.


Source: https://habr.com/ru/post/1393097/

