Elasticsearch: penalizing a document's score based on occurrence

I am trying to find a way to prevent multiple posts by the same author from appearing in search results. So far I have tried random scoring, which lets me keep pagination. However, I can still get the same author up to 4 times on a page of 10 results.
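For illustration, the seeded random scoring mentioned above could be built roughly like this. This is only a sketch: the helper name `random_page_query`, the `match_all` inner query and the `_seq_no` field are assumptions (newer Elasticsearch versions expect a `field` next to `seed`; older ones take `seed` alone).

```python
def random_page_query(seed, page, page_size=10):
    """Build a search body that shuffles hits using a fixed seed."""
    return {
        # Standard from/size pagination
        "from": page * page_size,
        "size": page_size,
        "query": {
            "function_score": {
                # Assumed inner query; replace with the real search
                "query": {"match_all": {}},
                # Reusing the same seed on every request is what keeps
                # the shuffled order consistent from page to page
                "random_score": {"seed": seed, "field": "_seq_no"},
            }
        },
    }


body = random_page_query(seed=42, page=2)
```

Nothing in this body knows about authors, which is why several posts by one author can still land on the same page.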

Is there a way to penalize a document based on how many times a particular field value occurs in the result set? As far as I know, you cannot store a variable or object inside a scoring script.

I have looked at several ways to achieve this, but each has significant drawbacks. For example, I could remove the duplicates and query again for a new set of results that excludes the authors already listed. However, that second query may itself return several posts by the same author, so I would end up querying one document at a time to replace the duplicate authors in the result set. This also breaks deep pagination, because the second result set used to replace the duplicates eventually runs pages ahead of the standard search. I also tried aggregations, which do not work with pagination.
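A minimal sketch of the client-side de-duplication step described above, to show where the gap comes from. The hit structure (dicts with `id` and `author` keys) and the helper name `cap_per_author` are made up for the example.

```python
from collections import Counter


def cap_per_author(hits, max_per_author=1):
    """Keep at most max_per_author hits per author, preserving order."""
    seen = Counter()
    kept = []
    for hit in hits:
        author = hit["author"]
        if seen[author] < max_per_author:
            seen[author] += 1
            kept.append(hit)
    return kept


# Hypothetical page of results, already sorted by score
page = [
    {"id": 1, "author": "a"},
    {"id": 2, "author": "b"},
    {"id": 3, "author": "a"},
    {"id": 4, "author": "c"},
    {"id": 5, "author": "a"},
]
deduped = cap_per_author(page)
# Number of slots that now have to be filled by a follow-up query
missing = len(page) - len(deduped)
```

Every page that loses hits this way has to borrow replacements from somewhere else, and that borrowing is what makes the offsets of the two queries drift apart.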

Is there any functionality for penalizing or reducing a document's score depending on how many times a document by the same author (or with the same field value) appears?

+6
3 answers

You cannot diversify Elasticsearch's result ordering. You can only use random scoring with a seed and hope for the best. You can use something like a top hits aggregation to bucket the results per author, but you cannot paginate a set of buckets. So no pagination.

See here for more details.
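As a rough illustration of the top hits approach, a request body could look like the one below. The field name `author_id` and the aggregation names `by_author` and `top_post` are assumptions made for the example.

```python
def top_post_per_author_query(author_field="author_id", max_authors=10):
    """Build a body that returns the best-scoring post of each author."""
    return {
        # No regular hits; the results come from the aggregation
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {
            "by_author": {
                # One bucket per distinct author value
                "terms": {"field": author_field, "size": max_authors},
                "aggs": {
                    # Keep only the single top hit inside each bucket
                    "top_post": {"top_hits": {"size": 1}}
                },
            }
        },
    }


body = top_post_per_author_query()
```

The `terms` aggregation takes a `size` but no offset, which is the pagination limitation mentioned in this answer.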

0

Why can't you use grouping? Just group by user and define the order within the group.
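The answer does not say which grouping feature it means. One possible reading, in newer Elasticsearch releases only, is field collapsing through the `collapse` search parameter, sketched below. The field name `author_id` and the helper name are assumptions, and the field has to be a keyword or numeric field with doc values.

```python
def collapsed_page_query(page, page_size=10, author_field="author_id"):
    """Build a body that returns at most one hit per author per result set."""
    return {
        "from": page * page_size,
        "size": page_size,
        "query": {"match_all": {}},
        # Collapse hits that share the same author value into one hit
        "collapse": {"field": author_field},
    }


body = collapsed_page_query(page=1)
```

With collapsing, the reported total hit count still refers to the uncollapsed documents, so page counts derived from it can be too high.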

0

EDIT: before you downvote this answer just because it is about Lucene and is not a real answer to the question: 1. Elasticsearch is based on Lucene. 2. What the OP wants to do is really hard to do, and I was just trying to help.

You can try the decay functions described here:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-function-score-query.html

However, this does not allow a back-reference to previous hits from the current request (which the technique would need in order to match your use case).
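For reference, a decay function inside a function_score query looks roughly like this. The date field `created_at` and the scale value are assumptions, and the syntax follows recent Elasticsearch versions rather than the 0.90 page linked above.

```python
def decay_query(date_field="created_at", scale="10d"):
    """Build a body that lowers the score of older documents."""
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": [
                    {
                        # Gaussian decay: the score multiplier drops to
                        # `decay` at a distance of `scale` from `origin`
                        "gauss": {
                            date_field: {
                                "origin": "now",
                                "scale": scale,
                                "decay": 0.5,
                            }
                        }
                    }
                ]
            }
        }
    }


body = decay_query()
```

The decay is computed from a field of each document on its own, so it cannot take into account how many other hits by the same author are already in the result set.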

I ran into a similar issue in a web app in which we used Lucene / Hibernate Search. I never got a satisfactory result, and that still bothers me.

I think it is better to aim for a good user experience by implementing the ordering differently.

-1

Source: https://habr.com/ru/post/979300/

