Finding duplicate field values in Elasticsearch

Using Elasticsearch 0.19.4 (I know this is old, but it's what the dependency requires).

I have a "digest" field in the Elasticsearch index, and I would like to run a query that returns all cases where the same digest value appears more than once. Can this be done?

For entries that have duplicate digest values, I would also like to return other fields, such as "url", whose values are never duplicated.

1 answer

You can use Terms Aggregation for this.

    POST <index>/<type>/_search?search_type=count
    {
      "aggs": {
        "duplicateNames": {
          "terms": {
            "field": "digest",
            "size": 0,
            "min_doc_count": 2
          }
        }
      }
    }

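This will return every digest value that appears in at least two documents. Note that aggregations were only introduced in Elasticsearch 1.0, so this will not work as-is on 0.19.4; I realize that makes it a poor fit for your setup, but it may help if you are able to upgrade.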
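To get from the duplicate digests to the "url" values the question asks for, something like the following Python sketch might work. It assumes the aggregation response has the usual `aggregations.<name>.buckets` shape with `key` and `doc_count` entries, and that the matching documents have already been fetched as plain dicts with "digest" and "url" keys. The function names and sample data are hypothetical, and the client-side grouping is a fallback I am suggesting here, not something Elasticsearch does for you.

```python
from collections import defaultdict


def duplicate_digests(agg_response, agg_name="duplicateNames"):
    """Pull the duplicated digest values out of a terms-aggregation response."""
    buckets = agg_response["aggregations"][agg_name]["buckets"]
    # min_doc_count already filters server-side; the check is kept as a guard.
    return [b["key"] for b in buckets if b["doc_count"] >= 2]


def urls_by_digest(docs):
    """Group urls by digest client-side and keep digests seen at least twice."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["digest"]].append(doc["url"])
    return {digest: urls for digest, urls in groups.items() if len(urls) >= 2}


# Hypothetical sample data, for illustration only.
sample_response = {
    "aggregations": {
        "duplicateNames": {
            "buckets": [{"key": "abc123", "doc_count": 2}]
        }
    }
}
sample_docs = [
    {"digest": "abc123", "url": "http://example.com/a"},
    {"digest": "abc123", "url": "http://example.com/b"},
    {"digest": "def456", "url": "http://example.com/c"},
]

dupes = duplicate_digests(sample_response)
grouped = urls_by_digest(sample_docs)
print(dupes)
print(grouped)
```

The second function does not depend on aggregations at all, so it could also be applied to documents retrieved by any other means on an older cluster, at the cost of pulling every document to the client.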


Source: https://habr.com/ru/post/1491135/

