Filtering a top_hits sub-aggregation based on the total number of documents in the bucket

Question


I am doing map clustering with the geohash grid aggregation in Elasticsearch. A request returns 100-200 buckets on average. Each bucket has a top_hits sub-aggregation, which I use to return 3 documents for each aggregated cluster.

The problem is that I want to return top_hits only when the bucket of the parent aggregation (geohash_grid) contains no more than three documents.

If a cluster contains more than three documents, I do not want Elasticsearch to return any documents for that cluster (because I will not use them).

I tried using the Bucket Selector Aggregation, but could not work out the correct buckets_path. I put the bucket selector aggregation at the same level as the top_hits aggregation. The total number of documents for the bucket is available in top_hits.hits.total, but I get reason=path not supported for [top_hits]: [hits.total].

Is this possible in Elasticsearch? This matters to me because in most queries only a small percentage of buckets will have three or fewer documents, yet the top_hits sub-aggregation always returns the top 3 documents, even for clusters of 1000 documents. If the query returns 200 buckets and only 5 of them aggregate <= 3 documents, I want to return only 5 * 3 documents, not 200 * 3 (the response in this case is 10 MB).

Here is part of my request:

 "clusters": { "geohash_grid": { "field": "coordinates", "precision": 3 }, "aggs": { "top_hits": { "top_hits": { "size": 3 } }, "top_hits_filter": { "bucket_selector": { "buckets_path": { "total_hits": "top_hits._count" // tried top_hits.hits.total }, "script": { "inline": "total_hits <= 3" } } } } }


elasticsearch elasticsearch-5

mbudnik Mar 11 '16 at 11:01


1 answer

Andrei Stefan · Answer 1 · 2017-12-08T07:00:40+0000

Try this @ilivewithian:

  "aggs": { "clusters": { "geohash_grid": { "field": "coordinates", "precision": 3 }, "aggs": { "top_hits": { "top_hits": { "size": 3 } }, "top_hits_filter": { "bucket_selector": { "buckets_path": { "total_hits": "_count" }, "script": { "inline": "params.total_hits <= 3" } } } } } }
