First of all: I now notice that what I wrote earlier does not actually define an aggregation. I find the documentation on how to do this hard to read, so I will expand on what I wrote above. I am also changing the index name to make for a nicer example.
    from datetime import datetime
    from elasticsearch_dsl import DocType, String, Date, Integer
    from elasticsearch_dsl.connections import connections

    from elasticsearch import Elasticsearch
    from elasticsearch_dsl import Search, Q
The by_house aggregation creates one bucket per house number, so the key of each bucket is the house number. Elasticsearch (ES) always reports the number of documents that fall into each bucket. Setting size to 0 in the terms aggregation asks for all buckets, because ES otherwise returns only the top 10 by default (or whatever default its developer configured).
My earlier mistake was to suggest that an Elasticsearch query has aggregations by default. It does not: you define them yourself and then execute the search. The response can then be split up by the aggregation names you defined.
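To make "split up by aggregation name" concrete, here is a minimal sketch that reads one named aggregation out of a raw response dict. It uses only plain Python, the helper name `bucket_counts` is my own, and the sample response is made-up data in the shape a terms aggregation returns.

```python
def bucket_counts(response, agg_name):
    """Return [(key, doc_count), ...] for one named aggregation."""
    agg = response["aggregations"][agg_name]
    return [(bucket["key"], bucket["doc_count"]) for bucket in agg["buckets"]]


# Hypothetical response (made-up numbers), not real output from any cluster.
sample_response = {
    "hits": {"total": 4, "hits": []},  # "size": 0, so no hits come back
    "aggregations": {
        "by_house": {
            "buckets": [
                {"key": 1, "doc_count": 3},
                {"key": 2, "doc_count": 1},
            ]
        }
    },
}

counts = bucket_counts(sample_response, "by_house")
for house_number, doc_count in counts:
    print(house_number, doc_count)
```

The name you look up ("by_house" here) is whatever name you chose when you defined the aggregation.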
The equivalent raw request (what you would send with curl) should look like this:
Note: I am using Sense, the Elasticsearch extension for Google Chrome. In Sense you can use // for comments.
    POST /airbnb/sleep_overs/_search
    {
      // the size 0 here means: do not return any hits, just the aggregation part of the result
      "size": 0,
      "aggs": {
        "by_house": {
          "terms": {
            // the size 0 here means: return all buckets, not just the default 10
            "field": "house_number",
            "size": 0
          }
        }
      }
    }
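If you want to run this from a terminal instead of Sense, one option is to build the body in Python and let `json.dumps` produce plain JSON without the // comments. This is only a sketch: the host `localhost:9200` is an assumption, so replace it with your own node.

```python
import json
import shlex

# Assumption: a local node; replace with your own host.
ES_URL = "http://localhost:9200/airbnb/sleep_overs/_search"

body = {
    "size": 0,  # no hits, only the aggregation part
    "aggs": {
        "by_house": {
            # size 0 = all buckets, as in the request above
            "terms": {"field": "house_number", "size": 0}
        }
    },
}

# json.dumps emits strict JSON, so no comments end up in the payload.
payload = json.dumps(body)

# shlex.quote keeps the JSON intact when pasted into a POSIX shell.
curl_command = "curl -XPOST {} -H {} -d {}".format(
    shlex.quote(ES_URL),
    shlex.quote("Content-Type: application/json"),
    shlex.quote(payload),
)
print(curl_command)
```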
Workaround. Someone from the DSL project on GitHub told me to forget about translating the query and just use this method. It is simpler, and you can write the hard parts as raw JSON, exactly as you would for curl. That is why I call it a workaround.
    # Define a default Elasticsearch client
    client = connections.create_connection(hosts=['http://blahblahblah:9200'])

    # How simple: we just paste the request body here
    body = {
        "size": 0,
        "aggs": {
            "by_house": {
                "terms": {
                    "field": "house_number",
                    "size": 0
                }
            }
        }
    }

    # Build the search from the raw body, then point it at the index and type
    s = Search.from_dict(body).using(client)
    s = s.index("airbnb")
    s = s.doc_type("sleep_overs")

    # Optional: inspect the final request as a dict
    body = s.to_dict()

    t = s.execute()

    for item in t.aggregations.by_house.buckets:
        # item.key will be the house number
        print(item.key, item.doc_count)
Hope this helps. I now design everything as a raw request first, and then use Python to pick through the results and get what I want. This is especially useful for multi-level aggregations (sub-aggregations).
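For sub-aggregations, picking through the results mostly means walking nested buckets. Below is a minimal sketch in plain Python. It assumes list-style buckets, as terms aggregations return them. The `by_guest` level, the guest names, and all the numbers are made up for illustration, and `iter_leaf_buckets` is my own helper name.

```python
def iter_leaf_buckets(agg, path=()):
    """Yield (keys_along_path, doc_count) for every innermost bucket."""
    for bucket in agg["buckets"]:
        keys = path + (bucket["key"],)
        # A sub-aggregation appears as a dict value with its own "buckets".
        children = [
            value for value in bucket.values()
            if isinstance(value, dict) and "buckets" in value
        ]
        if not children:
            yield keys, bucket["doc_count"]
        for child in children:
            yield from iter_leaf_buckets(child, keys)


# Hypothetical "aggregations" part of a response: by_house, then by_guest.
sample_aggregations = {
    "by_house": {
        "buckets": [
            {"key": 1, "doc_count": 3, "by_guest": {"buckets": [
                {"key": "ann", "doc_count": 2},
                {"key": "bob", "doc_count": 1},
            ]}},
            {"key": 2, "doc_count": 1, "by_guest": {"buckets": [
                {"key": "ann", "doc_count": 1},
            ]}},
        ]
    }
}

rows = list(iter_leaf_buckets(sample_aggregations["by_house"]))
for keys, doc_count in rows:
    print(keys, doc_count)
```

Each row carries the bucket keys from the outer level down to the inner one, so the output is easy to load into a table or a CSV file.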