Elasticsearch - generic facets structure - calculating aggregations combined with filters

In our new project, we were inspired by this article http://project-a.imtqy.com/on-site-search-design-patterns-for-e-commerce/#generic-faceted-search when building our "facet" structure. While I have it working to the extent the article describes, I have run into problems getting it to work when facets are selected. I hope someone can give a hint about something to try, so I don't have to redo all of our aggregations as separate aggregation calculations again.

The problem is that we use a single aggregation to calculate all the "facets" at once, but when I add a filter (for example, on a brand name), it "removes" all the other brands from the returned aggregations. What I want is for that brand to be used as a filter when calculating the other facets, but not when calculating the brand aggregations. This is necessary so that the user can, for example, select several brands.
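To make the desired behavior concrete, here is a small in-memory sketch in Python (no Elasticsearch involved). The product data, the extra brands "Nike" and "Puma", and the function names are hypothetical and only illustrate the counting rule: each facet is counted with every selected filter applied except its own.

```python
from collections import Counter

# Hypothetical in-memory products; each maps a facet name to its set of values.
PRODUCTS = [
    {"Brand": {"Adidas"}, "Property": {"Organic", "Without parfume"}},
    {"Brand": {"Adidas"}, "Property": {"Organic"}},
    {"Brand": {"Nike"}, "Property": {"Organic"}},
    {"Brand": {"Nike"}, "Property": {"Without parfume"}},
    {"Brand": {"Puma"}, "Property": {"Organic"}},
]


def matches(product, selected):
    """True if the product has at least one selected value for every selected facet."""
    return all(product.get(name, set()) & values for name, values in selected.items())


def facet_counts(products, selected):
    """Count facet values, ignoring each facet's own selection while counting it."""
    facet_names = {name for product in products for name in product}
    counts = {}
    for name in facet_names:
        # Apply every selected filter except the one on the facet being counted.
        others = {n: v for n, v in selected.items() if n != name}
        counter = Counter()
        for product in products:
            if matches(product, others):
                counter.update(product.get(name, set()))
        counts[name] = dict(counter)
    return counts


counts = facet_counts(PRODUCTS, {"Brand": {"Adidas"}, "Property": {"Organic"}})
print(counts["Brand"])     # brands are still listed even though only Adidas is selected
print(counts["Property"])  # properties are counted over Adidas products only
```

With "Adidas" and "Organic" selected, the Brand facet still lists the other brands (counted over organic products), which is the multi-select behavior I am after.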

Looking at https://www.contorion.de/search/Metabo_Fein/ou1-ou2?q=Winkelschleifer&c=bovy (the site described in the article), I selected the manufacturers (Hersteller) "Metabo" and "Fein", and when I expand the Hersteller menu it shows all manufacturers, not just the selected ones. So I know this is possible, and I hope someone has a hint on how to write the aggregations / filters so that I get the "correct e-commerce facet behavior".

On the products in ES, I have the following structure (the same as in the original article, although "C#-ified" in naming):

"attributeStrings": [
    { "facetName": "Property", "facetValue": "Organic" },
    { "facetName": "Property", "facetValue": "Without parfume" },
    { "facetName": "Brand", "facetValue": "Adidas" }
]

So the above product has two attribute / facet groups: Property with two values (Organic, Without parfume) and Brand with one value (Adidas). Without any filters, I calculate the aggregations with the following query:

  "aggs": { "agg_attr_strings_filter": { "filter": {}, "aggs": { "agg_attr_strings": { "nested": { "path": "attributeStrings" }, "aggs": { "attr_name": { "terms": { "field": "attributeStrings.facetName" }, "aggs": { "attr_value": { "terms": { "field": "attributeStrings.facetValue", "size": 1000, "order": [ { "_term": "asc" } ] } } } } } } } } 

Now, if I select Property "Organic" and Brand "Adidas", I build the same aggregation, but with a filter that applies those two constraints (which is kind of wrong...):

  "aggs": { "agg_attr_strings_filter": { "filter": { "bool": { "filter": [ { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Property" } } }, { "terms": { "attributeStrings.facetValue": [ "Organic" ] } } ] } }, "path": "attributeStrings" } }, { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Brand" } } }, { "terms": { "attributeStrings.facetValue": [ "Adidas" ] } } ] } }, "path": "attributeStrings" } } ] } }, "aggs": { "agg_attr_strings": { "nested": { "path": "attributeStrings" }, "aggs": { "attr_name": { "terms": { "field": "attributeStrings.facetName" }, "aggs": { "attr_value": { "terms": { "field": "attributeStrings.facetValue", "size": 1000, "order": [ { "_term": "asc" } ] } } } } } } } } 

The only way I can see forward with this model is to calculate the aggregation for each selected facet and somehow merge the results. But that seems very complex and rather defeats the point of having the model described in the article, so I hope there is a cleaner solution and that someone can give a hint about something to try.

+10
3 answers

"The only way I can see forward with this model is to calculate the aggregation for each selected facet and somehow merge the results."

That is exactly right. If a facet is selected (for example, Brand), you cannot use a global brand filter if you also want the other brands to remain available for multi-selection. What you can do is apply all the other filters to each selected facet, and all filters to the unselected facets. As a result, you end up with n+1 separate aggregations for n selected filters: the first one for all facets, and the rest for the selected facets.
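As a rough illustration of how these n+1 aggregations could be generated, here is a minimal Python sketch. The helper names (`facet_filter`, `combine`, `build_facet_aggs`) and the `special_agg_<facet>` naming rule are assumptions made for this example; the inner facet-name filter uses a plain `term` query and the sort order is left out, so adapt both to your Elasticsearch version.

```python
import json

NESTED_PATH = "attributeStrings"  # nested field name taken from the question


def facet_filter(name, values):
    """Nested filter: the document has facet `name` with one of `values`."""
    return {
        "nested": {
            "path": NESTED_PATH,
            "query": {"bool": {"filter": [
                {"term": {NESTED_PATH + ".facetName": {"value": name}}},
                {"terms": {NESTED_PATH + ".facetValue": list(values)}},
            ]}},
        }
    }


def combine(filters):
    """AND the filters together; fall back to match_all when there are none."""
    return {"bool": {"filter": filters}} if filters else {"match_all": {}}


def values_agg():
    return {"terms": {"field": NESTED_PATH + ".facetValue", "size": 1000}}


def build_facet_aggs(selected):
    """Build n+1 aggregations for n selected facets (dict: facet name -> values)."""
    # General aggregation: all filters applied, covers the unselected facets.
    aggs = {
        "agg_attr_strings_filter": {
            "filter": combine([facet_filter(n, v) for n, v in selected.items()]),
            "aggs": {"agg_attr_strings": {
                "nested": {"path": NESTED_PATH},
                "aggs": {"attr_name": {
                    "terms": {"field": NESTED_PATH + ".facetName"},
                    "aggs": {"attr_value": values_agg()},
                }},
            }},
        }
    }
    # One special aggregation per selected facet: all filters except its own.
    for name in selected:
        others = [facet_filter(n, v) for n, v in selected.items() if n != name]
        agg_name = "special_agg_" + name.lower()  # hypothetical naming rule
        aggs[agg_name] = {
            "filter": combine(others),
            "aggs": {agg_name: {
                "nested": {"path": NESTED_PATH},
                "aggs": {"agg_filtered_special": {
                    "filter": {"term": {NESTED_PATH + ".facetName": name}},
                    "aggs": {"facet_value": values_agg()},
                }},
            }},
        }
    return {"aggs": aggs}


query = build_facet_aggs({"Property": ["Organic"], "Brand": ["Adidas"]})
print(json.dumps(query, indent=2))
```

For two selected facets this produces three top-level aggregations: the general one plus one per selected facet.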

In your case, the query might look like this:

 { "aggs": { "agg_attr_strings_filter": { "filter": { "bool": { "filter": [ { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Property" } } }, { "terms": { "attributeStrings.facetValue": [ "Organic" ] } } ] } }, "path": "attributeStrings" } }, { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Brand" } } }, { "terms": { "attributeStrings.facetValue": [ "Adidas" ] } } ] } }, "path": "attributeStrings" } } ] } }, "aggs": { "agg_attr_strings": { "nested": { "path": "attributeStrings" }, "aggs": { "attr_name": { "terms": { "field": "attributeStrings.facetName" }, "aggs": { "attr_value": { "terms": { "field": "attributeStrings.facetValue", "size": 1000, "order": [ { "_term": "asc" } ] } } } } } } } }, "special_agg_property": { "filter": { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Brand" } } }, { "terms": { "attributeStrings.facetValue": [ "Adidas" ] } } ] } }, "path": "attributeStrings" } }, "aggs": { "special_agg_property": { "nested": { "path": "attributeStrings" }, "aggs": { "agg_filtered_special": { "filter": { "query": { "match": { "attributeStrings.facetName": "Property" } } }, "aggs": { "facet_value": { "terms": { "size": 1000, "field": "attributeStrings.facetValue" } } } } } } } }, "special_agg_brand": { "filter": { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Property" } } }, { "terms": { "attributeStrings.facetValue": [ "Organic" ] } } ] } }, "path": "attributeStrings" } }, "aggs": { "special_agg_brand": { "nested": { "path": "attributeStrings" }, "aggs": { "agg_filtered_special": { "filter": { "query": { "match": { "attributeStrings.facetName": "Brand" } } }, "aggs": { "facet_value": { "terms": { "size": 1000, "field": "attributeStrings.facetValue" } } } } } } } } } } 

This query looks very large and intimidating, but generating such a query can be done in a few dozen lines of code. When parsing the query results, first parse the general aggregation (the one that uses all filters) and then the special facet aggregations. In the example above, first parse the results from agg_attr_strings_filter; these results also contain aggregation values for Brand and Property, which must be overwritten with the aggregation values from special_agg_property and special_agg_brand. The query is also efficient, because Elasticsearch does a good job of caching the individual filter clauses, so applying the same filters in different parts of the query should be cheap.
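The parsing order described above can be sketched as follows. This is a minimal Python example under the assumption that the response's "aggregations" object mirrors the aggregation names used in the query; `merge_facets` and the sample response are hypothetical and hand-written for illustration, not real Elasticsearch output.

```python
def buckets_to_counts(buckets):
    """Turn a list of terms buckets into a {value: doc_count} dict."""
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


def merge_facets(aggregations, selected_names):
    """Read the general aggregation first, then overwrite the selected facets."""
    facets = {}
    general = aggregations["agg_attr_strings_filter"]["agg_attr_strings"]["attr_name"]
    for name_bucket in general["buckets"]:
        facets[name_bucket["key"]] = buckets_to_counts(name_bucket["attr_value"]["buckets"])
    for name in selected_names:
        agg_name = "special_agg_" + name.lower()
        special = aggregations[agg_name][agg_name]["agg_filtered_special"]["facet_value"]
        # The special aggregation ignores the facet's own filter, so it wins.
        facets[name] = buckets_to_counts(special["buckets"])
    return facets


# Hand-written sample in the shape of the "aggregations" part of a response.
sample = {
    "agg_attr_strings_filter": {"doc_count": 2, "agg_attr_strings": {"doc_count": 5, "attr_name": {"buckets": [
        {"key": "Brand", "doc_count": 2, "attr_value": {"buckets": [
            {"key": "Adidas", "doc_count": 2}]}},
        {"key": "Property", "doc_count": 3, "attr_value": {"buckets": [
            {"key": "Organic", "doc_count": 2}, {"key": "Without parfume", "doc_count": 1}]}},
    ]}}},
    "special_agg_property": {"doc_count": 2, "special_agg_property": {"doc_count": 5, "agg_filtered_special": {
        "doc_count": 3, "facet_value": {"buckets": [
            {"key": "Organic", "doc_count": 2}, {"key": "Without parfume", "doc_count": 1}]}}}},
    "special_agg_brand": {"doc_count": 3, "special_agg_brand": {"doc_count": 7, "agg_filtered_special": {
        "doc_count": 3, "facet_value": {"buckets": [
            {"key": "Adidas", "doc_count": 2}, {"key": "Nike", "doc_count": 1}]}}}},
}

merged = merge_facets(sample, ["Property", "Brand"])
print(merged)
```

In the sample, the general aggregation only knows about "Adidas" for Brand, while the merged result also contains the other brand from special_agg_brand.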

"But that seems very complex and rather defeats the point of having the model described in the article, so I hope there is a cleaner solution and that someone can give a hint about something to try."

There really is no way around the fact that you need to apply different filters to different facets while also having different query filters. If you need to support the "correct e-commerce facet behavior", you will have a complex query :)

Disclaimer: I am a co-author of the article.

+14

The problem arises from the fact that you add the filter on Property and Organic inside your aggregation, so the more facets you select, the more you restrict the terms you get back. In the article, the filter they use is actually a post_filter; both names were allowed until recently, but filter was removed because it caused ambiguity.

What you need to do is move that filter out of the aggregations and into the post_filter section, so that the results are correctly filtered by whatever facets are selected, while all your facets are still correctly calculated on the entire document set.

 { "post_filter": { "bool": { "filter": [ { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Property" } } }, { "terms": { "attributeStrings.facetValue": [ "Organic" ] } } ] } }, "path": "attributeStrings" } }, { "nested": { "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Brand" } } }, { "terms": { "attributeStrings.facetValue": [ "Adidas" ] } } ] } }, "path": "attributeStrings" } } ] } }, "aggs": { "agg_attr_strings_full": { "nested": { "path": "attributeStrings" }, "aggs": { "attr_name": { "terms": { "field": "attributeStrings.facetName" }, "aggs": { "attr_value": { "terms": { "field": "attributeStrings.facetValue", "size": 1000, "order": [ { "_term": "asc" } ] } } } } } }, "agg_attr_strings_filtered": { "filter": { "bool": { "filter": [ { "nested": { "path": "attributeStrings", "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Property" } } }, { "terms": { "attributeStrings.facetValue": [ "Organic" ] } } ] } } } }, { "nested": { "path": "attributeStrings", "query": { "bool": { "filter": [ { "term": { "attributeStrings.facetName": { "value": "Brand" } } }, { "terms": { "attributeStrings.facetValue": [ "Adidas" ] } } ] } } } } ] } }, "aggs": { "agg_attr_strings": { "nested": { "path": "attributeStrings" }, "aggs": { "attr_name": { "terms": { "field": "attributeStrings.facetName" }, "aggs": { "attr_value": { "terms": { "field": "attributeStrings.facetValue", "size": 1000, "order": [ { "_term": "asc" } ] } } } } } } } } } } 
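
The idea of keeping the selection out of the aggregations can be sketched in Python as below. The function name `build_search_body` and its parameters are hypothetical, and the sort order from the query above is omitted; the point is only that the selection filters go into post_filter while the facet aggregation stays unfiltered.

```python
import json


def build_search_body(query, selection_filters):
    """Return a search body whose facet aggregation ignores the selection filters."""
    body = {
        "query": query,
        "aggs": {
            "agg_attr_strings_full": {
                "nested": {"path": "attributeStrings"},
                "aggs": {"attr_name": {
                    "terms": {"field": "attributeStrings.facetName"},
                    "aggs": {"attr_value": {
                        "terms": {"field": "attributeStrings.facetValue", "size": 1000}
                    }},
                }},
            }
        },
    }
    if selection_filters:
        # post_filter narrows the returned hits only; the aggregations above
        # are computed on the query results before it is applied.
        body["post_filter"] = {"bool": {"filter": list(selection_filters)}}
    return body


brand_filter = {
    "nested": {
        "path": "attributeStrings",
        "query": {"bool": {"filter": [
            {"term": {"attributeStrings.facetName": {"value": "Brand"}}},
            {"terms": {"attributeStrings.facetValue": ["Adidas"]}},
        ]}},
    }
}

body = build_search_body({"match_all": {}}, [brand_filter])
print(json.dumps(body, indent=2))
```

Keep in mind that with this layout every facet is counted without any of the selections, including selections made in other facets, so it may need to be combined with per-facet filter aggregations if counts should narrow as the user selects more.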
+4

I am trying to understand what you are discussing here. I am also following the article mentioned in this thread, but after three weeks of trying I cannot work out why the filters return the correct results, yet when I put the filters into the aggregations, all the aggregations come out wrong. I described all of this in detail, with examples, in this question. I wonder if you have any pointers to help me get unstuck. Thank you very much!

0

Source: https://habr.com/ru/post/1262001/

