How to access doc values of a nested array in an Elasticsearch script?

Given the following index, how can I select the right element in a nested array and access one of its values? The goal is to use that value inside a script_score.

    # Create mapping
    curl -XPUT localhost:9200/test/user/_mapping -d '
    {
      "user" : {
        "properties" : {
          "name" : { "type" : "string" },
          "skills" : {
            "type": "nested",
            "properties" : {
              "skill_id" : { "type" : "integer" },
              "recommendations_count" : { "type" : "integer" }
            }
          }
        }
      }
    }
    '

    # Indexing Data
    curl -XPUT localhost:9200/test/user/1 -d '
    {
      "name": "John",
      "skills": [
        { "skill_id": 100, "recommendations_count": 5 },
        { "skill_id": 200, "recommendations_count": 3 }
      ]
    }
    '

    curl -XPUT localhost:9200/test/user/2 -d '
    {
      "name": "Mary",
      "skills": [
        { "skill_id": 100, "recommendations_count": 9 },
        { "skill_id": 200, "recommendations_count": 0 }
      ]
    }
    '

My query filters by skill_id, and that works well. I then want to use script_score to boost the score of user documents that have a higher recommendations_count for that skill_id (this is the key part).

    curl -XPOST localhost:9200/test/user/_search -d '
    {
      "query": {
        "function_score": {
          "query": {
            "bool": {
              "must": {
                "nested": {
                  "path": "skills",
                  "query": {
                    "bool": {
                      "must": {
                        "term": { "skill_id": 100 }
                      }
                    }
                  }
                }
              }
            }
          },
          "functions": [
            {
              "script_score": {
                "script": "sqrt(1.2 * doc['skills.recommendations_count'].value)"
              }
            }
          ]
        }
      }
    }
    '

How do I access the skills array from a script, find the "skill_id: 100" element in the array, and then use its recommendations_count value? The script_score above does not currently work (the score is always 0 regardless of the data), so I assume that doc['skills.recommendations_count'].value is not looking in the right place.

1 answer

For your specific question, the script needs to run in the nested context, just as your term query does.

This can be rewritten for ES 1.x:

    # The script's field name is quoted with \" so that it survives the
    # single-quoted -d argument in the shell.
    curl -XGET 'localhost:9200/test/_search' -d'
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "filtered": {
              "filter": {
                "term": { "skills.skill_id": 100 }
              },
              "query": {
                "function_score": {
                  "functions": [
                    {
                      "script_score": {
                        "script": "sqrt(1.2 * doc[\"skills.recommendations_count\"].value)"
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'

For ES 2.x (filters became first-class citizens in ES 2.x, so the syntax has changed a bit to catch up!):

    # The script's field name is quoted with \" so that it survives the
    # single-quoted -d argument in the shell.
    curl -XGET 'localhost:9200/test/_search' -d'
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "bool": {
              "filter": {
                "term": { "skills.skill_id": 100 }
              },
              "must": {
                "function_score": {
                  "functions": [
                    {
                      "script_score": {
                        "script": "sqrt(1.2 * doc[\"skills.recommendations_count\"].value)"
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'

Note: I turned the term query into a term filter because it has no logical effect on the score (it is either an exact match or not). I also added the name of the nested field to the term filter, which is a requirement in Elasticsearch 2.x and later (and good practice in earlier versions).
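A rough way to picture why the nested context matters: Elasticsearch indexes each nested object as its own hidden document alongside the root document. The Python sketch below is only a simplified model of that idea, not Elasticsearch internals; the dictionaries and the `script_score` helper are hypothetical stand-ins, and treating a missing field as 0 is an assumption chosen to match the always-0 score reported in the question.

```python
import math

# Simplified model (an assumption for illustration, not Elasticsearch internals):
# each nested object lives in its own hidden document next to the root document.
root_doc = {"name": "John"}  # the root document carries no skills.* fields
nested_docs = [
    {"skills.skill_id": 100, "skills.recommendations_count": 5},
    {"skills.skill_id": 200, "skills.recommendations_count": 3},
]


def script_score(doc):
    # Mirrors sqrt(1.2 * doc['skills.recommendations_count'].value);
    # a missing field is treated as 0 here (assumption, see lead-in).
    return math.sqrt(1.2 * doc.get("skills.recommendations_count", 0))


# Script evaluated against the root document: the field is not there.
root_score = script_score(root_doc)

# Script evaluated inside the nested context, after filtering on skill_id 100.
nested_scores = [script_score(d) for d in nested_docs if d["skills.skill_id"] == 100]

print(root_score)     # 0.0
print(nested_scores)  # a single score, computed from recommendations_count = 5
```

In this model, moving the function inside the nested query is what lets it see the per-skill value instead of an empty root-level field.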

With that out of the way, you can (and should) avoid using a script whenever possible. This is one such case. function_score supports a field_value_factor function, which lets you do what you are trying to do without any script. You can also specify a "missing" value to control what happens if the field is absent.

This does exactly the same thing as the script, but it will perform better:

    curl -XGET 'localhost:9200/test/_search' -d'
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "filtered": {
              "filter": {
                "term": { "skills.skill_id": 100 }
              },
              "query": {
                "function_score": {
                  "functions": [
                    {
                      "field_value_factor": {
                        "field": "skills.recommendations_count",
                        "factor": 1.2,
                        "modifier": "sqrt",
                        "missing": 0
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'

For ES 2.x:

    curl -XGET 'localhost:9200/test/_search' -d'
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "bool": {
              "filter": {
                "term": { "skills.skill_id": 100 }
              },
              "must": {
                "function_score": {
                  "functions": [
                    {
                      "field_value_factor": {
                        "field": "skills.recommendations_count",
                        "factor": 1.2,
                        "modifier": "sqrt",
                        "missing": 0
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'
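To get a feel for the numbers, here is a small Python sketch of the value that field_value_factor produces with "factor": 1.2 and "modifier": "sqrt", applied to the two users indexed in the question. It only reproduces the arithmetic sqrt(factor * value); the helper name is hypothetical, and how Elasticsearch combines this value with the rest of the query score is left out.

```python
import math


def field_value_factor(value, factor=1.2, missing=0):
    """Value of field_value_factor with modifier "sqrt": sqrt(factor * value)."""
    if value is None:  # field absent: fall back to the "missing" value
        value = missing
    return math.sqrt(factor * value)


# recommendations_count for skill_id 100, taken from the documents in the question
counts = {"John": 5, "Mary": 9}
scores = {name: field_value_factor(count) for name, count in counts.items()}

# Mary has more recommendations for skill 100, so she should rank first.
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
# Mary: 3.286
# John: 2.449
```

With "missing": 0, a nested document that lacks recommendations_count would contribute sqrt(1.2 * 0) = 0 rather than causing an error.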

Scripts are slow, and they also imply the use of fielddata in Elasticsearch 1.x, which is bad. You mentioned doc values, which is a promising start that suggests you are using Elasticsearch 2.x, but that may just be terminology.

If you are just starting out with Elasticsearch, I highly recommend starting with the latest version.


Source: https://habr.com/ru/post/978485/

