How to access doc values of nested array in Elasticsearch script?

Question

Given the following index, how do I select the correct element in a nested array and access one of its values? The goal is to use that value inside script_score.

    # Create mapping
    curl -XPUT localhost:9200/test/user/_mapping -d '
    {
      "user": {
        "properties": {
          "name": { "type": "string" },
          "skills": {
            "type": "nested",
            "properties": {
              "skill_id": { "type": "integer" },
              "recommendations_count": { "type": "integer" }
            }
          }
        }
      }
    }'

    # Indexing Data
    curl -XPUT localhost:9200/test/user/1 -d '
    {
      "name": "John",
      "skills": [
        { "skill_id": 100, "recommendations_count": 5 },
        { "skill_id": 200, "recommendations_count": 3 }
      ]
    }'

    curl -XPUT localhost:9200/test/user/2 -d '
    {
      "name": "Mary",
      "skills": [
        { "skill_id": 100, "recommendations_count": 9 },
        { "skill_id": 200, "recommendations_count": 0 }
      ]
    }'

My query filters by skill_id and it works well. I then want to use script_score to boost the score of user documents with a higher recommendations_count for that skill_id (this is the key point).

    curl -XPOST localhost:9200/test/user/_search -d '
    {
      "query": {
        "function_score": {
          "query": {
            "bool": {
              "must": {
                "nested": {
                  "path": "skills",
                  "query": {
                    "bool": {
                      "must": { "term": { "skill_id": 100 } }
                    }
                  }
                }
              }
            }
          },
          "functions": [
            {
              "script_score": {
                "script": "sqrt(1.2 * doc['skills.recommendations_count'].value)"
              }
            }
          ]
        }
      }
    }'

How do I access the skills array from a script, find the "skill_id: 100" element in the array, and then use its recommendations_count value? The script_score above does not currently work: the score is always 0 regardless of the data, so I assume that doc['skills.recommendations_count'].value does not look in the right place.

groovy elasticsearch

brupm Dec 23 '15 at 7:57

1 answer

pickypg · Accepted Answer · 2015-12-23T22:07:17+0000

For your specific question, the script needs to run in the nested context, just like your term query does.

This can be rewritten for ES 1.x:

    # The script uses escaped double quotes because the request body itself
    # is wrapped in single quotes for the shell.
    curl -XGET 'localhost:9200/test/_search' -d '
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "filtered": {
              "filter": { "term": { "skills.skill_id": 100 } },
              "query": {
                "function_score": {
                  "functions": [
                    {
                      "script_score": {
                        "script": "sqrt(1.2 * doc[\"skills.recommendations_count\"].value)"
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'

For ES 2.x (filters became first-class citizens in ES 2.x, so the syntax has changed a bit to catch up!):

    # The script uses escaped double quotes because the request body itself
    # is wrapped in single quotes for the shell.
    curl -XGET 'localhost:9200/test/_search' -d '
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "bool": {
              "filter": { "term": { "skills.skill_id": 100 } },
              "must": {
                "function_score": {
                  "functions": [
                    {
                      "script_score": {
                        "script": "sqrt(1.2 * doc[\"skills.recommendations_count\"].value)"
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'

Note: I turned the term query into a term filter because it has no effect on the score (it is either an exact match or not). I also added the name of the nested field to the term filter, which is required in Elasticsearch 2.x and later (and good practice in earlier versions).
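Why the original script always scored 0 can be illustrated with a toy model. Elasticsearch indexes each nested object as its own hidden document next to the root document, so a script running at the root level finds no value for skills.recommendations_count, while a script running inside the nested query sees one skill at a time. The Python sketch below is only a simplified simulation of that behaviour, not Elasticsearch code; the helper names root_level_value and nested_level_scores are hypothetical, and reading a missing value as 0 is an assumption that matches the score reported in the question.

```python
import math

# Toy model of the sample data: one root document per user,
# plus one hidden document per nested "skills" entry.
USERS = {
    "John": [{"skill_id": 100, "recommendations_count": 5},
             {"skill_id": 200, "recommendations_count": 3}],
    "Mary": [{"skill_id": 100, "recommendations_count": 9},
             {"skill_id": 200, "recommendations_count": 0}],
}


def root_level_value(user):
    """Value a root-level script would read for the nested field.

    Assumption: the root document carries no nested field values,
    and a missing numeric value reads as 0.
    """
    return 0


def nested_level_scores(user, skill_id):
    """Scores computed per nested document that passes the skill_id filter."""
    matching = [s for s in USERS[user] if s["skill_id"] == skill_id]
    return [math.sqrt(1.2 * s["recommendations_count"]) for s in matching]


for name in USERS:
    root_score = math.sqrt(1.2 * root_level_value(name))
    print(name, "root:", root_score, "nested:", nested_level_scores(name, 100))
```

In this model the root-level score is 0 for every user, whereas the nested-level score depends on the recommendations_count of the matching skill only.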

With that out of the way, you can (and should) avoid using a script whenever possible, and this is one such case. function_score supports the field_value_factor function, which lets you do what you are trying to do without any script. You can also specify a "missing" value to control what happens if the field is missing.

This does exactly the same thing as the script, but it will perform better:

    curl -XGET 'localhost:9200/test/_search' -d '
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "filtered": {
              "filter": { "term": { "skills.skill_id": 100 } },
              "query": {
                "function_score": {
                  "functions": [
                    {
                      "field_value_factor": {
                        "field": "skills.recommendations_count",
                        "factor": 1.2,
                        "modifier": "sqrt",
                        "missing": 0
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'

For ES 2.x:

    curl -XGET 'localhost:9200/test/_search' -d '
    {
      "query": {
        "nested": {
          "path": "skills",
          "query": {
            "bool": {
              "filter": { "term": { "skills.skill_id": 100 } },
              "must": {
                "function_score": {
                  "functions": [
                    {
                      "field_value_factor": {
                        "field": "skills.recommendations_count",
                        "factor": 1.2,
                        "modifier": "sqrt",
                        "missing": 0
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }'
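To make the effect of field_value_factor concrete, here is a small worked example in Python. It only mimics the arithmetic of the function with the settings used above (factor 1.2, modifier sqrt, missing 0) for the skill_id 100 entries of the sample documents; the function name and the "no_count" entry are made up for illustration. The final _score of a hit may also depend on settings such as score_mode and boost_mode, which this sketch does not model.

```python
import math


def field_value_factor(value, factor=1.2, missing=0):
    """Mimic field_value_factor with modifier "sqrt": sqrt(factor * value)."""
    if value is None:  # field absent on the document: fall back to "missing"
        value = missing
    return math.sqrt(factor * value)


# recommendations_count of the skill_id 100 entry for each sample user;
# "no_count" is a hypothetical document without the field.
counts = {"John": 5, "Mary": 9, "no_count": None}

scores = {name: field_value_factor(count) for name, count in counts.items()}

# Print users from highest to lowest function value.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Mary (9 recommendations) ends up with roughly 3.29 and John (5 recommendations) with roughly 2.45, so Mary ranks first, which is the boost the question asks for.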

Scripts are slow, and in Elasticsearch 1.x they also imply the use of fielddata, which is bad. You mentioned doc values, which is a promising sign that suggests you are using Elasticsearch 2.x, but that may just be terminology.

If you are just starting out with Elasticsearch, I highly recommend starting with the latest version.
