I have an Elasticsearch index with a field for exact matches. Somehow I get similar results as well as exact ones (which I don't mind), but the similar results are sorted ahead of the exact matches (which I do mind).
Can someone explain what is happening and how to fix it?
My mapping looks like this:
"exact":{ "type":"string", "boost":10.0, "analyzer":"keyword" },
My query looking for "AAPL P JAN 2014 885.00" is as follows:
{ "size" : 21, "query" : { "field" : { "exact" : "AAPL P JAN 2014 885,00" } }, "explain" : true, "sort" : [ { "_score" : { "order" : "desc" } } ], "facets" : { "category" : { "terms" : { "field" : "category", "size" : 10 } } } }
And the documents are returned in the following order:
- {"exact": ["APPLE INC", "US0378331005", "AAPL", "73773"], "id-compound": "AAPL"}
- {"exact": ["AAPL", "73773", "AAPL P JAN 2014 675.00"], "id-compound": "AAPL * PUT * 20140118 * 675"}
- {"exact": ["AAPL", "73773", "AAPL C JAN 2014 500.00"], "id-compound": "AAPL * CALL * 20140118 * 500"}
etc., with the exact match further down in the results.
Can someone explain to me why the exact match does not end up on top?
The search results with the full explanation are below, in case that helps to understand things.
"hits" : [ { "_shard" : 0, "_node" : "1", "_index" : "instruments", "_type" : "instrument", "_id" : "AAPL", "_score" : 1306.8339, "_source" : {"exact":["APPLE INC","US0378331005","AAPL","73773"],"id-compound":"AAPL"}, "_explanation" : { "value" : 1306.8339, "description" : "product of:", "details" : [ { "value" : 6534.169, "description" : "sum of:", "details" : [ { "value" : 6534.169, "description" : "weight(exact:AAPL in 9096), product of:", "details" : [ { "value" : 0.25854474, "description" : "queryWeight(exact:AAPL), product of:", "details" : [ { "value" : 6.1701355, "description" : "idf(docFreq=211, maxDocs=37299)" }, { "value" : 0.0419026, "description" : "queryNorm" } ] }, { "value" : 25272.875, "description" : "fieldWeight(exact:AAPL in 9096), product of:", "details" : [ { "value" : 1.0, "description" : "tf(termFreq(exact:AAPL)=1)" }, { "value" : 6.1701355, "description" : "idf(docFreq=211, maxDocs=37299)" }, { "value" : 4096.0, "description" : "fieldNorm(field=exact, doc=9096)" } ] } ] } ] }, { "value" : 0.2, "description" : "coord(1/5)" } ] } }, { "_shard" : 0, "_node" : "1", "_index" : "instruments", "_type" : "instrument", "_id" : "AAPL*PUT*20140118*675", "_score" : 163.35423, "_source" : {"exact":["AAPL","73773","AAPL P JAN 2014 675,00"],"id-compound":"AAPL*PUT*20140118*675"}, "_explanation" : { "value" : 163.35423, "description" : "product of:", "details" : [ { "value" : 816.7711, "description" : "sum of:", "details" : [ { "value" : 816.7711, "description" : "weight(exact:AAPL in 18), product of:", "details" : [ { "value" : 0.25854474, "description" : "queryWeight(exact:AAPL), product of:", "details" : [ { "value" : 6.1701355, "description" : "idf(docFreq=211, maxDocs=37299)" }, { "value" : 0.0419026, "description" : "queryNorm" } ] }, { "value" : 3159.1094, "description" : "fieldWeight(exact:AAPL in 18), product of:", "details" : [ { "value" : 1.0, "description" : "tf(termFreq(exact:AAPL)=1)" }, { "value" : 6.1701355, "description" : 
"idf(docFreq=211, maxDocs=37299)" }, { "value" : 512.0, "description" : "fieldNorm(field=exact, doc=18)" } ] } ] } ] }, { "value" : 0.2, "description" : "coord(1/5)" } ] } }, { "_shard" : 0, "_node" : "1", "_index" : "instruments", "_type" : "instrument", "_id" : "AAPL*CALL*20140118*500", "_score" : 163.35423, "_source" : {"exact":["AAPL","73773","AAPL C JAN 2014 500,00"],"id-compound":"AAPL*CALL*20140118*500"}, "_explanation" : { "value" : 163.35423, "description" : "product of:", "details" : [ { "value" : 816.7711, "description" : "sum of:", "details" : [ { "value" : 816.7711, "description" : "weight(exact:AAPL in 383), product of:", "details" : [ { "value" : 0.25854474, "description" : "queryWeight(exact:AAPL), product of:", "details" : [ { "value" : 6.1701355, "description" : "idf(docFreq=211, maxDocs=37299)" }, { "value" : 0.0419026, "description" : "queryNorm" } ] }, { "value" : 3159.1094, "description" : "fieldWeight(exact:AAPL in 383), product of:", "details" : [ { "value" : 1.0, "description" : "tf(termFreq(exact:AAPL)=1)" }, { "value" : 6.1701355, "description" : "idf(docFreq=211, maxDocs=37299)" }, { "value" : 512.0, "description" : "fieldNorm(field=exact, doc=383)" } ] } ] } ] }, { "value" : 0.2, "description" : "coord(1/5)" } ] } }, { "_id" : "AAPL*PUT*20140118*940", "_score" : 163.35423, "_source" : {"exact":["AAPL","73773","AAPL P JAN 2014 940,00"],"id-compound":"AAPL*PUT*20140118*940"}, "_explanation" : { "value" : 163.35423, "description" : "product of:", "details" : [ { "value" : 816.7711, "description" : "sum of:", "details" : [ { "value" : 816.7711, "description" : "weight(exact:AAPL in 794), product of:", "details" : [ { "value" : 0.25854474, "description" : "queryWeight(exact:AAPL), product of:", "details" : [ { "value" : 6.1701355, "description" : "idf(docFreq=211, maxDocs=37299)" }, { "value" : 0.0419026, "description" : "queryNorm" } ] }, { "value" : 3159.1094, "description" : "fieldWeight(exact:AAPL in 794), product of:", "details" : [ { 
"value" : 1.0, "description" : "tf(termFreq(exact:AAPL)=1)" }, { "value" : 6.1701355, "description" : "idf(docFreq=211, maxDocs=37299)" }, { "value" : 512.0, "description" : "fieldNorm(field=exact, doc=794)" } ] } ] } ] }, { "value" : 0.2, "description" : "coord(1/5)" } ] } }
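To check that I am reading the explanation correctly, the scores can be recomputed from the numbers it reports. This is only a minimal sketch in Python: it assumes the score is simply queryWeight × fieldWeight × coord, which is how the nesting in the explain output reads, and the function name `explain_score` is my own.

```python
def explain_score(idf, query_norm, tf, field_norm, coord):
    """Recombine the factors exactly as they are nested in the explain output."""
    query_weight = idf * query_norm        # queryWeight(exact:AAPL)
    field_weight = tf * idf * field_norm   # fieldWeight(exact:AAPL in <doc>)
    return query_weight * field_weight * coord

# Values copied from the explanation above.
IDF = 6.1701355         # idf(docFreq=211, maxDocs=37299)
QUERY_NORM = 0.0419026  # queryNorm
COORD = 1 / 5           # coord(1/5)

# The only factor that differs between the hits is fieldNorm.
top_hit = explain_score(IDF, QUERY_NORM, 1.0, 4096.0, COORD)    # _id "AAPL"
option_hit = explain_score(IDF, QUERY_NORM, 1.0, 512.0, COORD)  # the option documents

print(round(top_hit, 2), round(option_hit, 2))
print("ratio:", top_hit / option_hit)
```

The results agree with the reported `_score` values (1306.8339 and 163.35423) up to rounding. So every hit is scored on the single term `AAPL`, and the ordering comes entirely from `fieldNorm` (4096 versus 512).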
And just in case, this is what happens when I analyze the data I'm trying to store:
curl -XGET 'localhost:9200/instruments/_analyze?field=exact&pretty=true' -d 'ING P JUN 2013 6.00' { "tokens" : [ { "token" : "ING P JUN 2013 6.00", "start_offset" : 0, "end_offset" : 20, "type" : "word", "position" : 1 } ]
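What puzzles me is that the analyzer keeps the whole value as one token at index time, while `coord(1/5)` and `weight(exact:AAPL ...)` in the explanation look as if my query had been split into five terms, of which only `AAPL` matched. The sketch below mimics that in plain Python. The whitespace split on the query side is purely my guess (nothing in the output confirms how the query string is parsed), and both helper functions are hypothetical stand-ins, not Elasticsearch APIs.

```python
def keyword_tokens(text):
    """Stand-in for the keyword analyzer: the whole value is a single token."""
    return [text]

def whitespace_tokens(text):
    """Assumption: the query string is split on whitespace before lookup."""
    return text.split()

query = "AAPL P JAN 2014 885,00"
indexed_values = ["AAPL", "73773", "AAPL P JAN 2014 675.00"]

# Index side: every array element becomes exactly one term.
indexed_terms = {tok for value in indexed_values for tok in keyword_tokens(value)}

# Query side (guessed): five separate terms.
query_terms = whitespace_tokens(query)
matched = [tok for tok in query_terms if tok in indexed_terms]

print(len(matched), "of", len(query_terms), "query terms match:", matched)
# prints: 1 of 5 query terms match: ['AAPL']
```

If that guess is right, it would explain why every document containing the bare value `AAPL` matches, but I don't know whether this is what actually happens.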