In addition to the standard term search with tf-idf similarity on a full-text field, I would like scoring based on the "similarity" of numeric fields. This similarity would depend on the distance between the value in the query and the value in the document (for example, a Gaussian with m = [user input], s = 0.5).
That is, let's say documents represent people, and a person's document has two fields:
- description (full text)
- age (number).
I want to find documents like
Description: (xyz) Age: 30
but age is not a filter; it is part of the score (for a person aged 30 the multiplier would be 1.0, for a 25-year-old 0.8, and so on).
Could this be achieved in a reasonable way?
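To make the intended scoring concrete, here is a minimal plain-Java sketch of the multiplier I have in mind. It uses no Lucene calls at all; the class `AgeSimilarity` and its method names are made up for illustration, and multiplying the text score by the age multiplier is an assumption about how the two would be combined. Note that with s = 0.5 a 25-year-old would get a multiplier of practically zero for a target age of 30; a multiplier of about 0.8 at a distance of 5 years corresponds to a width of roughly 7.5, so that is the value used in `main`.

```java
public class AgeSimilarity {

    // Unnormalized Gaussian of the distance between value and target:
    // 1.0 when they are equal, decaying towards 0 as they move apart.
    public static double gaussianMultiplier(double value, double target, double sigma) {
        double d = value - target;
        return Math.exp(-(d * d) / (2.0 * sigma * sigma));
    }

    // Hypothetical combination: the text relevance score (e.g. tf-idf)
    // scaled by the numeric similarity of the age field.
    public static double combinedScore(double textScore, double age,
                                       double targetAge, double sigma) {
        return textScore * gaussianMultiplier(age, targetAge, sigma);
    }

    public static void main(String[] args) {
        double sigma = 7.5; // assumed width; see the note above
        // Same age as requested: multiplier is exactly 1.0.
        System.out.println(gaussianMultiplier(30, 30, sigma));
        // Five years off: multiplier is roughly 0.80.
        System.out.println(gaussianMultiplier(25, 30, sigma));
    }
}
```

In a real index the `textScore` argument would come from the text query and `age` from the stored numeric field; the sketch only pins down the arithmetic.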
EDIT: I eventually found out that this can be done by wrapping a ValueSourceQuery and a TermQuery in a CustomScoreQuery. See my solution below.
EDIT 2: Since Lucene versions change quickly, I should add that this was tested on Lucene 3.0 (Java).