In a very simple case, I have three documents with the file names "Lark", "Larker" and "Larking" (without a file extension). In solr, I index these three documents by matching the file name in the "title" field. When I do a Lark search, all three documents are returned (this is what I want), but they all get the same score. I would prefer Lark to be the highest since it matches my query exactly, and the rest are behind.
<field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> </fieldType>
I believe the reason they get the same score is because EdgeNGramFilterFactory used during the index. Each document is indexed as "La", "Lar", "Lark" with two documents ("Larker" and "Larking"), indexed with some additional options. Thus, each document is an exact match for the Lark request. I would like to somehow fulfill a query in which the term βLarkβ will return all three documents, but with a document called βLarkβ, which will be returned above the others.
Request debugging results:
<lst name="debug"> <str name="rawquerystring">Lark</str> <str name="querystring">Lark</str> <str name="parsedquery">text:lark</str> <str name="parsedquery_toString">text:lark</str> <lst name="explain"> <str name="543d6ee4cbb33c26bbcf288b/xxnullxx/543d6ef9cbb33c26bbcf2892"> 2.7104912 = (MATCH) weight(text:lark in 0) [DefaultSimilarity], result of: 2.7104912 = fieldWeight in 0, product of: 1.4142135 = tf(freq=2.0), with freq of: 2.0 = termFreq=2.0 3.8332133 = idf(docFreq=3, maxDocs=68) 0.5 = fieldNorm(doc=0) </str> <str name="543d6ee4cbb33c26bbcf288b/xxnullxx/543d6ef9cbb33c26bbcf2893"> 2.7104912 = (MATCH) weight(text:lark in 1) [DefaultSimilarity], result of: 2.7104912 = fieldWeight in 1, product of: 1.4142135 = tf(freq=2.0), with freq of: 2.0 = termFreq=2.0 3.8332133 = idf(docFreq=3, maxDocs=68) 0.5 = fieldNorm(doc=1) </str> <str name="543d6ee4cbb33c26bbcf288b/xxnullxx/543d6ef9cbb33c26bbcf2894"> 2.7104912 = (MATCH) weight(text:lark in 2) [DefaultSimilarity], result of: 2.7104912 = fieldWeight in 2, product of: 1.4142135 = tf(freq=2.0), with freq of: 2.0 = termFreq=2.0 3.8332133 = idf(docFreq=3, maxDocs=68) 0.5 = fieldNorm(doc=2) </str>