SolrJ query: show the most relevant entries first

I have several documents in Solr 4.0. I want to display the most relevant entries first, and then the less relevant ones.

For example, I have 3 documents with the following titles:

  • Towards a revenue sharing policy
  • Income distribution and economic policy.
  • Distribution of income policies in developing countries

Now, when I run a query such as q=title:Income Distribution Policy,

I would like document 3 to be shown first (since its first three words are an exact match), then document 1 (since everything except "Towards" matches), and finally document 2 (since other words sit between the matching terms).

My schema.xml looks like this:

<types>
  <fieldType name="search" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
   <field name="title" type="search" indexed="true" stored="true"/>
</fields>

EDIT 1 Debug output

"rawquerystring": "title:Income Distribution Policy",
"querystring": "title:Income Distribution Policy",
"parsedquery": "title:incom title:distribut title:polici",
"parsedquery_toString": "title:incom title:distribut title:polici"

EDIT 2 Field type changed

I tried the following combinations, but the result is the same in every case.

  • StandardTokenizerFactory - autoGeneratePhraseQueries (none) - PorterStemFilterFactory.
  • StandardTokenizerFactory - autoGeneratePhraseQueries = "true" - PorterStemFilterFactory.
  • StandardTokenizerFactory - autoGeneratePhraseQueries (none).
  • StandardTokenizerFactory - autoGeneratePhraseQueries = "true".
  • WhitespaceTokenizerFactory - autoGeneratePhraseQueries (none) - PorterStemFilterFactory.
  • WhitespaceTokenizerFactory - autoGeneratePhraseQueries = "true" - PorterStemFilterFactory.
  • WhitespaceTokenizerFactory - autoGeneratePhraseQueries (none).
  • WhitespaceTokenizerFactory - autoGeneratePhraseQueries = "true".

Combine an exact-phrase clause with the plain term query:

q=title:"Income Distribution Policy" OR title:Income Distribution Policy

Document order: 1, 3, 2.
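
The phrase-plus-terms query above can be assembled programmatically. The following is only a minimal sketch: the class name PhraseBoostQuery, the helper build, the boost factor 10, and the grouped form title:(...) are assumptions made for illustration, not something taken from the question or the answer. It also assumes the search terms contain no Lucene special characters that would need escaping.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PhraseBoostQuery {

    // Builds: field:"terms"^boost OR field:(terms)
    // The quoted clause only matches the terms as a phrase; the boost makes
    // such matches score higher than documents matching individual terms.
    static String build(String field, String terms, int phraseBoost) {
        String phrase = field + ":\"" + terms + "\"^" + phraseBoost;
        String anyTerm = field + ":(" + terms + ")";
        return phrase + " OR " + anyTerm;
    }

    public static void main(String[] args) {
        String q = build("title", "Income Distribution Policy", 10);
        System.out.println(q);
        // URL-encode the value before appending it to a request as q=...
        System.out.println("q=" + URLEncoder.encode(q, StandardCharsets.UTF_8));
    }
}
```

The boost value is arbitrary; it would need tuning against real data, since the final order still depends on how the analyzers stem and score each title.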


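Consider the eDismax query parser, in particular its pf (phrase fields) parameter, which boosts documents where the query terms occur together as a phrase.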

See also autoGeneratePhraseQueries, an attribute of fieldType.

To see how each score is computed, add debugQuery=true to the request. With debug.explain.structured=true, the score explanation is returned in structured form.
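
As a rough illustration of how the eDismax and debug parameters mentioned above might be combined into one request, here is a small sketch using only the Java standard library. The class EdismaxRequest, its helper methods, the /select path, the qf=title setting, and the pf boost of 10 are all assumptions for the example. With SolrJ, the same name/value pairs could be set on a SolrQuery via set(name, value).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class EdismaxRequest {

    // Collects the request parameters in insertion order.
    static Map<String, String> params(String userQuery) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("defType", "edismax");               // select the eDismax parser
        p.put("q", userQuery);                     // raw user input, no field prefix
        p.put("qf", "title");                      // field searched term by term (assumed)
        p.put("pf", "title^10");                   // phrase boost on the same field (value is arbitrary)
        p.put("debugQuery", "true");               // include parsed query and score explanations
        p.put("debug.explain.structured", "true"); // return explanations as nested structures
        return p;
    }

    // URL-encodes each name and value and joins the pairs with '&'.
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        System.out.println("/select?" + toQueryString(params("Income Distribution Policy")));
    }
}
```

Comparing the explain output for each of the three titles should show whether the phrase boost actually contributes to the score or whether stemming prevents the phrase from matching.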


Source: https://habr.com/ru/post/1707961/

