What is the maximum value of a Lucene score?

I am thinking of the default scoring function, with StandardAnalyzer used.

The value seems to sometimes exceed 1.0.

+4
3 answers

In fact, there is no maximum score.

When Lucene scores a document, it basically sums a collection of partial scores to give the total score.

For instance:

Suppose I search for A OR B. This query is broken down into its component parts, A and B. Each part is executed independently by a sub-scorer, which assigns a score for its part of the query. If a document contains both A and B, its score is a combination of the scores from both sub-scorers.

Since there may be many sub-scorers, the total score may be greater than 1.
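As a rough illustration of why the total can pass 1, here is a toy sketch in Python. It is not Lucene code: the function name `combined_score` and the per-clause numbers are invented for the example, and it ignores other factors such as the coordination factor and normalization.

```python
# Toy model of how an OR query combines clause scores.
# The per-clause numbers are hypothetical; real Lucene scores come
# from the configured Similarity, not from a fixed table like this.

def combined_score(clause_scores, doc_terms):
    """Sum the scores of the clauses whose term occurs in the document."""
    return sum(score for term, score in clause_scores.items() if term in doc_terms)

clause_scores = {"A": 0.7, "B": 0.6}  # hypothetical sub-scorer outputs

only_a = combined_score(clause_scores, {"A"})      # matches A only
both = combined_score(clause_scores, {"A", "B"})   # matches A and B

print(only_a, both)
```

Each clause stays below 1 on its own, but a document that matches both clauses ends up above 1.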

The score for a particular hit is only meaningful relative to the highest score from the same search. Scores from different searches are not directly comparable.

If you really need a value from 0 to 1, you can normalize each score by dividing it by the highest score in the same search. This gives you the equivalent of a percentage. These percentages still cannot be compared between searches.
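A minimal sketch of that normalization, assuming the raw scores of one search are already available as a list of floats. The function name `normalize_scores` and the sample values are made up for the example.

```python
def normalize_scores(scores):
    """Scale raw scores so the best hit in this result set becomes 1.0."""
    if not scores:
        return []
    top = max(scores)
    if top <= 0:
        # Nothing to scale against; treat every hit as 0.
        return [0.0 for _ in scores]
    return [s / top for s in scores]

raw = [2.4, 1.2, 0.6]  # hypothetical raw scores from a single search
print(normalize_scores(raw))
```

The result only ranks hits within this one search; a 0.5 here and a 0.5 from another query do not mean the same thing.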

More information can be found here, here, and here.

+8

The maximum score depends on the query being run. To find the maximum score for a given query, request the score with the fl parameter; it is not returned unless you ask for it explicitly.

 Ex Req: http://server:7983/solr/select/?q=term&fl=*,score 

Look for maxScore="xx.xxxx" in the response. It may be above or below 1.0, depending on the query, the results, and their relevance.

 Ex: <result name="response" numFound="29" start="0" maxScore="2.1740298"> 

Keep in mind that the score value by itself means little; it becomes useful when you compare a document's score with the maxScore of the same query. For example, if document 1 scores 1.9 and document 27 scores 0.8, then document 1 is a far better match than document 27 when maxScore is 2.1740298.
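To make the comparison concrete, here is a small Python sketch that reads maxScore from the example response element and expresses the two document scores as fractions of it. The helper names `max_score` and `relative_score` are hypothetical, and the element is closed by hand so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# The <result> element from the example, closed so it is well-formed by itself.
sample = '<result name="response" numFound="29" start="0" maxScore="2.1740298"></result>'

def max_score(result_xml):
    """Read the maxScore attribute from a Solr <result> element."""
    return float(ET.fromstring(result_xml).attrib["maxScore"])

def relative_score(score, top):
    """Express a score as a fraction of the best score in the same result set."""
    return score / top

top = max_score(sample)
print(round(relative_score(1.9, top), 3))  # document 1
print(round(relative_score(0.8, top), 3))  # document 27
```

Document 1 lands close to the top of the range and document 27 well below half of it, which is the sense in which the first is the better match.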

The following factors go into the score:

  • Inverse document frequency
  • Term frequency
  • Coordination factor
  • Field length

In addition, features such as

  • Index-time boost
  • Query-time boost

affect how the score is calculated. SolrRelevancy offers some explanation. A more detailed explanation can be found in the Lucene Similarity documentation. You can enable the debug option to see how the score is calculated:

 http://server:7983/solr/select/?q=term&fl=*,score&debugQuery=on 

Example:

 2.1740298 = fieldWeight(text:"mmdci bldleg 02" in 210), product of:
   1.7320508 = tf(phraseFreq=3.0)
   13.388552 = idf(text: mmdci=812 bldleg=264 02=6220)
   0.09375 = fieldNorm(field=text, doc=210)
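The arithmetic in this explain output can be checked by hand. The sketch below, in Python, multiplies the three factors and compares the product with the reported score. It assumes the classic TF-IDF similarity, where tf is the square root of the frequency; the variable names are chosen for the example.

```python
import math

# Values copied from the explain output above.
phrase_freq = 3.0
idf = 13.388552
field_norm = 0.09375

# Assumption: classic TF-IDF similarity, where tf = sqrt(frequency).
tf = math.sqrt(phrase_freq)

field_weight = tf * idf * field_norm
print(tf, field_weight)

# The product matches the reported score up to float rounding.
assert math.isclose(tf, 1.7320508, rel_tol=1e-6)
assert math.isclose(field_weight, 2.1740298, rel_tol=1e-6)
```

Note that idf alone is above 13 here, which is one more reason a score is not bounded by 1.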

For Lucene:

Use TopDocs.getMaxScore(). It returns the maximum score of all matches when sorting by relevance, which is the default. If you sort by a field other than relevance, you need to set doTrackScores(true) and doMaxScore(true).

+5

Here is a page that describes how Lucene scores are calculated:

http://lucene.apache.org/java/3_0_0/scoring.html

+1

Source: https://habr.com/ru/post/1391716/

