Recommended title boost?

I have a relatively simple Lucene index, which is served by Solr. The index consists of two main fields, title and body, as well as several less important fields.

Most search engines give more relevance to results with matches in the title than to results with matches in the body. I'm going to start using index-time boosting on the title field.

My question is: what values do people usually use for their title fields? 2? 4? 10? 100?

1 answer

I suggest you divide the average body length by the average title length. This gives you approximately the factor M: for every M appearances of a word in the body, it will appear once in the title. Now use something like M * 3. This is, of course, a rule-of-thumb heuristic, and you are better off trying several values. See Grant Ingersoll's “Debugging Search Relevance” for a more structured discussion.
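As a rough illustration of the heuristic above, here is a minimal Python sketch that estimates M from a sample of documents and derives a starting boost. The function names, the `title`/`body` dictionary keys, whitespace tokenization, and the spread of candidate factors are all assumptions made for this example; they are not part of Solr or Lucene.

```python
from statistics import mean


def suggested_title_boost(docs, multiplier=3.0):
    """Return (M, boost), where M = average body length / average title length.

    Lengths are counted in whitespace-separated tokens, which is only an
    approximation of what a real Solr analyzer would produce (assumption).
    """
    title_lengths = [len(d["title"].split()) for d in docs]
    body_lengths = [len(d["body"].split()) for d in docs]

    avg_title = mean(title_lengths)
    if avg_title == 0:
        raise ValueError("all titles are empty; cannot compute a length ratio")

    m = mean(body_lengths) / avg_title
    return m, m * multiplier


def candidate_boosts(boost, factors=(0.5, 1.0, 2.0)):
    """Spread a few values around the starting boost to compare empirically.

    The factors are arbitrary (assumption); the point is to try several values
    rather than trust a single number.
    """
    return [round(boost * f, 2) for f in factors]
```

Calling `suggested_title_boost(sample_docs)` on a representative sample gives a starting point, and `candidate_boosts` gives a handful of values to evaluate against real queries before settling on one.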


Source: https://habr.com/ru/post/1705362/
