Boosting Lucene terms when building an index

Is it possible to mark specific terms as more important than others when building the index (rather than when querying it)?

Consider, for example, a synonym filter:
doc 1: "it's a good car"
doc 2: "it's a good vehicle"

I want to add the term "vehicle" to the first document and the term "car" to the second document. But if the index is later queried with the word "car", the first document should be ranked above the second, and if it is queried with "vehicle", it should be the other way around.

Will calling setBoost on the fields before adding them to the relevant documents do the trick?

Or should I add the synonyms to a different field?

Or am I looking at it from the wrong point of view?

thanks

1 answer

A field boost affects all terms in that field, so it will not work in your case.

But this should be possible using Lucene's payloads (a byte array that can be set for each term occurrence). You would use them to store term-specific boosts (for example, a boost of 0.5 for "vehicle" in document 1). You then implement your own Similarity and override scorePayload() to decode this boost, and query with PayloadTermQuery, which lets the boost stored in the payload for that term contribute to the score.
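The idea can be illustrated without Lucene itself. The following is only a minimal plain-Java sketch, not Lucene code: the class PayloadBoostSketch and its methods are hypothetical names, the in-memory map stands in for the index, and the weights (1.0 for the original term, 0.5 for the added synonym) are assumptions based on the example above. It shows a float boost being packed into a 4-byte payload per term and decoded again at query time, which is roughly the role scorePayload() would play in a custom Similarity.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index with per-term payloads; not Lucene API.
public class PayloadBoostSketch {
    // docId -> (term -> payload bytes)
    static final Map<String, Map<String, byte[]>> INDEX = new HashMap<>();

    // Pack a boost into a 4-byte payload (big-endian float).
    static byte[] encodeBoost(float boost) {
        return ByteBuffer.allocate(4).putFloat(boost).array();
    }

    // Unpack the boost again; in Lucene this step would live in scorePayload().
    static float decodeBoost(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    static void addTerm(String docId, String term, float boost) {
        INDEX.computeIfAbsent(docId, k -> new HashMap<>()).put(term, encodeBoost(boost));
    }

    // Assumed weights: original term 1.0, added synonym 0.5.
    static void indexExample() {
        INDEX.clear();
        addTerm("doc1", "car", 1.0f);
        addTerm("doc1", "vehicle", 0.5f);
        addTerm("doc2", "vehicle", 1.0f);
        addTerm("doc2", "car", 0.5f);
    }

    // Return matching documents, highest decoded boost first.
    static List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, byte[]>> e : INDEX.entrySet()) {
            if (e.getValue().containsKey(term)) {
                hits.add(e.getKey());
            }
        }
        hits.sort(Comparator.comparingDouble(
                (String doc) -> decodeBoost(INDEX.get(doc).get(term))).reversed());
        return hits;
    }

    public static void main(String[] args) {
        indexExample();
        System.out.println(search("car"));     // [doc1, doc2]
        System.out.println(search("vehicle")); // [doc2, doc1]
    }
}
```

In a real index the payload would be attached to each token during analysis, and the decoded value would be only one factor in the final score rather than the whole score as it is here.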


Source: https://habr.com/ru/post/906096/

