Oracle Text DEFINESCORE: accumulating scores and query rewriting

I use Oracle Text to search at the sentence level. I want the score to count only discrete occurrences of the query terms.

Example: my query is ( dog cat table ). If it finds the term “dog”, it should count 1, even if the sentence contains “dog” more than once. If it finds both "dog" and "cat", it should count 2, and so on.

I used the query below, but it returns a score of 51 when two terms match. I need the score to accumulate discrete occurrences, so I want to redefine how Oracle Text computes the score.

  select /*+ FIRST_ROWS(1)*/ sentence_id, score(1) as sc, isn, sentence_length
  from plag_docsentences
  where contains(PROCESSED_TEXT,'DEFINESCORE(dog, DISCRETE*.01) ,DEFINESCORE(cat, DISCRETE*.01)' ,1)>0
  order by score(1) desc
1 answer

OK, I solved this problem.

Suppose 2 of the 3 query terms are found: the score will be 67 (2/3 ≈ 0.67, on a scale of 100). This is the default scoring behavior of Oracle Text. From this I derived an equation for the number of occurrences, i.e. the number of query terms found in the corpus sentence:

x / query_length = score / 100

then

x = query_length * score / 100

This gives the number of words that match between the query and the corpus sentence.
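As a minimal sketch of that arithmetic, assuming the score really is the fraction of matched query terms scaled to 100 as described above, the count can be recovered as follows. The function name `matched_terms` is hypothetical and used only for illustration; the rounding step is my addition, to absorb the integer rounding of the score (2/3 is reported as 67, not 66.67).

```python
def matched_terms(score: float, query_length: int) -> int:
    """Estimate how many distinct query terms matched a sentence.

    Assumes score is on a 0-100 scale and equals
    (matched terms / query_length) * 100, rounded to an integer.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # x = query_length * score / 100; round because the score itself
    # was rounded (e.g. 3 * 67 / 100 = 2.01, which means 2 terms).
    return round(query_length * score / 100)


print(matched_terms(67, 3))   # 2
print(matched_terms(100, 3))  # 3
```

The same expression could be computed directly in the select list of the query, but that variant is not shown here because it was not part of the original answer.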

I hope this helps researchers in IR.


Source: https://habr.com/ru/post/1208447/

