
Solr: Filter by the number of matches in an OR query on a multi-valued field

Given the following example Solr docs:

 <doc>
   <field name="guid">1</field>
   <field name="name">Harry Potter</field>
   <field name="friends">ron</field>
   <field name="friends">hermione</field>
   <field name="friends">ginny</field>
   <field name="friends">dumbledore</field>
 </doc>
 <doc>
   <field name="guid">2</field>
   <field name="name">Ron Weasley</field>
   <field name="friends">harry</field>
   <field name="friends">hermione</field>
   <field name="friends">lavender</field>
 </doc>
 <doc>
   <field name="guid">3</field>
   <field name="name">Hermione Granger</field>
   <field name="friends">harry</field>
   <field name="friends">ron</field>
   <field name="friends">ginny</field>
   <field name="friends">dumbledore</field>
 </doc>

and the following query (or filter query):

 friends:ron OR friends:hermione OR friends:ginny OR friends:dumbledore 

all three documents will be returned, since each of them has at least one of these friends.

However, I would like to set a minimum (and maximum) threshold for how many of the specified friends are matched. For example, return only documents that match at least 2, but no more than 3, of the specified friends.

Such a query would return only the third document (Hermione Granger), since it matches 3 of the 4 specified friends, while the first (Harry Potter) matches all 4 and the second (Ron Weasley) matches only 1.
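To make the expected result concrete, here is a minimal client-side sketch in Python that counts the matches per document and applies the 2-to-3 threshold. It does not involve Solr at all; the dictionary layout and the names docs, wanted and match_count are assumptions made purely for illustration.

```python
# Illustration only: the example documents reduced to name -> friends.
docs = {
    "Harry Potter": ["ron", "hermione", "ginny", "dumbledore"],
    "Ron Weasley": ["harry", "hermione", "lavender"],
    "Hermione Granger": ["harry", "ron", "ginny", "dumbledore"],
}

# The friends listed in the OR query.
wanted = {"ron", "hermione", "ginny", "dumbledore"}


def match_count(friends, wanted):
    """Return how many of the wanted names occur in the friends list."""
    return len(set(friends) & wanted)


# Count the matches for every document.
counts = {name: match_count(friends, wanted) for name, friends in docs.items()}

# Keep only documents with at least 2 and at most 3 matches.
selected = [name for name, n in counts.items() if 2 <= n <= 3]

print(counts)
print(selected)
```

Only Hermione Granger falls inside the threshold, which is the behaviour the question asks Solr to reproduce on the server side.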

Is this possible in a Solr query?

2 answers

Use the termfreq function query to count how many of the specified terms (the friends, in your case) match in each document. Sum the results, then use frange to return only the documents whose sum falls within your threshold, for example:

 {!frange l=2 u=3}sum(termfreq(friends,'ron'),termfreq(friends,'hermione'),termfreq(friends,'ginny'),termfreq(friends,'dumbledore')) 

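termfreq(...) returns 1 for each friend found (assuming each name appears at most once per document, as in the example), and the sum is what gets tested against the threshold: the lower (l) and upper (u) bounds given in the {!frange ...} local parameters at the start of the expression. Both bounds are inclusive by default.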
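If the list of friends comes from application code, the filter string can be generated rather than typed by hand. The following Python sketch is one possible way to do that; the helper name build_frange_filter is hypothetical, and it assumes the values contain no quotes or other characters that would need escaping.

```python
from urllib.parse import urlencode


def build_frange_filter(field, values, lower, upper):
    """Build a {!frange} filter over the sum of termfreq() calls.

    Assumption: the values are plain tokens without quotes; no escaping is done.
    """
    terms = ",".join(f"termfreq({field},'{value}')" for value in values)
    return f"{{!frange l={lower} u={upper}}}sum({terms})"


fq = build_frange_filter(
    "friends", ["ron", "hermione", "ginny", "dumbledore"], 2, 3
)

# URL-encode the filter as request parameters; q=*:* matches all documents,
# so the fq alone decides which ones are returned.
params = urlencode({"q": "*:*", "fq": fq})

print(fq)
print(params)
```

The resulting fq value is the same expression as the one shown above, and the encoded parameters can be appended to the select URL of whatever Solr core you are querying.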

You can put this in either the q or the fq parameter.

[Screenshot: the filter entered in the Solr admin panel]


The easiest way I see is to simply add an "nbOfFriends" field and populate it either at the source or in an UpdateProcessor.
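As a rough sketch of populating such a field at the source, the Python snippet below adds it to a document before indexing. The answer does not say exactly what the field should hold; this assumes it stores the total number of values in "friends", and the helper name with_friend_count is hypothetical.

```python
def with_friend_count(doc):
    """Return a copy of doc with an nbOfFriends field added.

    Assumption: nbOfFriends is the total number of values in "friends".
    """
    enriched = dict(doc)
    enriched["nbOfFriends"] = len(doc.get("friends", []))
    return enriched


ron = {
    "guid": "2",
    "name": "Ron Weasley",
    "friends": ["harry", "hermione", "lavender"],
}

indexed = with_friend_count(ron)
print(indexed["nbOfFriends"])
```

Note that under this assumption the field counts all friends of a document, not how many of them match a given query.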

If you do not want to add this extra field, you could look at joins, but I'm not sure they let you specify the number of children in the join; you would have to check.


Source: https://habr.com/ru/post/1480155/

