Solr query: stop words, OR and AND weirdness

We use Solr 3.5 with a schema that has the following field declaration:

<fieldType name="fieldN" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="1"/>
    <filter class="solr.LengthFilterFactory" min="2" max="256"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="256"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

When we send a query like this:

 field1:"term1" 

Solr returns results.

This query also returns results:

 field1:"term1" AND (field2:term2 OR field3:term2) 

Here term2 is a stop word and term1 is a regular word.

But when we send a request like this:

 field1:"term1" AND (field2:term2 OR field3:term2 OR field4:term2) 

Solr returns nothing.

We also noticed that when we do something like:

 (field1:"term1" AND (field2:term2 OR field3:term2)) OR (field1:"term1" AND field4:term2) 

also works, but since the real query has to look for one term across about 200 fields, this workaround is not practical.

Thanks.

1 answer

I suspect your "weirdness" has more to do with the settings in your solrconfig than with the stop words in your query. I had similar problems with repeated clauses in subqueries, and in the end the cause was the minimum-match rule in my DisMax search handler.

Open solrconfig.xml and find the requestHandler that your search uses. It should declare an "mm" (Minimum Match) parameter. Try adjusting that rule to be less or more restrictive, depending on what you need.
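
For orientation, here is a rough sketch of what such a declaration might look like. The handler name, the qf field list, and the mm value are assumptions for illustration only, not taken from the question's actual configuration:

```xml
<!-- Hypothetical handler: the name, qf fields, and mm value are illustrative -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the DisMax query parser for this handler -->
    <str name="defType">dismax</str>
    <!-- Fields that the user's terms are searched in -->
    <str name="qf">field1 field2 field3 field4</str>
    <!-- Minimum Match: how many optional clauses must match.
         "1" requires only one; "100%" would require all of them. -->
    <str name="mm">1</str>
  </lst>
</requestHandler>
```

If a clause is reduced to nothing by the stop filter for some fields but not for others, a strict mm value may end up demanding matches that cannot occur, so loosening it is a reasonable first experiment.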

Good luck


Source: https://habr.com/ru/post/912403/
