Solr stop words and an empty query

I have a Solr instance with a number of documents and an indexed field.

Now I want to apply a stop word list at query time to increase the number of results, so that the words on the list are completely ignored during the query.

So in my configuration I use solr.StopFilterFactory in the query analyzer.

I expected that a search consisting of a single word from the stop word list would return the same result set as the wildcard query text_title:*, i.e. the complete set of documents.

But instead, I get 0 results. Am I missing something about the behavior of the stop word filter?

1 answer

solr.StopFilterFactory

This filter discards, or stops analysis of, tokens that are on the given stop words list. A standard stop words list is included in the Solr configuration directory, named stopwords.txt, which is appropriate for typical English language text.

https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-StopFilter

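This filter actually removes the matching tokens from your query; it does not replace them with a *. A query made up only of stop words therefore has no terms left after analysis, so it matches nothing rather than everything.
Example: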

In: "To be or what?"
Tokenizer to Filter: "To"(1), "be"(2), "or"(3), "what"(4)
Out: "To"(1), "what"(4)
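
The behavior above can be imitated outside Solr with a rough Python sketch. This is only an illustration, not Solr's implementation: the STOP_WORDS set and the tokenize and stop_filter helpers are hypothetical names, and the stop list is a small made-up sample rather than the contents of stopwords.txt. Matching is case-sensitive by default, which is why "To" survives in the example.

```python
import re

# Hypothetical sample stop list; Solr would read the real one from stopwords.txt.
STOP_WORDS = {"be", "or", "to"}


def tokenize(text):
    """Split text into word tokens and number them from 1, as in the example."""
    words = re.findall(r"\w+", text)
    return [(word, pos) for pos, word in enumerate(words, start=1)]


def stop_filter(tokens, stop_words=STOP_WORDS, ignore_case=False):
    """Drop tokens found in stop_words; survivors keep their original positions."""
    kept = []
    for word, pos in tokens:
        key = word.lower() if ignore_case else word
        if key not in stop_words:
            kept.append((word, pos))
    return kept


# Case-sensitive match: "To" is not equal to "to", so it is kept.
print(stop_filter(tokenize("To be or what?")))  # [('To', 1), ('what', 4)]

# A query made up only of stop words leaves nothing to search for.
print(stop_filter(tokenize("to be")))  # []
```

The second call shows the situation from the question: the filter returns an empty token list, and nothing in it turns that into a match-all query.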

Try using this filter instead:
solr.SuggestStopFilterFactory

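Like Stop Filter, this filter discards, or stops analysis of, tokens that are on the given stop words list. Suggest Stop Filter differs from Stop Filter in that it will not remove the last token unless it is followed by a token separator, such as a space or punctuation.

According to the Solr Reference Guide, you would normally use the ordinary StopFilterFactory in your index analyzer and SuggestStopFilterFactory in your query analyzer.

Example: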

<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SuggestStopFilterFactory" ignoreCase="true" words="stopwords.txt" format="wordset"/>
</analyzer>

Input and output:

In: "The The"
Tokenizer to Filter: "the"(1), "the"(2)
Out: "the"(2)
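
To make the "keep the last token" rule concrete, here is a minimal Python sketch that mimics it. It is not the Lucene implementation: the suggest_stop_filter function and the STOP_WORDS sample are hypothetical, and the sketch assumes whitespace tokenization plus lowercasing, loosely mirroring the analyzer chain in the configuration above.

```python
# Hypothetical sample stop list; Solr would read the real one from stopwords.txt.
STOP_WORDS = {"the", "a", "of"}


def suggest_stop_filter(text, stop_words=STOP_WORDS):
    """Drop stop words, except a final token that has no separator after it."""
    # Whitespace tokenization plus lowercasing, positions numbered from 1.
    tokens = [(word.lower(), pos) for pos, word in enumerate(text.split(), start=1)]
    # If the input ends with a letter or digit, the last token may still be
    # in the middle of being typed, so it is preserved even if it is a stop word.
    ends_with_separator = bool(text) and not text[-1].isalnum()
    kept = []
    for index, (word, pos) in enumerate(tokens):
        is_unfinished_last = index == len(tokens) - 1 and not ends_with_separator
        if word in stop_words and not is_unfinished_last:
            continue
        kept.append((word, pos))
    return kept


print(suggest_stop_filter("The The"))           # [('the', 2)]
print(suggest_stop_filter("find the popular"))  # [('find', 1), ('popular', 3)]
print(suggest_stop_filter("find the "))         # [('find', 1)]
```

In this sketch a query containing only a single stop word with no trailing space keeps that word, so the query is not empty after analysis; it still searches for that term and is not rewritten into a match-all query.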

Source: https://habr.com/ru/post/1668406/

