Solr Snowball stemmer not working correctly with Spanish

Question

I have this field type:

<fieldtype name="textes" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-es.txt" enablePositionIncrements="true"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish" protected="protwords-es.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish" protected="protwords-es.txt"/>
  </analyzer>
</fieldtype>

The expected outcome is that a search for alquileres (rentals) matches alquiler (rental). But when I go to "Field Analysis" in the Solr Admin interface and enter alquiler as the index value and alquileres as the query value, the following happens:

When indexing alquiler, it is stemmed to alquil.
When querying alquileres, it is stemmed to alquiler.

Thus, even the simple case of searching for the plural form of a word (alquileres) does not match its singular form (alquiler).

Shouldn't both the indexed term and the query term be reduced to the same stem (either alquiler or alquil)? Is this a limitation of the algorithm, or a misunderstanding or misconfiguration on my part?

+4

solr stemming porter-stemmer

Chewie Dec 05 '11 at 14:07

3 answers

The lemmatizer at this link handles alquileres correctly:

http://www.molinolabs.com/lematizador.html#alquileres

+2

dimid Jun 20 '13 at 20:54

I am using Hunspell with the OpenOffice dictionaries, and it does a great job.

My example:

 URL-Elastic/_analyze?analyzer=es_AR&text=alquileres

It returns:

 {
   tokens: [
     {
       token: "alquiler",
       start_offset: 0,
       end_offset: 10,
       type: "<ALPHANUM>",
       position: 1
     }
   ]
 }

Link: https://www.openoffice.org/download/index.html

0

Facundo Oct 05 '15 at 19:09

Romain meresse · Accepted Answer · 2011-12-07T15:57:20+0000

The Snowball stemmer is quite limited. You will get better results with a dictionary-based stemmer (the Hunspell stemmer): http://wiki.apache.org/solr/Hunspell
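
As a rough illustration of that suggestion, a field type using Solr's HunspellStemFilterFactory in place of the Snowball filter might look like the sketch below. The field type name textes_hunspell and the dictionary file names es_ES.dic and es_ES.aff are assumptions, not part of the original question; the dictionary files would have to be obtained separately (for example from the OpenOffice Spanish dictionaries) and placed in the core's conf directory.

```xml
<!-- Sketch only: "textes_hunspell", "es_ES.dic" and "es_ES.aff" are assumed names. -->
<fieldType name="textes_hunspell" class="solr.TextField">
  <!-- A single analyzer (no type attribute) applies to both index and query time,
       so both sides are guaranteed to go through the same filter chain. -->
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-es.txt"/>
    <!-- Dictionary-based stemming instead of the algorithmic Snowball stemmer -->
    <filter class="solr.HunspellStemFilterFactory"
            dictionary="es_ES.dic"
            affix="es_ES.aff"
            ignoreCase="true"/>
  </analyzer>
</fieldType>
```

After changing the analysis chain, existing documents need to be reindexed, and the "Field Analysis" page can be used again to check what alquiler and alquileres are reduced to with the chosen dictionary.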
