Using SOLR autocomplete for multiple terms (i.e., comma-separated locations)

I have Solr up and running; it indexes data through DIH and returns correct results for queries. I am trying to configure another core to run a suggester that autocompletes geographic locations. We have a web application that needs to take a city, state/region, and country as input, and we would like to do that in a single input box. Here are some examples:

Brooklyn, New York, United States
Philadelphia, Pennsylvania, USA
Barcelona, Catalonia, Spain

Assume for now that every place in the world can be broken down into this three-part form. I set up DIH with a TemplateTransformer that combines 4 tables (city, state, and country, all independent tables linked to each other by a master table) into a field called "fullplacename":

<field column="fullplacename" template="${city_join.plainname}, ${region_join.plainname}, ${country_join.plainname}"/> 

I defined the text_auto field in schema.xml:

 <fieldType class="solr.TextField" name="text_auto">
   <analyzer>
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

and also defined these two fields:

 <field name="name_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="true" />
 <copyField source="fullplacename" dest="name_autocomplete" />

Now, here is my problem. This works fine for the first term, i.e., if I type "brooklyn", I get the expected results using this URL for the query:

  http://localhost:8983/solr/places/suggest?q=brooklyn

However, as soon as I put a comma and/or a space in there, it splits the input into 2 terms, and I get a suggestion for each:

  http://localhost:8983/solr/places/suggest?q=brooklyn%2C%20ny

This gives me a suggestion for "brooklyn" and a suggestion for "ny" instead of a suggestion that matches "brooklyn, ny". I have tried every solution I could find through Google and had no luck. Is there something simple that I missed, or is this the wrong approach?

Thanks!

EDIT: Just in case, here is the definition of searchComponent and requestHandler:

 <requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
   <lst name="defaults">
     <str name="spellcheck">true</str>
     <str name="spellcheck.dictionary">suggest</str>
     <str name="spellcheck.count">10</str>
   </lst>
   <arr name="components">
     <str>suggest</str>
   </arr>
 </requestHandler>
 <searchComponent name="suggest" class="solr.SpellCheckComponent">
   <lst name="spellchecker">
     <str name="name">suggest</str>
     <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
     <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
     <str name="field">name_autocomplete</str>
   </lst>
 </searchComponent>
3 answers

The problem is with the suggester. Like the spellchecker, it tokenizes the query on whitespace.

http://lucene.472066.n3.nabble.com/suggester-issues-tp3262718p3266140.html has a solution to this problem.
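For illustration, one way to stop the whitespace splitting is to swap the query converter in solrconfig.xml. This is only a sketch: it assumes a Solr version that ships org.apache.solr.spelling.SuggestQueryConverter, and it is not necessarily the fix described in the linked thread.

```xml
<!-- Sketch for solrconfig.xml, placed at the top level of the config. -->
<!-- Assumption: this Solr version includes SuggestQueryConverter. -->
<!-- It hands the whole q string to the field's analyzer instead of
     splitting it on whitespace first, so with the KeywordTokenizer in
     text_auto the input "brooklyn, ny" should stay a single token. -->
<queryConverter name="queryConverter"
                class="org.apache.solr.spelling.SuggestQueryConverter"/>
```

As far as I know the query converter is configured per core, so it would also affect any regular spellcheckers defined in the same core.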


You are using a KeywordTokenizer that will not create separate tokens for "Brooklyn", "NY" and "United States".

Your sample queries are not like autocomplete, but more like regular search queries.

Query auto-completion (IMHO) contains only partial terms:

 http://localhost:8983/solr/places/suggest?q=brook 

for type-ahead listings. You would want to use EdgeNGram for this: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory, most likely in combination with StandardTokenizer and/or WordDelimiterFilterFactory.
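A minimal sketch of what such an EdgeNGram field type could look like in schema.xml. The names "text_edge" and "name_edge" and the gram sizes are made up for illustration; only "fullplacename" comes from the question.

```xml
<!-- Hypothetical field type; the name and gram sizes are illustrative. -->
<fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index every prefix of each token: "b", "br", "bro", ... -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-grams at query time: the typed prefix is matched as-is. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field fed from the question's fullplacename field. -->
<field name="name_edge" type="text_edge" indexed="true" stored="true"/>
<copyField source="fullplacename" dest="name_edge"/>
```

Note that a field like this is meant to be queried through a normal search handler rather than through the Suggester component.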

Request example:

 http://localhost:8983/solr/places/suggest?q=brooklyn%2C%20ny 

The StandardTokenizer in combination with the LowerCaseFilter and a dismax request handler with a well-configured mm parameter, one that restricts hits to those containing all input terms, will work well; see http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29
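As a rough sketch, a dismax handler along those lines might be declared like this in solrconfig.xml. The handler name "/placesearch" and the field "name_edge" are hypothetical; the field is assumed to be analyzed with StandardTokenizer and LowerCaseFilter.

```xml
<!-- Hypothetical handler name and query field, for illustration only. -->
<requestHandler name="/placesearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- Field(s) to search; name_edge is an assumed field name. -->
    <str name="qf">name_edge</str>
    <!-- mm=100% requires every term the user typed to match. -->
    <str name="mm">100%</str>
    <int name="rows">10</int>
  </lst>
</requestHandler>
```

With mm at 100%, a query such as q=brooklyn%2C%20ny should only return places whose field contains matches for both "brooklyn" and "ny".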


I feel the accepted answer is too complicated. An elegant way to do this would be to use http://localhost:8983/solr/places/suggest?spellcheck.q=brooklyn instead of http://localhost:8983/solr/places/suggest?q=brooklyn . As mentioned here


Source: https://habr.com/ru/post/1388985/

