Hyphenated index fields in Elasticsearch

I am trying to work out how to configure Elasticsearch so that I can run wildcard query-string searches against fields whose values contain hyphens.

I have documents that look like this:

    {
      "tags": [ "deck-clothing-blue", "crew-clothing", "medium" ],
      "name": "Crew t-shirt navy large",
      "description": "This is a t-shirt",
      "images": [
        { "id": "ba4a024c96aa6846f289486dfd0223b1", "type": "Image" },
        { "id": "ba4a024c96aa6846f289486dfd022503", "type": "Image" }
      ],
      "type": "InventoryType",
      "header": { }
    }

I tried using the word_delimiter filter and the whitespace tokenizer:

    {
      "settings" : {
        "index" : {
          "number_of_shards" : 1,
          "number_of_replicas" : 1
        },
        "analysis" : {
          "filter" : {
            "tags_filter" : {
              "type" : "word_delimiter",
              "type_table": ["- => ALPHA"]
            }
          },
          "analyzer" : {
            "tags_analyzer" : {
              "type" : "custom",
              "tokenizer" : "whitespace",
              "filter" : ["tags_filter"]
            }
          }
        }
      },
      "mappings" : {
        "yacht1" : {
          "properties" : {
            "tags" : {
              "type" : "string",
              "analyzer" : "tags_analyzer"
            }
          }
        }
      }
    }

But these are the searches (on tags) and their results:

    deck*      -> match
    deck-*     -> no match
    deck-clo*  -> no match

Can anyone see where I'm wrong?

Thanks:)

1 answer

The analyzer is fine (although I would drop the filter), but you have not specified a search analyzer, so the standard analyzer is used when searching the tags field. It strips the hyphen and then runs the query on what is left (run curl "localhost:9200/_analyze?analyzer=standard" -d "deck-*" to see what I mean).

Basically, "deck-*" is searched as "deck". No indexed term is exactly "deck", so the search fails.

"deck-clo*" is searched as "deck clo*". Again, no indexed term is exactly "deck" or begins with "clo", so the query fails.
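The two cases above can be reproduced with a small Python sketch. It does not call Elasticsearch. `rough_standard_analyze`, `whitespace_analyze`, `term_matches` and `search` are hypothetical helpers that only approximate the behaviour described here, and the list of indexed terms assumes the hyphenated tags were stored whole by the whitespace-based analyzer.

```python
import re

# Assumption: these are the terms indexed for the sample document's tags,
# i.e. the whitespace-based analyzer kept each hyphenated tag as one term.
INDEXED_TERMS = ["deck-clothing-blue", "crew-clothing", "medium"]


def rough_standard_analyze(query):
    """Crude stand-in for the standard analyzer: split on anything that is
    not a letter, digit or '*', lowercase, and drop tokens that are only '*'."""
    tokens = re.split(r"[^0-9A-Za-z*]+", query.lower())
    return [t for t in tokens if t.strip("*")]


def whitespace_analyze(query):
    """Whitespace tokenizer plus lowercase: hyphens are left in place."""
    return query.lower().split()


def term_matches(pattern, terms):
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return any(t.startswith(pattern[:-1]) for t in terms)
    return pattern in terms


def search(query, analyze, terms=INDEXED_TERMS):
    """A query matches if any of its analyzed tokens matches an indexed term."""
    return any(term_matches(tok, terms) for tok in analyze(query))


if __name__ == "__main__":
    for q in ["deck*", "deck-*", "deck-clo*"]:
        print(q, rough_standard_analyze(q), search(q, rough_standard_analyze),
              whitespace_analyze(q), search(q, whitespace_analyze))
```

With the hyphen-splitting analyzer only `deck*` finds a term. With the whitespace version all three queries do, which is the motivation for using a whitespace-based analyzer at search time as well.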

I would make the following changes:

    "analysis" : {
      "analyzer" : {
        "default" : {
          "tokenizer" : "whitespace",
          "filter" : ["lowercase"]   <--- you don't need this, just thought it was a nice touch
        }
      }
    }

Then get rid of the custom analyzer on tags:

    "mappings" : {
      "yacht1" : {
        "properties" : {
          "tags" : {
            "type" : "string"
          }
        }
      }
    }
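For reference, here is a sketch of how the two fragments might fit into one index-creation body, built as a Python dict and printed as JSON. `build_index_body` is a hypothetical helper. The shard and replica settings and the `yacht1` type are taken from the question, and placing `analysis` next to `index` under `settings` follows the question's layout. The inline note from the fragment above is left out because JSON does not allow comments.

```python
import json


def build_index_body(type_name="yacht1"):
    """Assemble the suggested settings and mappings (illustrative only)."""
    return {
        "settings": {
            "index": {"number_of_shards": 1, "number_of_replicas": 1},
            "analysis": {
                "analyzer": {
                    # "default" applies at both index and search time,
                    # so hyphens survive in queries as well.
                    "default": {
                        "tokenizer": "whitespace",
                        "filter": ["lowercase"],  # optional, per the answer
                    }
                }
            },
        },
        "mappings": {
            # No per-field analyzer: tags falls back to the default above.
            type_name: {"properties": {"tags": {"type": "string"}}}
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
```

The printed JSON could then be sent as the body of the index-creation request. The index would need to be recreated and the documents reindexed for the new analyzer to take effect.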

Let me know how it goes.


Source: https://habr.com/ru/post/945616/

