I have the following data for indexing on ElasticSearch.

I want to implement the autocomplete function and highlight why a particular document matches the request.
These are my index settings:
{
"settings": {
"number_of_shards": 1,
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 15
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"autocomplete_filter"
]
}
}
}
}
}
Index Analysis
- Separation of text at word boundaries.
- Removes pontuation.
- lower case
- Edge NGrams each token
So, the inverted index is as follows:

This is how I defined the mappings for the name field:
{
"index_type": {
"properties": {
"name": {
"type": "string",
"index_analyzer": "autocomplete",
"search_analyzer": "standard"
}
}
}
}
When I request:
GET http://localhost:9200/index/type/_search
{
"query": {
"match": {
"name": "soft"
}
},
"highlight": {
"fields" : {
"name" : {}
}
}
}
Search: soft
Using a standard tokenizer, “soft” is a term that can be found on an inverted index. This search matches the documents: 1, 3, 4, 5, 6, 7, which is true, but the highlighted part that I would expect to be “soft” rather than the whole word:
{
"hits": [
{
"_source": {
"name": "SoftwareRocks everytime"
},
"highlight": {
"name": [
"<em>SoftwareRocks</em> everytime"
]
}
},
{
"_source": {
"name": "Software AG"
},
"highlight": {
"name": [
"<em>Software</em> AG"
]
}
},
{
"_source": {
"name": "Software AG2"
},
"highlight": {
"name": [
"<em>Software</em> AG2"
]
}
},
{
"_source": {
"name": "Op Software AG good software better"
},
"highlight": {
"name": [
"Op <em>Software</em> AG good <em>software</em> better"
]
}
},
{
"_source": {
"name": "Op Software AG"
},
"highlight": {
"name": [
"Op <em>Software</em> AG"
]
}
},
{
"_source": {
"name": "is soft ware ok"
},
"highlight": {
"name": [
"is <em>soft</em> ware ok"
]
}
}
]
}
Search: ag software
, " " " " "" , . : 1, 3, 4, 5, 6, , , " " "" , " " "" :
{
"hits": [
{
"_source": {
"name": "Software AG"
},
"highlight": {
"name": [
"<em>Software</em> <em>AG</em>"
]
}
},
{
"_source": {
"name": "Software AG2"
},
"highlight": {
"name": [
"<em>Software</em> <em>AG2</em>"
]
}
},
{
"_source": {
"name": "Op Software AG"
},
"highlight": {
"name": [
"Op <em>Software</em> <em>AG</em>"
]
}
},
{
"_source": {
"name": "Op Software AG good software better"
},
"highlight": {
"name": [
"Op <em>Software</em> <em>AG</em> good <em>software</em> better"
]
}
},
{
"_source": {
"name": "SoftwareRocks everytime"
},
"highlight": {
"name": [
"<em>SoftwareRocks</em> everytime"
]
}
}
]
}
elasticsearch, , . , , .
- , ?
, -, - ElasticSearch . , .
, , ElasticSearch, , ( , , ).
( PHP):
public function search($term)
{
$params = [
'index' => $this->getIndexName(),
'type' => $this->getIndexType(),
'body' => [
'query' => [
'match' => [
'name' => $term
]
]
]
];
$results = $this->client->search($params);
$hits = $results['hits']['hits'];
$data = [];
$wrapBefore = '<strong>';
$wrapAfter = '</strong>';
foreach ($hits as $hit) {
$data[] = [
$hit['_source']['id'],
$hit['_source']['name'],
preg_replace("/($term)/i", "$wrapBefore$1$wrapAfter", strip_tags($hit['_source']['name']))
];
}
return $data;
}
, :

, , ElasticSearch , .