Updated Answer
We ended up removing the custom plugin from the original answer because it was hard to get it working in Elastic Cloud. Instead, we simply created a separate document for autocomplete and removed the autocomplete data from all our other documents.
An object
```java
import com.fasterxml.jackson.annotation.JsonCreator;

import java.util.Map;

public class Suggest {

    private String autocompleteOutput;
    private Map<String, AutoComplete> autoComplete;

    @JsonCreator
    Suggest() {
    }

    public Suggest(String autocompleteOutput, Map<String, AutoComplete> autoComplete) {
        this.autocompleteOutput = autocompleteOutput;
        this.autoComplete = autoComplete;
    }

    public String getAutocompleteOutput() {
        return autocompleteOutput;
    }

    public void setAutocompleteOutput(String autocompleteOutput) {
        this.autocompleteOutput = autocompleteOutput;
    }

    public Map<String, AutoComplete> getAutoComplete() {
        return autoComplete;
    }

    public void setAutoComplete(Map<String, AutoComplete> autoComplete) {
        this.autoComplete = autoComplete;
    }
}

public class AutoComplete {

    private String[] input;

    @JsonCreator
    AutoComplete() {
    }

    public AutoComplete(String[] input) {
        this.input = input;
    }

    public String[] getInput() {
        return input;
    }
}
```
with the following mapping:
```json
{
  "suggest": {
    "dynamic_templates": [
      {
        "autocomplete": {
          "path_match": "autoComplete.*",
          "match_mapping_type": "*",
          "mapping": {
            "type": "completion",
            "analyzer": "lowercase_keyword_analyzer"
          }
        }
      }
    ],
    "properties": {}
  }
}
```
This allows us to use the autocompleteOutput field from _source
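For illustration, an indexed suggest document could then look like this in _source. The `colors` key and the example values are hypothetical; the field names come from the classes above:

```json
{
  "autocompleteOutput": "All Available Colors",
  "autoComplete": {
    "colors": {
      "input": [
        "all available colors",
        "all colors available",
        "available all colors",
        "available colors all",
        "colors all available",
        "colors available all"
      ]
    }
  }
}
```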
Original answer
After some research, I ended up creating a new Elasticsearch 5.1.1 plugin.
Create a Lucene filter
```java
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

import java.io.IOException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class PermutationTokenFilter extends TokenFilter {

    private final CharTermAttribute charTermAtt;
    private final PositionIncrementAttribute posIncrAtt;
    private final OffsetAttribute offsetAtt;
    private Iterator<String> permutations;
    private int origOffset;

    protected PermutationTokenFilter(TokenStream input) {
        super(input);
        this.charTermAtt = addAttribute(CharTermAttribute.class);
        this.posIncrAtt = addAttribute(PositionIncrementAttribute.class);
        this.offsetAtt = addAttribute(OffsetAttribute.class);
    }

    @Override
    public final boolean incrementToken() throws IOException {
        // NOTE: the original snippet was cut off from this point; the body below
        // is a reconstruction of the behaviour described in the text.
        while (true) {
            if (permutations == null) {
                // no permutations pending: read the next token from the stream
                if (!input.incrementToken()) {
                    return false;
                }
                origOffset = offsetAtt.startOffset();
                permutations = permute(charTermAtt.toString()).iterator();
                // the first permutation replaces the original token in place
                charTermAtt.setEmpty().append(permutations.next());
                return true;
            }
            if (permutations.hasNext()) {
                // emit the remaining permutations at the same position (increment 0)
                String permutation = permutations.next();
                charTermAtt.setEmpty().append(permutation);
                offsetAtt.setOffset(origOffset, origOffset + permutation.length());
                posIncrAtt.setPositionIncrement(0);
                return true;
            }
            permutations = null; // current token exhausted; continue with the next one
        }
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        permutations = null;
    }

    /** Builds every word-order permutation of the whitespace-split input. */
    private static Set<String> permute(String value) {
        Set<String> result = new HashSet<>();
        permute(value.split("\\s+"), 0, result);
        return result;
    }

    private static void permute(String[] words, int start, Set<String> result) {
        if (start >= words.length - 1) {
            result.add(String.join(" ", words));
            return;
        }
        for (int i = start; i < words.length; i++) {
            swap(words, start, i);       // fix one word at position `start`
            permute(words, start + 1, result);
            swap(words, start, i);       // backtrack
        }
    }

    private static void swap(String[] words, int i, int j) {
        String tmp = words[i];
        words[i] = words[j];
        words[j] = tmp;
    }
}
```
This filter accepts the original input token, e.g. All Available Colors, and rearranges it into all possible word-order combinations (see the original question).
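As a sanity check, the word-permutation step can be sketched independently of Lucene. The `PermutationDemo` class below is illustrative only and not part of the plugin:

```java
import java.util.HashSet;
import java.util.Set;

// Standalone sketch of the word-order permutation step, outside Lucene.
public class PermutationDemo {

    /** Returns every word-order permutation of the whitespace-split input. */
    static Set<String> permute(String value) {
        Set<String> result = new HashSet<>();
        permute(value.split("\\s+"), 0, result);
        return result;
    }

    private static void permute(String[] words, int start, Set<String> result) {
        if (start >= words.length - 1) {
            result.add(String.join(" ", words));
            return;
        }
        for (int i = start; i < words.length; i++) {
            swap(words, start, i);       // fix one word at position `start`
            permute(words, start + 1, result);
            swap(words, start, i);       // backtrack
        }
    }

    private static void swap(String[] words, int i, int j) {
        String tmp = words[i];
        words[i] = words[j];
        words[j] = tmp;
    }

    public static void main(String[] args) {
        // a three-word input expands to 3! = 6 token variants
        permute("All Available Colors").forEach(System.out::println);
    }
}
```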
Create a factory
```java
import org.apache.lucene.analysis.TokenStream;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.env.Environment;
import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.analysis.AbstractTokenFilterFactory;

public class PermutationTokenFilterFactory extends AbstractTokenFilterFactory {

    public PermutationTokenFilterFactory(IndexSettings indexSettings, Environment environment,
                                         String name, Settings settings) {
        super(indexSettings, name, settings);
    }

    @Override
    public PermutationTokenFilter create(TokenStream input) {
        return new PermutationTokenFilter(input);
    }
}
```
This class is required to provide a filter for the Elasticsearch plugin.
Create an Elasticsearch Plugin
Follow this guide to set up the configuration needed for the Elasticsearch plugin.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>be.smartspoken</groupId>
    <artifactId>permutation-plugin</artifactId>
    <version>5.1.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>Plugin: Permutation</name>
    <description>Permutation plugin for elasticsearch</description>

    <properties>
        <lucene.version>6.3.0</lucene.version>
        <elasticsearch.version>5.1.1</elasticsearch.version>
        <java.version>1.8</java.version>
        <log4j2.version>2.7</log4j2.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>${log4j2.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>${log4j2.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-test-framework</artifactId>
            <version>${lucene.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>${lucene.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-common</artifactId>
            <version>${lucene.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${elasticsearch.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <build>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
                <filtering>false</filtering>
                <excludes>
                    <exclude>*.properties</exclude>
                </excludes>
            </resource>
        </resources>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.6</version>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <outputDirectory>${project.build.directory}/releases/</outputDirectory>
                    <descriptors>
                        <descriptor>${basedir}/src/main/assemblies/plugin.xml</descriptor>
                    </descriptors>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <source>${java.version}</source>
                    <target>${java.version}</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
```
Make sure that you use the correct versions of Elasticsearch, Lucene, and Log4j 2 in the pom.xml file, and that you specify the correct configuration files.
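Elasticsearch 5.x plugins are also expected to ship a plugin-descriptor.properties file inside the plugin zip. A minimal sketch for this plugin might look like the following; the fully qualified class name is an assumption, since the original post does not show the plugin class's package:

```properties
description=Permutation token filter plugin for Elasticsearch
version=5.1.1-SNAPSHOT
name=permutation-plugin
# assumed package; adjust to wherever PermutationPlugin actually lives
classname=be.smartspoken.plugin.permutation.PermutationPlugin
java.version=1.8
elasticsearch.version=5.1.1
```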
```java
import be.smartspoken.plugin.permutation.filter.PermutationTokenFilterFactory;
import org.elasticsearch.index.analysis.TokenFilterFactory;
import org.elasticsearch.indices.analysis.AnalysisModule;
import org.elasticsearch.plugins.AnalysisPlugin;
import org.elasticsearch.plugins.Plugin;

import java.util.HashMap;
import java.util.Map;

public class PermutationPlugin extends Plugin implements AnalysisPlugin {

    @Override
    public Map<String, AnalysisModule.AnalysisProvider<TokenFilterFactory>> getTokenFilters() {
        Map<String, AnalysisModule.AnalysisProvider<TokenFilterFactory>> extra = new HashMap<>();
        extra.put("permutation", PermutationTokenFilterFactory::new);
        return extra;
    }
}
```
This class provides the filter factory to the plugin.
After installing the new plugin, you need to restart Elasticsearch.
Use the plugin
Add a new custom analyzer that mimics the 2.x functionality:
```java
Settings.builder()
    .put("number_of_shards", 2)
    .loadFromSource(jsonBuilder()
        .startObject()
            .startObject("analysis")
                .startObject("analyzer")
                    .startObject("permutation_analyzer")
                        .field("tokenizer", "keyword")
                        .field("filter", new String[]{"permutation", "lowercase"})
                    .endObject()
                .endObject()
            .endObject()
        .endObject().string())
    .loadFromSource(jsonBuilder()
        .startObject()
            .startObject("analysis")
                .startObject("analyzer")
                    .startObject("lowercase_keyword_analyzer")
                        .field("tokenizer", "keyword")
                        .field("filter", new String[]{"lowercase"})
                    .endObject()
                .endObject()
            .endObject()
        .endObject().string())
    .build();
```
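For reference, the two loadFromSource calls above correspond to the following index settings JSON:

```json
{
  "analysis": {
    "analyzer": {
      "permutation_analyzer": {
        "tokenizer": "keyword",
        "filter": ["permutation", "lowercase"]
      },
      "lowercase_keyword_analyzer": {
        "tokenizer": "keyword",
        "filter": ["lowercase"]
      }
    }
  }
}
```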
Now you only need to set the custom analyzers on the matching objects:
```json
{
  "my_object": {
    "dynamic_templates": [
      {
        "autocomplete": {
          "path_match": "my.autocomplete.object.path",
          "match_mapping_type": "*",
          "mapping": {
            "type": "completion",
            "analyzer": "permutation_analyzer",
            "search_analyzer": "lowercase_keyword_analyzer"
          }
        }
      }
    ],
    "properties": {}
  }
}
```
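With the mapping in place, a completion suggester query (Elasticsearch 5.x search-body syntax) could look like the sketch below; the index name, suggestion name, and prefix are placeholders, and the field path matches the dynamic template above:

```json
POST /my_index/_search
{
  "suggest": {
    "autocomplete-suggest": {
      "prefix": "available col",
      "completion": {
        "field": "my.autocomplete.object.path"
      }
    }
  }
}
```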
This also improves performance, because you no longer have to wait for the permutations to be created.