How do I configure my index to use BM25 in Elasticsearch using the Java API?

I am trying to migrate from a MySQL database to Elasticsearch in order to use full-text search with BM25 similarity for each field. I use Java to retrieve records from MySQL and add them to the Elasticsearch index.

I am building my index using the Java index API, but I cannot figure out how to set the BM25 similarity on my fields.

Consider a MySQL table named products, and an index named dev with products as the index type.

The source table products contains the following fields:

  • ID
  • title
  • Description

You can find the code on my GitHub if you want to take a look. It is a forked project that I set up with Maven integration.

Any suggestion or any help is appreciated, thanks!

1 answer

I found the answer to my question.

Here is the code:

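import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;

Settings settings = ImmutableSettings
            .settingsBuilder()
            .put("cluster.name", "es_cluster_name")
            // Define similarity module settings
            .put("similarity.custom.type", "BM25")
            .put("similarity.custom.k1", 2.0f)
            .put("similarity.custom.b", 1.5f)
            .build();

// Create the client with the similarity settings applied
Client client = new TransportClient(settings);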

It seems that you can define the similarity modules that you want to use in the settings before creating an instance of your client.

Elasticsearch currently supports the following similarity modules: default, BM25, DFR, IB, LMDirichlet and LMJelinekMercer. You can specify which one you want to use in the settings, as shown below:

   .put("similarity.custom.type", "..." )

Each similarity has its own parameters, which you will also want to configure to use it properly.
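For BM25, the two parameters set in the code above are k1 (how quickly repeated occurrences of a term stop adding to the score) and b (how strongly the score is normalized by document length, usually kept between 0 and 1). To get a feel for their effect, here is a minimal sketch of the textbook BM25 term score in plain Java. It is not Lucene's or Elasticsearch's actual implementation, and the class and method names (Bm25Sketch, idf, termScore) are hypothetical, chosen only for illustration.

```java
/** Minimal BM25 term-score sketch; illustrative only, not Lucene's implementation. */
final class Bm25Sketch {

    /** Inverse document frequency in a form commonly paired with BM25. */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /**
     * Textbook BM25 score of one term in one document.
     *
     * @param tf        term frequency in the document
     * @param docLen    length of the document (in terms)
     * @param avgDocLen average document length in the collection
     * @param idf       inverse document frequency of the term
     * @param k1        term-frequency saturation parameter
     * @param b         length-normalization parameter
     */
    static double termScore(double tf, double docLen, double avgDocLen,
                            double idf, double k1, double b) {
        // With b = 0 the document length is ignored; with b = 1 it is fully applied.
        double lengthNorm = 1.0 - b + b * (docLen / avgDocLen);
        return idf * (tf * (k1 + 1.0)) / (tf + k1 * lengthNorm);
    }

    public static void main(String[] args) {
        double idf = idf(1000, 10); // assumed collection: 1000 docs, term appears in 10
        // Same term frequency, short vs. long document (average length assumed to be 100).
        System.out.println("short doc: " + termScore(2, 50, 100, idf, 1.2, 0.75));
        System.out.println("long doc:  " + termScore(2, 200, 100, idf, 1.2, 0.75));
    }
}
```

Trying a few values of k1 and b in this sketch before putting them into the similarity settings can help in choosing reasonable starting points; the scores Elasticsearch reports will differ in detail.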

Note: Code tested with Elasticsearch 1.1.0.


Source: https://habr.com/ru/post/1536415/

