Elasticsearch Huge Data Index

Question

I am new to Elasticsearch and have a large amount of data (over 16 thousand very large rows in a MySQL table). I need to push this data into Elasticsearch and am running into problems indexing it. Is there a way to make indexing faster? How should I deal with huge data?

elasticsearch

Vipul May 21 '12 at 7:41

3 answers

Kirk backus · Answer 1 · 2014-05-29T20:25:06+0000

Expanding on the Bulk API

You make a POST request to the /_bulk endpoint.

Your payload will conform to the following format, where \n is the newline character.

 action_and_meta_data\n
 optional_source\n
 action_and_meta_data\n
 optional_source\n
 ...

Make sure your JSON is not pretty-printed: each action line and each source line must be a single line of JSON.

Available actions are index, create, update and delete.
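As a rough illustration of how each action maps onto lines of the payload, here is a small Python sketch. The helper name bulk_entry is made up for this example. It relies on two details of the Bulk API: delete takes no source line, and update expects the partial document wrapped in a "doc" key.

```python
import json

VALID_ACTIONS = ("index", "create", "update", "delete")

def bulk_entry(action, meta, source=None):
    """Return the newline-terminated lines for one bulk operation.

    bulk_entry is a hypothetical helper, not part of any client library.
    """
    if action not in VALID_ACTIONS:
        raise ValueError("unknown bulk action: %r" % action)
    # json.dumps without indent emits a single line, as the bulk format requires.
    lines = [json.dumps({action: meta})]
    if action != "delete":
        if source is None:
            raise ValueError("%s needs a source document" % action)
        # update takes a partial document under "doc"; index/create take the document itself.
        body = {"doc": source} if action == "update" else source
        lines.append(json.dumps(body))
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    meta = {"_index": "test", "_id": "3"}
    print(bulk_entry("create", meta, {"field1": "value3"}), end="")
    print(bulk_entry("delete", meta), end="")
```

Joining the strings returned for several operations gives a body that can be posted to /_bulk as-is.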

Bulk upload example

To answer your question: if you just want to bulk-load data into your index, the payload looks like this.

 { "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
 { "field1" : "value3" }

The first line contains the action and metadata. In this case, we are calling create. We will insert a document of type type1 into the index named test with a manually assigned id of 3 (instead of letting Elasticsearch auto-generate one).

The second line contains all the fields in your mapping, which in this example is just field1 with the value value3.

You just concatenate as many of these pairs as you want to insert into your index.
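For a table with thousands of large rows, it usually helps to send the pairs in batches rather than in one giant request. Below is a minimal Python sketch of that idea using only the standard library. The function names, the id column, the batch size and the localhost URL are all assumptions for illustration. It uses the index action so reruns overwrite rather than fail, and it leaves out _type because recent Elasticsearch versions removed mapping types (add it back to the metadata on an old cluster).

```python
import json
import urllib.request

def bulk_body(rows, index="test", id_field="id"):
    """Build one bulk payload: an action line plus a source line per row."""
    parts = []
    for row in rows:
        parts.append(json.dumps({"index": {"_index": index, "_id": str(row[id_field])}}))
        parts.append(json.dumps(row))
    # The bulk body must end with a newline.
    return "\n".join(parts) + "\n"

def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable of rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def send_bulk(body, url="http://localhost:9200/_bulk"):
    """POST one payload; the URL is an assumed local node."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Stand-in rows; in practice these would come from a MySQL cursor.
    rows = [{"id": i, "field1": "value%d" % i} for i in range(5)]
    for chunk in batches(rows, 2):
        print(bulk_body(chunk), end="")  # swap print for send_bulk(...) against a real node
```

The right batch size depends on document size and cluster capacity, so treat the value here as a placeholder to tune.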

Nate · Answer 2 · 2013-08-19T12:24:23+0000

This may be an old thread, but I will still comment for anyone looking for a solution to this problem. The JDBC River plugin for Elasticsearch is very useful for importing data from a wide range of supported databases.

The link to the JDBC River source is here. Using curl's PUT command from Git Bash, I submitted the following configuration document to allow communication between the ES instance and the MySQL instance:

 curl -XPUT 'localhost:9200/_river/uber/_meta' -d '{
   "type" : "jdbc",
   "jdbc" : {
     "strategy" : "simple",
     "driver" : "com.mysql.jdbc.Driver",
     "url" : "jdbc:mysql://localhost:3306/elastic",
     "user" : "root",
     "password" : "root",
     "sql" : "select * from tbl_indexed",
     "poll" : "24h",
     "max_retries" : 3,
     "max_retries_wait" : "10s"
   },
   "index" : {
     "index" : "uber",
     "type" : "uber",
     "bulk_size" : 100
   }
 }'

Make sure the mysql-connector-java-VERSION-bin JAR is in the river-jdbc plugin directory, which holds the JAR files the JDBC river needs.

devendram · Answer 3 · 2012-05-24T08:44:12+0000

Try the bulk API:

http://www.elasticsearch.org/guide/reference/api/bulk.html
