Elasticsearch Bulk Index JSON Data

I am trying to bulk index a JSON file into a new Elasticsearch index and cannot get it to work. I have the following sample data in JSON:

[{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}, {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}, {"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"}, {"Amount": "2115", "Quantity": "2", "Id": "975463798", "Client_Store_sk": "1109"}, {"Amount": "2116", "Quantity": "1", "Id": "975463827", "Client_Store_sk": "1109"}, {"Amount": "648", "Quantity": "3", "Id": "975464139", "Client_Store_sk": "1109"}, {"Amount": "2126", "Quantity": "2", "Id": "975464805", "Client_Store_sk": "1109"}, {"Amount": "2133", "Quantity": "1", "Id": "975464061", "Client_Store_sk": "1109"}, {"Amount": "1339", "Quantity": "4", "Id": "974919458", "Client_Store_sk": "1109"}, {"Amount": "1196", "Quantity": "5", "Id": "974920538", "Client_Store_sk": "1109"}, {"Amount": "1198", "Quantity": "4", "Id": "975463638", "Client_Store_sk": "1109"}, {"Amount": "1345", "Quantity": "4", "Id": "974919522", "Client_Store_sk": "1109"}, {"Amount": "1347", "Quantity": "2", "Id": "974919563", "Client_Store_sk": "1109"}, {"Amount": "673", "Quantity": "2", "Id": "975464359", "Client_Store_sk": "1109"}, {"Amount": "2153", "Quantity": "1", "Id": "975464511", "Client_Store_sk": "1109"}, {"Amount": "3896", "Quantity": "4", "Id": "977289342", "Client_Store_sk": "1109"}, {"Amount": "3897", "Quantity": "4", "Id": "974920602", "Client_Store_sk": "1109"}] 

I use

  curl -XPOST localhost:9200/index_local/my_doc_type/_bulk --data-binary --data @/home/data1.json 

When I try to use the standard bulk index API from Elasticsearch, I get this error:

  error: {"message":"ActionRequestValidationException[Validation Failed: 1: no requests added;]"} 

Can anyone help with indexing this type of JSON?

+18
3 answers

What you need to do is read that JSON file and then build a bulk request in the format expected by the _bulk endpoint, i.e. one line for the command and one line for the document, each terminated by a newline. Rinse and repeat for each document:

 curl -XPOST localhost:9200/your_index/_bulk -d '
 {"index": {"_index": "your_index", "_type": "your_type", "_id": "975463711"}}
 {"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
 {"index": {"_index": "your_index", "_type": "your_type", "_id": "975463943"}}
 {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
 ... etc for all your documents
 '

Just replace your_index and your_type with the actual names of the index and type you are using.
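
If you would rather not build that file by hand, here is a minimal Python sketch of the conversion. It assumes the input is a JSON array of flat objects like the sample in the question. The names to_bulk_body and convert_file and the id_field parameter are made up for illustration; they are not part of any Elasticsearch client.

```python
import json


def to_bulk_body(docs, id_field=None):
    """Return a bulk body: one action line, then one document line, per doc."""
    lines = []
    for doc in docs:
        action = {"index": {}}
        if id_field is not None:
            # Assumption: reuse one field of the document as its _id.
            action["index"]["_id"] = doc[id_field]
        # json.dumps without indent keeps each object on a single line.
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def convert_file(src_path, dst_path, id_field=None):
    """Read a JSON array from src_path and write the bulk body to dst_path."""
    with open(src_path) as src:
        docs = json.load(src)
    with open(dst_path, "w") as dst:
        dst.write(to_bulk_body(docs, id_field))
```

For example, convert_file("/home/data1.json", "/home/data1.bulk.json", id_field="Id") would write a file (the output path is only an example) that can be posted with --data-binary. Leave out id_field to let Elasticsearch generate the ids.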

UPDATE

Please note that you can shorten the action line by removing _index and _type if they are specified in your URL. You can also remove _id if you specify the path to your id field in your mapping (note that this feature will be deprecated in ES 2.0). At a minimum, the action line can be {"index":{}} for every document, but it is always required, because it indicates which operation you want to perform (in this case, index the document).

UPDATE 2

 curl -XPOST localhost:9200/index_local/my_doc_type/_bulk --data-binary @/home/data1.json 

/home/data1.json should look like this:

 {"index":{}}
 {"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
 {"index":{}}
 {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
 {"index":{}}
 {"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"}
+32

As of this writing, 6.1.2 is the latest version of Elasticsearch, and the curl command that works for me on Windows (x64) is:

 curl -s -XPOST localhost:9200/my_index/my_index_type/_bulk -H "Content-Type: application/x-ndjson" --data-binary @D:\data\mydata.json 

The format of the data in mydata.json is the same as shown in @val's answer.
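
The same request can be sketched without curl. This is only an illustration using Python's standard library; build_bulk_request is a made-up helper name, and the URL in the usage note is the one from the command above.

```python
import urllib.request


def build_bulk_request(url, bulk_body):
    """Build (but do not send) a POST request for the _bulk endpoint."""
    return urllib.request.Request(
        url,
        # Send the body as raw bytes so the newlines are preserved.
        data=bulk_body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
```

To actually send it, pass the result to urllib.request.urlopen, e.g. urllib.request.urlopen(build_bulk_request("http://localhost:9200/my_index/my_index_type/_bulk", body)), where body is the contents of mydata.json.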

+4

A valid Elasticsearch bulk API request looks something like this (the body must end with a newline):

POST http://localhost:9200/products_slo_development_temp_2/productModel/_bulk

 { "index":{ } }
 {"RequestedCountry":"slo","Id":1860,"Title":"Stol"}
 { "index":{ } }
 {"RequestedCountry":"slo","Id":1860,"Title":"Miza"}

API documentation for Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

This is how I do it:

I send an HTTP POST request, with the uri variable as the URL of the request and the elasticsearchJson variable as the request body, formatted for the Elasticsearch bulk API:

 var uri = @"/" + indexName + "/productModel/_bulk";
 var json = JsonConvert.SerializeObject(sqlResult);
 var elasticsearchJson = GetElasticsearchBulkJsonFromJson(json, "RequestedCountry");

A helper method for generating the JSON format required by the Elasticsearch bulk API:

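 public string GetElasticsearchBulkJsonFromJson(
     string jsonStringWithArrayOfObjects,
     string firstParameterNameOfObjectInJsonStringArrayOfObjects)
 {
     // Action line that must precede every document in a bulk request.
     const string action = "{ \"index\":{ } }\n";
     // Opening of each serialized object, e.g. {"RequestedCountry"
     string objectStart = "{\"" + firstParameterNameOfObjectInJsonStringArrayOfObjects + "\"";

     // Strip the enclosing [ and ], then turn every ",{" object boundary into
     // newline + action line + object start. The body must end with a newline.
     return action
         + jsonStringWithArrayOfObjects
             .Substring(1, jsonStringWithArrayOfObjects.Length - 2)
             .Replace("," + objectStart, "\n" + action + objectStart)
         + "\n";
 }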

The first property in each of my JSON objects is RequestedCountry, so I use it in this example to detect where one object ends and the next begins.

productModel is my document type. sqlResult is a generic C# list of products.

0

Source: https://habr.com/ru/post/1234509/

