We have been working with Elasticsearch 2.x for quite some time. It meets all of our requirements except for one weak point: write/indexing performance on the Elasticsearch cluster is poor.
In our case, the ES cluster has 8 nodes, and the documents we index have 100 fields each. Indexing throughput is around 50,000 documents per minute, which is too slow for our scenario. We have tried all the tuning methods recommended on www.elastic.co. The fastest approach we have found is to write the JSON payloads to files and then upload them to ES using the bulk API. Even so, indexing is still too slow.
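To make the bulk approach concrete, here is a minimal Python sketch of how such payloads can be built and sent. It is illustrative only, not our exact code: the index name `my_index`, the type name `doc`, the batch size of 1000, and the `http://localhost:9200` address are all placeholder assumptions, and `bulk_batches` and `post_bulk` are hypothetical helper names.

```python
import json
import urllib.request
from itertools import islice


def bulk_batches(docs, index, doc_type, batch_size=1000):
    """Yield bulk request bodies (newline-delimited JSON), batch_size docs each."""
    it = iter(docs)
    # Every document is preceded by an action line telling ES where to index it.
    # "_type" is included because ES 2.x still uses mapping types.
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        lines = []
        for doc in chunk:
            lines.append(action)
            lines.append(json.dumps(doc))
        # The bulk body must end with a trailing newline.
        yield "\n".join(lines) + "\n"


def post_bulk(body, url="http://localhost:9200/_bulk"):
    """POST one bulk body to the _bulk endpoint (placeholder URL) and return the parsed response."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        # Newer ES versions expect this content type for bulk requests.
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Each body yielded by `bulk_batches(my_docs, "my_index", "doc")` can be passed to `post_bulk`, or written to a file and uploaded separately. The batch size is the main knob in this sketch; the right value depends on document size and cluster capacity.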
I have seen several ES-Hadoop connectors, and Elasticsearch also has Spark support, where you can use saveToEs() to save an RDD to ES. I suspect they all use the ES API underneath. Can anyone share their experience with them? What is the fastest way to index data into Elasticsearch?