500,000 events per minute works out to about 8,333 events per second, which should be quite easy for a small cluster (3-5 machines) to handle.
The problem is that 720M documents per day kept open for 60 days adds up to 43B documents. If each of the 10 fields is 32 bytes, that is 13.8 TB of disk space (about 28 TB with a single replica).
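To make that arithmetic explicit, here is a quick back-of-the-envelope sizing sketch in Python; the 32 bytes per field, 60-day retention, and single replica are the assumptions stated above, not measured numbers:

```python
# Back-of-the-envelope sizing for the figures discussed above.
EVENTS_PER_MINUTE = 500_000
DOCS_PER_DAY = 720_000_000      # 720M documents indexed per day
RETENTION_DAYS = 60
FIELDS_PER_DOC = 10
BYTES_PER_FIELD = 32            # rough per-field assumption
REPLICAS = 1                    # one replica doubles on-disk size

events_per_second = EVENTS_PER_MINUTE / 60
total_docs = DOCS_PER_DAY * RETENTION_DAYS
primary_bytes = total_docs * FIELDS_PER_DOC * BYTES_PER_FIELD
total_bytes = primary_bytes * (1 + REPLICAS)

print(f"{events_per_second:,.0f} events/sec")             # ~8,333
print(f"{total_docs / 1e9:,.1f}B documents retained")      # ~43.2B
print(f"{primary_bytes / 1e12:,.1f} TB primary storage")   # ~13.8 TB
print(f"{total_bytes / 1e12:,.1f} TB with one replica")    # ~27.6 TB
```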
For comparison, I have 5 maxed-out nodes (64 GB of RAM, 31 GB heap) holding 1.2B documents that consume 1.2 TB of disk space (doubled by one replica). That cluster could not handle the load with only 32 GB of RAM per machine, but it is happy now with 64 GB. This is 10 days of data for us.
Roughly, you can expect to have about 40x the number of documents and 10x the disk space of my cluster.
I don't have the exact numbers in front of me, but our pilot project using doc_values is giving us something like 90% heap savings.
If all that math holds, and doc_values really is that good, you could be fine with a similarly sized cluster as far as the raw bytes are concerned. I would want more information about the overhead of having that many individual documents, though.
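If you do go the doc_values route, it is enabled per field in the index mapping. Below is a minimal sketch using the plain REST API through `requests`; the index name, type name, and field names are placeholders for your own schema, and note that on Elasticsearch 2.x and later doc_values is already the default for not_analyzed / keyword and numeric fields, so the explicit flag mostly matters on 1.x:

```python
import requests

# Hypothetical index, type, and field names; adjust to your own schema.
# Pre-5.x mapping syntax shown ("string" + "not_analyzed"); on newer
# versions use "keyword" and drop the type level.
mapping = {
    "mappings": {
        "event": {
            "properties": {
                "timestamp": {"type": "date", "doc_values": True},
                "host": {"type": "string", "index": "not_analyzed",
                         "doc_values": True},
                "bytes": {"type": "long", "doc_values": True},
            }
        }
    }
}

resp = requests.put("http://localhost:9200/events-2015.06.01", json=mapping)
print(resp.json())
```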
We have done some Elasticsearch tuning, but there is probably more that could be done.
I would advise starting with a handful of 64 GB machines; you can add more as needed. Throw in a couple of (smaller) client nodes as the front end for index and search requests.
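As a sketch of that topology: the client nodes hold no data and are not master-eligible, and your indexing/search code only talks to them while they fan requests out to the data nodes. The hostnames below are placeholders; the `_bulk` request format itself is stable across versions:

```python
import json
import requests

# All traffic goes through the (smaller) client nodes. In their
# elasticsearch.yml those nodes would have node.master: false and
# node.data: false (pre-5.x syntax; newer versions use node.roles).
CLIENT_NODES = ["http://es-client-1:9200", "http://es-client-2:9200"]

def bulk_index(docs, index):
    """Send a batch of documents through a client node via the _bulk API."""
    lines = []
    for doc in docs:
        # Older versions also require a "_type" in the action line.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"
    resp = requests.post(
        CLIENT_NODES[0] + "/_bulk",
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
    )
    return resp.json()

print(bulk_index([{"host": "web-1", "bytes": 1234}], "events-2015.06.01"))
```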