Logstash output to Elasticsearch using document_id: what should I do if I don't have a document_id?

I have several Logstash inputs where I use document_id to remove duplicates. However, most inputs do not have a document_id. The following passes the actual document_id through when it exists, but when it does not exist, the value is taken literally as %{document_id}, which means that most documents are treated as duplicates of each other. This is what my output block looks like:

    output {
      elasticsearch_http {
        host => "127.0.0.1"
        document_id => "%{document_id}"
      }
    }

I thought I could use conditional output. It does not work, and the error is shown below the code.

    output {
      elasticsearch_http {
        host => "127.0.0.1"
        if document_id {
          document_id => "%{document_id}"
        }
      }
    }

    Error: Expected one of #, => at line 101, column 8 (byte 3103) after output { elasticsearch_http { host => "127.0.0.1" if

I tried several if statements and they all fail, so I assume the problem is having a conditional expression of any kind inside this block. Here are the alternatives I tried:

    if document_id <> "" {
    if [document_id] <> "" {
    if [document_id] {
    if "hello" <> "" {
2 answers

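You are close with the conditional idea, but you cannot place a conditional inside a plugin block. A plugin block accepts only setting => value pairs, which is why the parser reports that it expected => after the if. Wrap the whole plugin in the conditional instead: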

    output {
      if [document_id] {
        elasticsearch_http {
          host => "127.0.0.1"
          document_id => "%{document_id}"
        }
      } else {
        elasticsearch_http {
          host => "127.0.0.1"
        }
      }
    }

(But the suggestion in one of the other answers for using the uuid filter is also good.)
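
For readers on later Logstash releases, where (as far as I know) the elasticsearch output has taken the place of elasticsearch_http, the same pattern can be sketched as follows. The hosts setting and the port 9200 are assumptions that do not appear in the question, so adjust them to your setup.

```
output {
  if [document_id] {
    # The event carries its own ID: reuse it so repeats overwrite each other.
    elasticsearch {
      hosts => ["127.0.0.1:9200"]   # assumed address, not from the question
      document_id => "%{document_id}"
    }
  } else {
    # No ID on the event: let Elasticsearch generate one.
    elasticsearch {
      hosts => ["127.0.0.1:9200"]   # assumed address, not from the question
    }
  }
}
```

The conditional still wraps whole plugin blocks, so the restriction described above applies in the same way.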


One way to solve this problem is to make sure document_id is always available. This can be done by adding a uuid filter to the filter section, which creates a document_id field if it is missing.

    filter {
      # Generate an ID only when the event does not already carry one.
      if ![document_id] {
        uuid {
          target => "document_id"
        }
      }
    }

Edited per Magnus Back's suggestion. Thanks!
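
A possible simplification, offered as a sketch rather than a tested configuration: the uuid filter documents an overwrite option that defaults to false, meaning a field that is already present is left alone. Assuming that default holds for your plugin version, the filter can run unconditionally, and the output can always reference document_id.

```
filter {
  uuid {
    target => "document_id"
    # Documented default, spelled out for clarity:
    # an existing document_id is kept as-is.
    overwrite => false
  }
}

output {
  elasticsearch_http {
    host => "127.0.0.1"
    # By this point every event should have a document_id,
    # either its own or a generated UUID.
    document_id => "%{document_id}"
  }
}
```

With this arrangement events that arrive with an ID are still deduplicated, while events without one each get a distinct ID instead of the literal %{document_id}.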


Source: https://habr.com/ru/post/987167/

