Logstash + Kibana terms panel without splitting words

I have a Java application that writes to a log file in JSON format. The fields that go into the logs vary. Logstash reads this log file and sends the events on to be displayed in Kibana.

I configured Logstash with the following file:

    input {
      file {
        path => ["[log_path]"]
        codec => "json"
      }
    }
    filter {
      json {
        source => "message"
      }
      date {
        match => [ "data", "dd-MM-yyyy HH:mm:ss.SSS" ]
        timezone => "America/Sao_Paulo"
      }
    }
    output {
      elasticsearch_http {
        flush_size => 1
        host => "[host]"
        index => "application-%{+YYYY.MM.dd}"
      }
    }

I managed to display everything correctly in Kibana without defining any mapping. But when I try to create a terms panel to show the number of servers that sent these messages, I run into a problem. My JSON has a field called server that holds the server name (for example: a1-name-server1), but the terms panel splits the server name at each "-". I would also like to count the number of times each error message appears, but the same problem occurs: the terms panel splits the error message at the spaces.

I am using Kibana 3 and Logstash 1.4. I searched a lot on the Internet and did not find a solution. I also tried using the .raw field from Logstash, but that didn't work.

How can I do this?

Thanks for the help.

+5
2 answers

Your problem is that your data is tokenized. This is useful for searching your data: ES (by default) splits your message field into separate tokens so that each one can be searched. For example, you may want to search for the word ERROR in your logs and get results for messages such as "There was an error in your cluster" or "Error". If the field is not analyzed with tokenizers, you will not be able to search like this.
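To make the effect concrete, here is a rough Python sketch of what a terms count sees for an analyzed field versus an untouched one. The regular expression is only an approximation of the default analyzer (lowercase, split on non-word characters), not Elasticsearch's actual implementation, and the sample server names are made up for illustration.

```python
import re
from collections import Counter


def analyzed_terms(value):
    # Approximation of an analyzed string field: lowercase the value and
    # split it on anything that is not a letter, digit, or underscore.
    return re.findall(r"\w+", value.lower())


def terms_count(values, analyzed):
    # Count terms the way a terms panel would: per token when the field is
    # analyzed, per whole value when it is not.
    counts = Counter()
    for value in values:
        counts.update(analyzed_terms(value) if analyzed else [value])
    return counts


# Hypothetical sample data modelled on the server name in the question.
servers = ["a1-name-server1", "a1-name-server1", "a1-name-server2"]

print(terms_count(servers, analyzed=True))   # counts fragments such as "a1" and "name"
print(terms_count(servers, analyzed=False))  # counts whole server names
```

With the analyzed variant the panel counts fragments of the name, which is the behaviour described in the question; with the untouched variant each server name is counted as a single term.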

This analyzed behavior is useful when you want to search for things, but it prevents you from grouping messages that have the same content. That is your case. The solution is to update your mapping and set not_analyzed on the particular field that you don't want split into tokens. This will probably work for your server field, but it will probably break full-text search on that field.

What I usually do in such situations is use index templates and multi-fields. An index template lets me define a mapping for every index whose name matches a pattern, and multi-fields let me have both the analyzed and the not_analyzed behavior for a single field.

The following request should solve your problem:

    curl -XPUT https://example.org/_template/name_of_index_template -d '
    {
      "template": "indexname*",
      "mappings": {
        "type": {
          "properties": {
            "field_name": {
              "type": "multi_field",
              "fields": {
                "field_name": { "type": "string", "index": "analyzed" },
                "untouched": { "type": "string", "index": "not_analyzed" }
              }
            }
          }
        }
      }
    }'

Then, in the terms panel, you can use field_name.untouched to count distinct values based on the entire content of the field.
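If you need the same template for several fields, the request body can be generated rather than typed by hand. The following is a minimal sketch, not part of the original answer: the helper name multi_field_template is hypothetical, the pattern "application-*" is taken from the index name in the question, and the type name "logs" is an assumption about what Logstash writes, so check it against your own index.

```python
import json


def multi_field_template(index_pattern, doc_type, field_name):
    # Hypothetical helper: builds the index template body shown above for
    # one field, with an analyzed default sub-field and an "untouched"
    # not_analyzed sub-field.
    return {
        "template": index_pattern,
        "mappings": {
            doc_type: {
                "properties": {
                    field_name: {
                        "type": "multi_field",
                        "fields": {
                            field_name: {"type": "string", "index": "analyzed"},
                            "untouched": {"type": "string", "index": "not_analyzed"},
                        },
                    }
                }
            }
        },
    }


# "logs" is an assumed type name; "server" is the field from the question.
template = multi_field_template("application-*", "logs", "server")
body = json.dumps(template, indent=2)
print(body)  # paste this as the -d payload of the curl command
```

Serializing with json.dumps guarantees that the braces in the payload are balanced, which is easy to get wrong when editing the nested JSON by hand.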

If you don't want to use index templates (perhaps all your data is in a single index), setting the mapping with the Put Mapping API will also do the job. And if you use multi-fields, there is no need to reindex the existing data for the change to be accepted: from the moment the new mapping is set on the index, new data is stored in both sub-fields (field_name and field_name.untouched). If you simply change the mapping from analyzed to not_analyzed, you will not see any change until you reindex all your data.

+4

Since you did not define a mapping in Elasticsearch, the default settings apply to every field of your type in your index. The default for string fields (for example, your server field) is to analyze the field, which means that Elasticsearch tokenizes its contents. That is why it is splitting your server names into parts.

You can overcome this problem by specifying a mapping. You do not need to define all of your fields, only those that you do not want Elasticsearch to analyze. In your particular case, sending the following PUT request will do the trick:

    http://[host]:9200/[index_name]/_mapping/[type]
    {
      "type" : {
        "properties" : {
          "server" : { "type" : "string", "index" : "not_analyzed" }
        }
      }
    }

You cannot do this on an existing index, because switching a field from analyzed to not_analyzed is a breaking change to the mapping.
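Since the question involves two fields (the server name and the error message), the mapping body can be built for both at once. This is only a sketch under assumptions: the helper not_analyzed_mapping is hypothetical, "logs" is an assumed type name, and "error_message" is a placeholder because the question does not give the actual name of the error field.

```python
import json


def not_analyzed_mapping(doc_type, field_names):
    # Hypothetical helper: marks every listed string field as not_analyzed
    # so that its whole value is indexed as a single term.
    properties = {
        name: {"type": "string", "index": "not_analyzed"} for name in field_names
    }
    return {doc_type: {"properties": properties}}


# "logs" and "error_message" are placeholders, not names from the question.
mapping = not_analyzed_mapping("logs", ["server", "error_message"])
print(json.dumps(mapping, indent=2))  # body for the PUT request to a new index
```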

+1

Source: https://habr.com/ru/post/1206903/

