Best option to put Nginx logs in Kafka?

We are dealing with large log files from several servers that we load into HDFS. We currently have a good batch solution (basically moving and writing files every day), and we want to implement a real-time solution with Kafka.

Basically, we need to put the logs from Nginx into Kafka, and then write a consumer that records them to HDFS (this can be done with the HDFS consumer https://github.com/kafka-dev/kafka/tree/master/contrib/hadoop-consumer )

Which approach would you recommend for moving the logs to Kafka?

  • We could write an nginx module, but that is not so simple. This https://github.com/DemandCube/Sparkngin may give some clues.
  • Reading the log files (tail ...) seems like a bad idea, since it involves a useless write operation. Logstash would also require a write/read round trip before reaching Kafka, which seems unnecessary.

Any other ideas?

+6
3 answers

I know this is an old question, but I recently had to do the same thing.

The problem with a tail -f producer is log rotation. Also, when tail dies, you don't really know which lines were already sent to Kafka.
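To illustrate why this is awkward, here is a minimal Python sketch of what a file-based shipper has to track on its own: a checkpoint (inode plus byte offset) so that a restart knows where it stopped, and an inode comparison to notice rotation. The function names, the checkpoint file format, and the `send` callback are all hypothetical; `send` stands in for whatever Kafka producer call you use. It is not a production tailer: on rotation it simply starts the new file from the beginning, so lines left unread at the end of the old file are lost, and delivery is at-least-once rather than exactly-once.

```python
import json
import os
from typing import Callable


def _save_checkpoint(state_path: str, inode: int, offset: int) -> None:
    # Write to a temp file, then rename, so a crash never leaves a
    # half-written checkpoint behind.
    tmp = state_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"inode": inode, "offset": offset}, f)
    os.replace(tmp, state_path)


def ship_new_lines(log_path: str, state_path: str,
                   send: Callable[[str], None]) -> int:
    """Send complete lines appended since the last checkpoint.

    `send` is a hypothetical callback (e.g. a wrapper around a Kafka
    producer). Returns the number of lines sent in this call.
    """
    try:
        with open(state_path) as f:
            state = json.load(f)
    except FileNotFoundError:
        state = {"inode": None, "offset": 0}

    sent = 0
    with open(log_path, "rb") as log:
        info = os.fstat(log.fileno())
        # A different inode means the file was rotated; a smaller size
        # means it was truncated. In both cases start again from byte 0.
        same_file = state["inode"] == info.st_ino
        offset = state["offset"] if same_file and state["offset"] <= info.st_size else 0

        log.seek(offset)
        for raw in log:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; pick it up next call
            send(raw.decode("utf-8", errors="replace").rstrip("\n"))
            offset += len(raw)
            sent += 1
            # Checkpoint after every line: slow, but a restart then
            # re-sends at most the one line that was in flight.
            _save_checkpoint(state_path, info.st_ino, offset)
    return sent
```

Calling `ship_new_lines("/var/log/nginx/access.log", "/var/tmp/ship.state", producer_send)` in a loop (paths and `producer_send` are placeholders) would approximate a restartable tail. Plain `tail -f` keeps none of this state.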

Starting with nginx 1.7.1, the access_log directive can write to syslog. See http://nginx.org/en/docs/syslog.html . We use this to log to rsyslog, and from rsyslog to Kafka: http://www.rsyslog.com/doc/master/configuration/modules/omkafka.html

It is a slightly roundabout way of doing it, but this way there is less chance of missing logs. In addition, if you use CentOS, rsyslog is standard anyway.

In short, here is the setup I consider the best option for writing nginx logs to Kafka:

nginx → rsyslog → kafka
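As a rough sketch of that pipeline (not the answerer's actual configuration), the two config fragments below show one way the pieces could fit together. The listener address 127.0.0.1:514, the tag nginx, the broker localhost:9092, the topic name nginx_access, and the template name are all assumptions for illustration; adjust them to your environment.

```
# nginx.conf fragment (assumed values): send the access log to a local syslog listener
http {
    access_log syslog:server=127.0.0.1:514,tag=nginx,severity=info combined;
}
```

```
# rsyslog.conf fragment (assumed values): receive over UDP and forward to Kafka
module(load="imudp")
input(type="imudp" port="514")

module(load="omkafka")

# Forward only the message body, without the syslog header
template(name="nginx_raw" type="string" string="%msg%")

if $programname == "nginx" then {
    action(type="omkafka"
           broker=["localhost:9092"]
           topic="nginx_access"
           template="nginx_raw")
}
```

The omkafka module may need to be installed separately from the base rsyslog package, depending on your distribution.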

+4

Have you tried using Logstash for this purpose? It works great on our systems.

Logstash is a component that you just need to unpack and configure with a few parameters, such as the Kafka endpoint and port, the topic and format you want to send, and which file to read from.
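A minimal sketch of such a Logstash pipeline might look like the following. The log path, broker address, and topic name are assumptions, and option names for the kafka output have changed between plugin versions, so check the documentation for the version you run.

```
# logstash.conf sketch (assumed path, broker, and topic)
input {
  file {
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  }
}

output {
  kafka {
    bootstrap_servers => "localhost:9092"
    topic_id => "nginx_access"
    # Send the raw log line rather than the full Logstash event
    codec => plain { format => "%{message}" }
  }
}
```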

+3

Try Linux pipes with tail + kafkacat.

Read messages from stdin and produce them to the 'syslog' topic with snappy compression:

 tail -F /var/log/syslog | kafkacat -b mybroker -t syslog -z snappy 

Be careful to use -F instead of -f: -F can handle log rotation. With -F, tail keeps trying to open the file even if it is or becomes inaccessible, which is useful when following by name.

+1

Source: https://habr.com/ru/post/974229/

