Using Flume to receive real-time log data from a remote server (which does not have Flume) on the same network

I have a server X with Hadoop and Flume installed, and a server Y that has neither of them but is on the same network.

Server Y currently stores data in a log file, which is continuously written to until the end of the day, when a date stamp is appended and a new log file is started.

The goal is to stream the logs from server Y to server X using Flume, process the data, and put it into HDFS.
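To make this concrete, here is a rough sketch of the kind of agent I picture running entirely on server X (the agent name, port, and HDFS path are placeholders I made up; the source type depends on which of the options below turns out to be feasible):

    # Sketch of a Flume 1.x agent running only on server X (all names are placeholders)
    agentX.sources = syslog-src
    agentX.channels = mem-ch
    agentX.sinks = hdfs-sink

    # Syslog TCP source: would receive events forwarded from server Y
    agentX.sources.syslog-src.type = syslogtcp
    agentX.sources.syslog-src.host = 0.0.0.0
    agentX.sources.syslog-src.port = 5140
    agentX.sources.syslog-src.channels = mem-ch

    # Simple in-memory channel (a file channel would be more durable)
    agentX.channels.mem-ch.type = memory
    agentX.channels.mem-ch.capacity = 10000

    # HDFS sink writing events into date-partitioned directories
    agentX.sinks.hdfs-sink.type = hdfs
    agentX.sinks.hdfs-sink.channel = mem-ch
    agentX.sinks.hdfs-sink.hdfs.path = hdfs://namenode:8020/logs/serverY/%Y-%m-%d
    agentX.sinks.hdfs-sink.hdfs.fileType = DataStream
    agentX.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true

I would then start it on server X with something like "flume-ng agent --conf conf --conf-file agentX.conf --name agentX".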

I believe the best way would be for the syslog daemon on server Y to forward these events over TCP, but there are a lot of hoops to jump through in the organization just to find out whether that can be done. Another option would be to (option 2:) somehow tail the log file in a directory on server Y, or (option 3:) mount that directory onto server X and treat it as a spooling directory. The problem with option 2 is that Flume is not installed on server Y, and installing it there is out of the question. The problem with options 2 and 3 is that the incoming data can sit idle and data may be lost during the transition at the end of each day. In addition, authentication is tied to logging into server Y with a separate username and password, and we obviously cannot hard-code that information into a connection configuration.
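For option 3, my understanding is that the only change on server X would be swapping the source for a spooling directory source pointed at the mounted directory (the mount path is a placeholder), roughly:

    # Option 3 sketch: Y's log directory mounted on X, e.g. at /mnt/serverY/logs
    agentX.sources = spool-src
    agentX.sources.spool-src.type = spooldir
    agentX.sources.spool-src.spoolDir = /mnt/serverY/logs
    agentX.sources.spool-src.channels = mem-ch

As far as I know, the spooling directory source expects files to be complete and immutable once it picks them up, which is exactly why I am worried about the file that is still being written to during the day.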

My main question is: do I need to install Flume on the source server for this setup to work? Can the Flume agent run only on server X? Which option is ideal?
