I have a PySpark streaming application that runs on YARN in a Hadoop cluster. Every n seconds, the streaming application reads from a Kafka queue and makes a REST call.
I also have a logging service that provides an easy way to collect and store data, send it to Logstash, and visualize it in Kibana. The data must match a template (JSON with specific keys) provided by this service.
I want to send logs from the streaming application to Logstash using this service. To do this, I need to do three things:

- Collect some data while the streaming app is reading from Kafka and making the REST call.
- Format it according to the logging service template.
- Forward the log to the Logstash host.
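To make the question concrete, here is a rough sketch of what I have in mind for the last two steps. The template keys, host, and port are hypothetical placeholders, since I don't know the exact template yet; the send function assumes a Logstash TCP input with the `json_lines` codec:

```python
import json
import socket
from datetime import datetime, timezone

def build_log_record(app_name, event, metrics):
    """Build a JSON-serializable record; keys here are hypothetical
    stand-ins for the real logging-service template."""
    return {
        "application": app_name,
        "event": event,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "metrics": metrics,
    }

def send_to_logstash(record, host, port):
    """Forward one record as a JSON line over TCP (assumes a Logstash
    tcp input with codec => json_lines)."""
    payload = (json.dumps(record) + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)

# Would be called from inside the batch-processing function, after the REST call:
record = build_log_record(
    "kafka-rest-streamer",                                   # hypothetical app name
    "rest_call",
    {"batch_size": 128, "rest_status": 200, "duration_ms": 342},
)
# send_to_logstash(record, "logstash.example.com", 5000)     # hypothetical host/port
```

Is this the right general approach, or should the collection happen differently inside the streaming batches?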
Any guidance related to this would be very helpful.
Thanks!