How to report JMX from Spark Streaming on EC2 to VisualVM?

I am trying to get a Spark Streaming job running on an EC2 instance to report to VisualVM using JMX.

At the moment, I have the following configuration file:

spark/conf/metrics.properties:

*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

And I start the Spark Streaming job as follows (I added the -D options later, hoping to get remote access to JMX on the EC2 instance):

terminal:

spark/bin/spark-submit --class my.class.StarterApp --master local --deploy-mode client \
  project-1.0-SNAPSHOT.jar \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=54321 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false
2 answers

The spark-submit command line has two problems:

  • --master local: you should not run Spark Streaming with the local master URL, because it provides a single thread, and a job with a receiver needs at least two: one for the receiver and one to process the received data. Use local[n] with n > 1 instead. In the logs you should see the following WARN:

WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data, otherwise Spark jobs will not get resources to process the received data.

  • -D options: the JVM does not pick them up, because they are given after the application jar and so become command-line arguments of your Spark Streaming application. Put them in front of project-1.0-SNAPSHOT.jar, for example through --driver-java-options "...", since spark-submit does not accept bare -D flags, and start over (you have to fix the problem above first!).
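The two fixes above can be sketched as a single command. This is a minimal sketch, not a tested setup: it assumes local[2] is enough for one receiver and that --driver-java-options is an acceptable way to hand the JMX flags to the driver JVM in client mode. The class name, jar, and port are taken from the question. The script only prints the assembled command so it can be reviewed first.

```shell
#!/bin/sh
# JMX flags for the driver JVM; port 54321 comes from the question.
JMX_OPTS="-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=54321 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false"

# Assemble the command as positional parameters so the quoted
# option string stays a single argument.
# Assumption: local[2] gives one thread to the receiver and one to processing.
set -- spark/bin/spark-submit \
  --class my.class.StarterApp \
  --master "local[2]" \
  --deploy-mode client \
  --driver-java-options "$JMX_OPTS" \
  project-1.0-SNAPSHOT.jar

# Print one argument per line for review.
# To launch the job, replace this line with: "$@"
printf '%s\n' "$@"
```

Everything placed after project-1.0-SNAPSHOT.jar is passed to the application's main method, so only application arguments belong there.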
spark-submit --conf "spark.driver.extraJavaOptions=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8090 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" /path/example/src/main/python/pi.py 10000

Notes: the option format is --conf "key=value". Tested with Spark 2.+.


Source: https://habr.com/ru/post/1209454/

