I recently set up Airflow to run my tasks. I have a master node and 2 workers that execute my tasks. I want to monitor my cluster with Graphite and Grafana. So far I have installed both Graphite and Grafana on the master node and tested them with a simple bash command. Now I want to monitor my cluster while Airflow runs the tasks. I created a metrics.properties file and placed it in spark/conf:
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=192.168.2.241
*.sink.graphite.port=2003
*.sink.graphite.period=10
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
And I added the following flags to my spark-submit:
--files=/path/to/metrics.properties \
--conf spark.metrics.conf=metrics.properties
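For reference, the full command I run looks roughly like this (the master URL and job file are placeholders, not my exact values):

```shell
# Sketch of the full submit command. --files ships metrics.properties
# into each executor's working directory, which is why the relative
# file name in spark.metrics.conf can resolve there.
spark-submit \
  --master spark://192.168.2.241:7077 \
  --files=/path/to/metrics.properties \
  --conf spark.metrics.conf=metrics.properties \
  my_job.py
```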
In the Graphite UI, however, all I see is Graphite -> carbon -> agents -> cluster1-a, which is Carbon's own internal data; nothing from Spark or Airflow shows up at all.
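To rule out Carbon itself, I checked ingestion with a small script that speaks the same plaintext protocol GraphiteSink uses: one `path value timestamp` line per metric over TCP port 2003. The metric name below is just a test name I made up:

```python
import socket
import time


def format_metric(path, value, timestamp):
    """Render one datapoint in Carbon's plaintext protocol:
    '<path> <value> <timestamp>\n'."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, path, value):
    """Open a TCP connection to Carbon and push a single datapoint."""
    line = format_metric(path, value, time.time())
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Same host/port as in metrics.properties above:
    # send_metric("192.168.2.241", 2003, "test.manual.value", 42)
    print(format_metric("test.manual.value", 42, time.time()))
```

Hand-sent metrics like this do appear in the Graphite tree, so Carbon is receiving data fine; only the Spark side is silent.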
Should I be using grafana-spark-dashboards? It appears to target Spark on YARN, whereas my jobs run through Airflow.
What should Carbon's storage-schemas.conf contain? Mine currently only has Graphite's default entry:
[carbon]
pattern = ^carbon\.
retentions = 60:90d
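I assume I need at least one more stanza so that non-Carbon metrics get stored at all; the section name and retention values below are my guesses, not from any docs, though the 10s resolution is meant to match period=10 in metrics.properties:

```
[spark]
pattern = .*
retentions = 10s:1d,1m:90d
```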
In short: how do I get the Spark metrics to show up in Graphite?