Spark Streaming Job's Memory Keeps Growing

I am running Spark v1.6.1 on a single machine in standalone mode, with 64 GB RAM and 16 cores.

I created five worker instances in order to get five executors, since in standalone mode there cannot be more than one executor per worker node.

Configuration:

SPARK_WORKER_INSTANCES 5

SPARK_WORKER_CORES 1

SPARK_MASTER_OPTS "-Dspark.deploy.defaultCores=5"

All other settings in spark-env.sh are left at their defaults.
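Put together, the configuration above would look roughly like this in conf/spark-env.sh (a sketch: the five-worker/one-core split and the master JVM option come from the question, everything else stays at its default):

```shell
# conf/spark-env.sh -- sketch of the standalone setup described above
export SPARK_WORKER_INSTANCES=5   # five workers on the single machine
export SPARK_WORKER_CORES=1       # one core per worker -> five executors total
export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=5"
```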

I run a Kafka direct-stream job with a batch interval of 1 minute, which reads data from Kafka and, after some aggregation, writes the results to MongoDB.
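The job's shape, against the Spark 1.6 streaming API, could be sketched as follows. The topic name, broker address, Mongo URI, collection names, and the aggregation itself are placeholders, since the question does not give them:

```python
# Sketch of the job described above (Spark 1.6 streaming API).
# Topic, broker, Mongo URI, and the aggregation are assumptions.

def aggregate_counts(records):
    """Pure aggregation step: sum values per key (a stand-in for the
    'some aggregation' mentioned in the question)."""
    counts = {}
    for key, value in records:
        counts[key] = counts.get(key, 0) + value
    return counts

def run_job():
    # Requires pyspark 1.6 and the spark-streaming-kafka assembly on the
    # classpath; shown only to illustrate the wiring.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="kafka-to-mongo")
    ssc = StreamingContext(sc, batchDuration=60)  # 1-minute batches

    stream = KafkaUtils.createDirectStream(
        ssc, ["events"], {"metadata.broker.list": "localhost:9092"})

    def write_partition(records):
        # Hypothetical MongoDB sink: one client per partition (pymongo).
        import pymongo
        client = pymongo.MongoClient("mongodb://localhost:27017")
        coll = client["mydb"]["counts"]
        for key, total in aggregate_counts(records).items():
            coll.update_one({"_id": key}, {"$set": {"total": total}},
                            upsert=True)
        client.close()

    (stream.map(lambda kv: (kv[0], 1))
           .foreachRDD(lambda rdd: rdd.foreachPartition(write_partition)))

    ssc.start()
    ssc.awaitTermination()
```

Opening the Mongo connection inside `foreachPartition` keeps the client on the executor side instead of trying to serialize it from the driver.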

Problems:

When I start the master and the slave, memory usage begins at around 212 MB and keeps growing; after a few hours the 5 worker processes grow from roughly 1 GB to roughly 8 GB (each), and eventually the machine runs out of memory.

I unpersist the RDDs and set spark.cleaner.ttl to 600, but it did not help.
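For reference, the cleanup settings tried above would go into conf/spark-defaults.conf roughly like this (a sketch; note that on Spark 1.6 streaming RDDs are auto-unpersisted by default, and spark.cleaner.ttl is a blunt instrument that can also delete data a long-running job still needs):

```shell
# conf/spark-defaults.conf -- sketch of the cleanup settings tried above
spark.cleaner.ttl           600    # forcibly forget metadata/RDDs older than 10 min
spark.streaming.unpersist   true   # default on 1.6: auto-unpersist old stream RDDs
```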

I am also aware of SPARK-1706 (multiple executors per worker) and tried the related options in spark-env.sh, but as far as I can tell they only work on YARN.



Source: https://habr.com/ru/post/1650255/
