Sudden increase in scheduling delay for a Spark Streaming job without changes to other parameters

I have a Spark Streaming job running in production with a batch interval of 1 second. I am using CDH 5.5 with Spark 1.5. We use the Kafka direct stream (createDirectStream). Backpressure is enabled. We do not want to use dynamic allocation, so the job runs with a fixed number of executors.
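
For readers who want to reproduce a comparable setup, here is a minimal sketch of the settings described above, written as a small Python helper that renders spark-submit arguments. The property names are standard Spark configuration keys. The executor count of 4 and the names STREAMING_CONF and to_submit_args are hypothetical, since the question does not give the actual values.

```python
# Assumed settings matching the setup described in the question.
# The executor count is a hypothetical value, not taken from the question.
STREAMING_CONF = {
    "spark.streaming.backpressure.enabled": "true",  # backpressure on
    "spark.dynamicAllocation.enabled": "false",      # no dynamic allocation
    "spark.executor.instances": "4",                 # hypothetical fixed executor count
}


def to_submit_args(conf):
    """Render a settings dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key in sorted(conf):
        args.extend(["--conf", f"{key}={conf[key]}"])
    return args


print(" ".join(to_submit_args(STREAMING_CONF)))
```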

In the image below, I see a sudden increase in scheduling delay starting at 13:50. At the same time, I do not see any increase in processing time.

  • What could be the reason for an increase in scheduling delay when the processing time stays the same?
  • Do other job loads in the cluster affect the current streaming job? In my opinion they should not, because the executors for the streaming job are pre-allocated and already running.

Any thoughts?

enter image description here

1 answer

This is indeed a strange problem, but let's address your second point: do other job loads in the cluster affect the current streaming job? Yes. The CPU share is affected if another process starts running on the same cluster nodes, and that can lead to contention, which you will see as waiting. Are you by any chance running Spark in a container? It is also hard to fully understand your problem, since I don't know how your cluster is set up.
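
To make the relationship between batch interval, processing time and scheduling delay concrete, here is a rough sketch of a single-queue model. It assumes batches are processed one at a time in submission order and ignores everything else the scheduler does, so it is an illustration and not a description of Spark internals. The function name scheduling_delays and the numbers are hypothetical.

```python
def scheduling_delays(batch_interval, processing_times):
    """Return the scheduling delay of each batch in a single-queue model.

    Batch i is submitted at i * batch_interval and can only start once the
    previous batch has finished, so its scheduling delay is the time it
    spends waiting in the queue.
    """
    delays = []
    previous_finish = 0.0
    for i, processing_time in enumerate(processing_times):
        submitted = i * batch_interval
        started = max(submitted, previous_finish)
        delays.append(started - submitted)
        previous_finish = started + processing_time
    return delays


# Hypothetical numbers: 1 s batches, two slow 3 s batches (for example
# under CPU contention), then processing time equal to the batch interval.
times = [0.5, 0.5, 3.0, 3.0, 1.0, 1.0, 1.0]
print(scheduling_delays(1.0, times))
```

In this model the delay only builds up while a batch takes longer than the batch interval, and it does not drain unless later batches finish faster than the interval. If the processing-time chart is flat and well below the interval, the delay most likely comes from something this model leaves out.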

Source: https://habr.com/ru/post/1659447/