What are the concepts of application, job, stage, and task in Spark?

As far as I understand:

  • Application: one spark-submit.

  • Job: created after lazy evaluation, once an action is called.

  • Stage: related to shuffles and to the type of transformation. It's hard for me to see where a stage's boundary lies.

  • Task: a single unit of operation. One transformation per task; one task per transformation.

I would appreciate help improving this understanding.

1 answer

The main function, i.e. the driver program you launch with spark-submit, is the application.
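For example, here is a minimal sketch (the class name, app name and launch command below are placeholders, not anything from the question): one spark-submit launches one application, and everything else lives inside it.

```scala
// Hypothetical application skeleton; names and paths are made up.
// Launched with something like:  spark-submit --class WordCountApp wordcount.jar
import org.apache.spark.sql.SparkSession

object WordCountApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("WordCountApp").getOrCreate()
    val sc = spark.sparkContext
    // every job, stage and task triggered below belongs to this one application
    // ... RDD / DataFrame work goes here ...
    spark.stop()
  }
}
```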

When you invoke an action on an RDD, a "job" is created. A job is a piece of work submitted to Spark.
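A small sketch (assuming a SparkContext `sc` as in the skeleton above): the transformations by themselves do nothing, and only the action at the end submits a job.

```scala
val nums    = sc.parallelize(1 to 1000)   // no job yet: parallelize just builds an RDD
val squares = nums.map(n => n * n)        // still no job: map is a lazy transformation
val total   = squares.reduce(_ + _)       // reduce is an action -> a job is submitted here
println(total)                            // prints the sum of squares
```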

Jobs are divided into "stages" based on shuffle boundaries. That is what determines where one stage ends and the next begins.
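For instance, a sketch (the input path is made up) where reduceByKey forces a shuffle, so the single job submitted by the action is split into two stages at that boundary:

```scala
val words = sc.textFile("hdfs:///data/words.txt")   // hypothetical input path
  .flatMap(line => line.split("\\s+"))
  .map(word => (word, 1))              // narrow transformations: pipelined into the same stage
val counts = words.reduceByKey(_ + _)  // shuffle boundary: the next stage starts here
println(counts.toDebugString)          // the lineage shows the ShuffledRDD marking that boundary
counts.collect()                       // the action: submits one job with two stages
```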

Each stage is further divided into tasks based on the number of partitions in the RDD. So tasks are the smallest units of work in Spark.
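One last sketch: the number of partitions you create the RDD with is the number of tasks in the stage that processes it.

```scala
val rdd = sc.parallelize(1 to 1000, 8)   // explicitly ask for 8 partitions
println(rdd.getNumPartitions)            // 8
rdd.count()                              // this job's stage runs 8 tasks, one per partition
```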

