As far as I know, Spark uses memory to cache data and then performs its computations on that in-memory data. But what happens if the data is larger than the available memory? I could read the source code, but I don't know which class handles this scheduling. Could someone explain the principle of how Spark deals with this situation?
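For context, here is a minimal sketch of the kind of caching I mean (assuming the Scala RDD API; the input path and object name are just placeholders for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CacheSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical input path, just for illustration.
    val lines = sc.textFile("/tmp/big-input.txt")

    // rdd.cache() is shorthand for persist(StorageLevel.MEMORY_ONLY):
    // partitions that do not fit in memory are not cached and are
    // recomputed from the lineage when they are needed again.
    val inMemoryOnly = lines.cache()
    println(inMemoryOnly.count())

    // With MEMORY_AND_DISK, partitions that do not fit in memory are
    // spilled to local disk instead of being recomputed.
    val words = lines.flatMap(_.split("\\s+"))
      .persist(StorageLevel.MEMORY_AND_DISK)
    println(words.count())

    spark.stop()
  }
}
```

What I'd like to understand is which class or mechanism inside Spark decides what happens when the cached data exceeds memory.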