When
This occurs the first time an operation that requires a shuffle is evaluated, that is, when an action triggers it. It cannot be disabled.
Why
This is an optimization. Shuffling is one of the most expensive operations in Spark, so the shuffle output is kept rather than recomputed.
How can this be reused in further calculations?
It is reused automatically by any subsequent action performed on the same RDD: the stages upstream of the shuffle are skipped instead of being run again.
user6022341