At its core, Cascading is a higher-level API on top of execution engines like MapReduce. In this sense, it is analogous to Apache Crunch. Cascading has several related projects, such as a Scala version (Scalding) and PMML scoring (Pattern).
Apache Spark is similar in that it exposes a high-level API for data pipelines, one that is available in Java and Scala.
Spark is more of an execution engine in its own right than a layer on top of one. It has a number of related projects, such as MLlib, Streaming, and GraphX, for machine learning, stream processing, and graph computation respectively.
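To make "high-level API for data pipelines" a bit more concrete, here is a rough sketch of a word count written as a chain of flatMap, map, and reduce-by-key style steps. It is plain Python and does not use Spark or Cascading; `word_count` and the sample lines are hypothetical stand-ins, meant only to show the shape of pipeline that both tools let you describe at a higher level than raw MapReduce jobs.

```python
from collections import Counter
from itertools import chain


def word_count(lines):
    """Count words with pipeline-style steps (illustrative only, not a Spark API)."""
    # "flatMap"-like step: split each line into words and flatten into one stream.
    words = chain.from_iterable(line.lower().split() for line in lines)
    # "map" + "reduce by key"-like step: treat each word as (word, 1) and sum per word.
    counts = Counter()
    for word in words:
        counts[word] += 1
    return dict(counts)


# Hypothetical sample input, assumed for this sketch.
lines = ["spark and cascading", "spark and crunch"]
result = word_count(lines)
print(result)
```

In a real pipeline framework the same steps would be declared against a distributed dataset, and the engine would decide how to run them across a cluster.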
All in all, I find Spark much more interesting today, but the two are not exactly for the same thing.