Apache Spark or Cascading?

I am confused about when to use the Cascading framework and when to use Apache Spark. What are the appropriate use cases for each?

Any help is appreciated.

1 answer

At its core, Cascading is a higher-level API on top of execution engines like MapReduce. In this sense, it is similar to Apache Crunch. Cascading has several related projects, such as a Scala API (Scalding) and PMML model evaluation (Pattern).
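To make this concrete, here is a minimal word-count sketch in Scalding, the Scala API over Cascading mentioned above. The job class name and the `input`/`output` argument keys are illustrative choices, not anything from the original answer:

```scala
import com.twitter.scalding._

// Hypothetical word-count job: Scalding compiles this pipeline down to
// Cascading flows, which in turn run on an engine such as MapReduce.
class WordCountJob(args: Args) extends Job(args) {
  TextLine(args("input"))                                        // read lines from the input path
    .flatMap('line -> 'word) { line: String =>
      line.toLowerCase.split("\\s+").filter(_.nonEmpty)          // tokenize each line into words
    }
    .groupBy('word) { _.size }                                   // count occurrences per word
    .write(Tsv(args("output")))                                  // write (word, count) pairs as TSV
}
```

The point to notice is that Scalding/Cascading describes *what* the pipeline does; the actual execution is delegated to an underlying engine.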

Apache Spark is similar in that it also provides a high-level API for data pipelines, one that is available in Java and Scala.

However, Spark is more of an execution engine itself than a layer on top of one. It has a number of related projects, such as MLlib, Spark Streaming, and GraphX, for machine learning, stream processing, and graph computation.
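For comparison, the same word count expressed against Spark's RDD API. Here Spark is both the API and the execution engine; the `local[*]` master, object name, and argument positions are assumptions for the sake of a self-contained sketch:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical word-count application on Spark's core RDD API.
// Unlike Cascading, no separate engine is needed: Spark executes
// the pipeline itself (here on local threads via "local[*]").
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordcount").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    val counts = sc.textFile(args(0))                    // read lines from the input path
      .flatMap(_.toLowerCase.split("\\s+"))              // tokenize each line into words
      .filter(_.nonEmpty)
      .map(word => (word, 1))                            // pair each word with a count of 1
      .reduceByKey(_ + _)                                // sum counts per word

    counts.saveAsTextFile(args(1))                       // write (word, count) pairs
    sc.stop()
  }
}
```

The two sketches look alike at the API level, which is exactly the source of the confusion in the question; the difference is in what runs the pipeline underneath.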

All in all, I find Spark much more interesting today, but they are not exactly the same kind of thing.

Source: https://habr.com/ru/post/973643/
