I am trying to run a Spark word count example that reads its data from Kafka.
The source code is here. However, I found that the Cloudera distribution of Spark differs slightly from the Apache incubator release. I have no problem launching the Spark shell and running word count examples in it. However, the bin folder has no "run-example" script, which is what the example's source code tells you to use:
* Example:
* `./bin/run-example org.apache.spark.streaming.examples.JavaKafkaWordCount local[2] zoo01,zoo02,
* zoo03 my-consumer-group topic1,topic2 1`
I am new to jar files, but as I understand it, to run a Java program from the command line you compile the code, package it together with all its dependencies into a jar file, and then run that jar as a whole. I think this is what the run-example script does.
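To make my understanding concrete, this is roughly what I imagine doing by hand. It is only a sketch: the jar directory and the jar name are placeholders I made up (I do not know where the CDH package puts the Spark and Kafka jars), and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only. SPARK_JARS_DIR and APP_JAR are hypothetical placeholders,
# not real locations from the CDH Spark package.
SPARK_JARS_DIR="/path/to/spark/jars"   # assumed directory holding the Spark and Kafka dependency jars
APP_JAR="kafka-wordcount.jar"          # assumed name for the jar I would build
MAIN_CLASS="org.apache.spark.streaming.examples.JavaKafkaWordCount"

# 1. Compile the example against the dependency jars.
COMPILE_CMD="mkdir -p classes && javac -cp '$SPARK_JARS_DIR/*' -d classes JavaKafkaWordCount.java"

# 2. Package the compiled classes into a jar.
PACKAGE_CMD="jar cf $APP_JAR -C classes ."

# 3. Run the main class with the jar and its dependencies on the classpath,
#    passing the same arguments as the example comment.
RUN_CMD="java -cp '$APP_JAR:$SPARK_JARS_DIR/*' $MAIN_CLASS 'local[2]' zoo01,zoo02,zoo03 my-consumer-group topic1,topic2 1"

# Dry run: print the three commands instead of executing them.
printf '%s\n' "$COMPILE_CMD" "$PACKAGE_CMD" "$RUN_CMD"
```

I have not verified that a plain `java -cp` launch is enough for Spark Streaming on this distribution, so please correct me if run-example does something more than this.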
Can someone tell me how I can run the JavaKafkaWordCount.java example without the run-example script?
A similar question is here, but I do not want to run the Java code in the Spark shell every time.
Many thanks.
Hadoop: I have a Cloudera Hadoop distribution (CDH 4.6.0-1.cdh4.6.0.p0.26), managed by Cloudera Manager.
Spark: I downloaded the package (SPARK 0.9.0-1.cdh4.6.0.p0.50), then distributed and activated it as a service.
Kafka: kafka-0.8.0. I downloaded the source and built it from source.