MapReduce in Cassandra

I am working on a simple MapReduce program that reads data from a Cassandra column family, but I have run into the error below. Any tips on how to proceed are much appreciated. Thanks in advance!

Cassandra version: 1.0.3
Hadoop version: 0.20.2
HADOOP_CLASSPATH has: apache-cassandra-1.0.3.jar, libthrift-0.6.jar, commons-lang-2.4.jar and guava-10.0.1.jar
What works: the Hadoop MR word count example; reads from the Cassandra column family using cassandra-cli, Thrift and Hector

Error:

 11/12/01 20:05:23 INFO mapred.JobClient: Running job: job_201112010835_0009
 11/12/01 20:05:24 INFO mapred.JobClient: map 0% reduce 0%
 11/12/01 20:05:33 INFO mapred.JobClient: Task Id : attempt_201112010835_0009_m_000000_0, Status : FAILED
 Error: java.lang.ClassNotFoundException: com.google.common.collect.AbstractIterator
     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
     at java.security.AccessController.doPrivileged(Native Method)
     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
     at java.lang.ClassLoader.defineClass1(Native Method)
     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
     at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
     at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
     at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
     at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
     at java.security.AccessController.doPrivileged(Native Method)
     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:158)
     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
     at org.apache.hadoop.mapred.Child.main(Child.java:170)
2 answers

Are you adding the Cassandra libraries to the classpath on all of your task trackers? From the wiki page http://wiki.apache.org/cassandra/HadoopSupport :

 One configuration note on getting the task trackers to be able to perform queries over Cassandra: you'll want to update your HADOOP_CLASSPATH in your <hadoop>/conf/hadoop-env.sh to include the Cassandra lib libraries. For example you'll want to do something like this in the hadoop-env.sh on each of your task trackers:

     export HADOOP_CLASSPATH=/opt/cassandra/lib/*:$HADOOP_CLASSPATH

The path in this example should of course be replaced with the actual path to the Cassandra libraries on your system.
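
To confirm whether a given JVM can actually see the jars, a small check like the one below may help. It is only a sketch: the class name ClasspathCheck and the method locate are made up for illustration, and the result is only meaningful if the program is started with the same classpath that the task JVMs on the task trackers use. It tries to load a class without initializing it and reports which jar it came from.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Hadoop or Cassandra): reports where a class is loaded from.
public class ClasspathCheck {

    // Returns the jar/directory a class was loaded from, or "NOT FOUND" if it cannot be loaded.
    static String locate(String className) {
        try {
            // initialize=false: only resolve the class, do not run its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JVM itself have no code source
            return src == null ? "JVM runtime (no code source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        // The two classes named in the stack trace of the question
        String[] names = {
            "com.google.common.collect.AbstractIterator",
            "org.apache.cassandra.hadoop.ColumnFamilyRecordReader"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the Guava class comes back as NOT FOUND under the task tracker's classpath while the Cassandra class is found, that would be consistent with the ClassNotFoundException in the question.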


1) You must make sure that the versions of all the Cassandra jars you use with Hadoop (all the jars found in the lib directory of the Cassandra binary installation) are compatible with your version of Hadoop.

2) The Cassandra jars may also have dependencies of their own, and you must have the correct versions of those dependency jars.

3) I ran into the same problem because of exactly this kind of jar mismatch. I resolved it by removing all Cassandra jars from the Hadoop classpath and then adding all the jars from the latest Cassandra version.
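
A rough way to spot such a mismatch is to look for the same jar appearing on the classpath in more than one version. The following is a sketch only: the class JarVersionConflicts is hypothetical, and it assumes jar files follow the usual name-version.jar pattern (as guava-10.0.1.jar and commons-lang-2.4.jar do). Entries that do not match that pattern, such as unexpanded wildcards or directories, are simply skipped.

```java
import java.io.File;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists jars that occur on a classpath in more than one version.
public class JarVersionConflicts {

    // Assumes names like "artifact-1.2.3.jar"; the version must start with a digit.
    private static final Pattern JAR = Pattern.compile("^(.+?)-(\\d[\\w.-]*)\\.jar$");

    // Maps artifact name -> set of versions, keeping only artifacts with 2+ versions.
    static Map<String, Set<String>> conflicts(String classpath) {
        Map<String, Set<String>> versions = new TreeMap<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            Matcher m = JAR.matcher(new File(entry).getName());
            if (m.matches()) {
                versions.computeIfAbsent(m.group(1), k -> new TreeSet<>()).add(m.group(2));
            }
        }
        // Drop artifacts that appear in a single version only
        versions.values().removeIf(v -> v.size() < 2);
        return versions;
    }

    public static void main(String[] args) {
        // Inspects the classpath of the JVM this program runs in
        System.out.println(conflicts(System.getProperty("java.class.path")));
    }
}
```

An empty result does not prove the jars are compatible with each other or with Hadoop; it only rules out the simplest case of duplicate versions on one classpath.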


Source: https://habr.com/ru/post/1384200/

