Spark Shell Cannot Find HBase Class

I am trying to load data from HDFS into an HBase table using Spark Streaming. I put the data into HDFS files over time and read them using the textFileStream function. Since the spark shell does not have the HBase jars on its classpath, it gives me an error even when I just import the HBase classes in the spark shell.
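Roughly, this is the kind of job I am trying to run. It is only a minimal sketch: the HDFS input directory, the column family "cf", the qualifier "value", and the row-key choice are placeholders, not my real code; only the table path is the real one.

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.SparkContext._
import org.apache.spark.streaming.{Seconds, StreamingContext}

// 10-second batches on top of the spark shell's existing SparkContext (sc)
val ssc = new StreamingContext(sc, Seconds(10))

// Old-API (mapred) TableOutputFormat pointed at the target table
val jobConf = new JobConf(HBaseConfiguration.create())
jobConf.setOutputFormat(classOf[TableOutputFormat])
jobConf.set(TableOutputFormat.OUTPUT_TABLE, "/app/dev/MarketingIt/hbasetables/spark_test")

// Watch an HDFS directory for new text files; write each line as one row
val lines = ssc.textFileStream("hdfs:///user/me/incoming")
lines.foreachRDD { rdd =>
  rdd.map { line =>
    val put = new Put(Bytes.toBytes(line)) // row key = whole line (placeholder choice)
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("value"), Bytes.toBytes(line))
    (new ImmutableBytesWritable, put)
  }.saveAsHadoopDataset(jobConf)
}

ssc.start()
ssc.awaitTermination()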

The import fails in the spark shell:

scala> import org.apache.hadoop.hbase.mapred.TableOutputFormat
<console>:10: error: object hbase is not a member of package org.apache.hadoop
       import org.apache.hadoop.hbase.mapred.TableOutputFormat

But if I add the HBase jar to the classpath while starting the spark shell, the imports succeed. However, it is still unable to find certain classes further down the line.

bin/spark-shell --jars /hbase/hbase-0.94.13/hbase-0.94.13-mapr-1401.jar

scala> import org.apache.hadoop.hbase.{ HBaseConfiguration, HColumnDescriptor, HTableDescriptor }
import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor}

scala> import org.apache.hadoop.hbase.client.{ HBaseAdmin, Put }
import org.apache.hadoop.hbase.client.{HBaseAdmin, Put}

scala> import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.io.ImmutableBytesWritable

scala> import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.mapred.TableOutputFormat

scala> import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

scala> import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.util.Bytes

scala> import org.apache.hadoop.mapred.JobConf
import org.apache.hadoop.mapred.JobConf

scala> import org.apache.spark.SparkContext
import org.apache.spark.SparkContext

scala> import org.apache.spark.rdd.{ PairRDDFunctions, RDD }
import org.apache.spark.rdd.{PairRDDFunctions, RDD}

scala> import org.apache.spark.streaming._
import org.apache.spark.streaming._

scala> import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.StreamingContext._

scala> import org.apache.hadoop.hbase.client.mapr.{BaseTableMappingRules}
import org.apache.hadoop.hbase.client.mapr.BaseTableMappingRules

scala> val conf = HBaseConfiguration.create()
conf: org.apache.hadoop.conf.Configuration = Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, hbase-default.xml, hbase-site.xml

scala> val hbaseTableName = "/app/dev/MarketingIt/hbasetables/spark_test"
hbaseTableName: String = /app/dev/MarketingIt/hbasetables/spark_test

scala> val admin = new HBaseAdmin(conf)
java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: Error occurred while instantiating com.mapr.fs.MapRTableMappingRules. ==> org/apache/hadoop/hbase/client/mapr/BaseTableMappingRules.
  at org.apache.hadoop.hbase.client.HBaseAdmin.commonInit(HBaseAdmin.java:356)
  at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:156)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
  at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
  at $iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
  at $iwC$$iwC$$iwC.<init>(<console>:48)
  at $iwC$$iwC.<init>(<console>:50)
  at $iwC.<init>(<console>:52)
  at <init>(<console>:54)
  at .<init>(<console>:58)
  at .<clinit>(<console>)
  at .<init>(<console>:7)
  at .<clinit>(<console>)
  at $print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788)
  at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056)
  at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
  at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
  at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
  at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
  at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:601)
  at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:608)
  at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:611)
  at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:936)
  at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
  at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
  at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
  at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
  at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
  at org.apache.spark.repl.Main$.main(Main.scala:31)
  at org.apache.spark.repl.Main.main(Main.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: java.lang.RuntimeException: Error occurred while instantiating com.mapr.fs.MapRTableMappingRules. ==> org/apache/hadoop/hbase/client/mapr/BaseTableMappingRules.
  at org.apache.hadoop.hbase.client.mapr.TableMappingRulesFactory.create(TableMappingRulesFactory.java:65)
  at org.apache.hadoop.hbase.client.HBaseAdmin.commonInit(HBaseAdmin.java:348)
  ... 47 more
Caused by: java.lang.RuntimeException: Error occurred while instantiating com.mapr.fs.MapRTableMappingRules. ==> org/apache/hadoop/hbase/client/mapr/BaseTableMappingRules.
  at org.apache.hadoop.hbase.client.mapr.GenericHFactory.getImplementorInstance(GenericHFactory.java:40)
  at org.apache.hadoop.hbase.client.mapr.TableMappingRulesFactory.create(TableMappingRulesFactory.java:47)
  ... 48 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/client/mapr/BaseTableMappingRules
  at java.lang.ClassLoader.defineClass1(Native Method)
  at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
  at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
  at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:190)
  at org.apache.hadoop.hbase.client.mapr.GenericHFactory.getImplementorInstance(GenericHFactory.java:30)
  ... 49 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.client.mapr.BaseTableMappingRules
  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
  ... 65 more

As you can see, I have added all the HBase jars, and Spark finds some of the HBase classes but cannot find others, even though all of the classes are in the same jar I added. Since it says "Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.client.mapr.BaseTableMappingRules", I imported that class specifically, but I still get the same error.
2 answers

If you are using Spark 1.x, try setting the executor extra classpath property in your Spark configuration.

Add this line to spark-defaults.conf -

spark.executor.extraClassPath /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar

If you are using a different distribution, find the appropriate path for the jar files.
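For example, one quick way to locate the HBase jars on an arbitrary install (an illustrative command only; narrow the search root if you know where HBase is installed):

find / -name "hbase-client*.jar" 2>/dev/null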

In addition to changing the configuration, pass the driver class path when starting the spark shell or when submitting your Spark job, like this -

--driver-class-path /opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar
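Put together, a spark shell launch on the Cloudera parcel layout above might look like this (one line; adjust the paths for your distribution):

bin/spark-shell --driver-class-path /opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar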

You can add the jar files to Spark's classpath in spark-env.sh to avoid specifying the full path every time you launch the spark shell or submit a Spark job, but I ran into other problems with that approach; I found the options above worked better for me.

export SPARK_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar

Nothing more is needed for Spark 1.x.

If you are using Spark 0.9, see this link (I have not tested Spark 0.9 myself, and the link may break, but the blog contains useful information): http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html


Add this line to conf/spark-env.sh:

Replace ${HBASE_HOME} with the full path.

export SPARK_CLASSPATH=${HBASE_HOME}/lib/*
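For example, with an HBase install like the one in the question (the exact directory is a placeholder; use your own HBase home), the line might be:

export SPARK_CLASSPATH=/hbase/hbase-0.94.13/lib/*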

