Spark MLlib 0.9.1 org.jblas.DoubleMatrix errors

I am using Spark 0.9.1 with MLlib 0.9.1 in DSE.

When I run the following code as a stand-alone application:

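    import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

    // 1000 identical labeled points: label 0.0, three features each
    val parsedData = sc.parallelize((1 to 1000).map { _ => LabeledPoint(0.0, Array(0.0, 0.4, 0.3)) })
    val numIterations = 2
    val model = LinearRegressionWithSGD.train(parsedData, numIterations)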

I get this error:

    14/09/20 14:28:37 ERROR OneForOneStrategy: org.jblas.DoubleMatrix cannot be cast to org.jblas.DoubleMatrix
    java.lang.ClassCastException: org.jblas.DoubleMatrix cannot be cast to org.jblas.DoubleMatrix
        at org.apache.spark.mllib.optimization.GradientDescent$$anonfun$runMiniBatchSGD$1$$anonfun$2.apply(GradientDescent.scala:150)
        at org.apache.spark.mllib.optimization.GradientDescent$$anonfun$runMiniBatchSGD$1$$anonfun$2.apply(GradientDescent.scala:150)
        at org.apache.spark.rdd.RDD$$anonfun$6.apply(RDD.scala:677)
        at org.apache.spark.rdd.RDD$$anonfun$6.apply(RDD.scala:674)
        at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:846)
        at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:601)

This only happens when I run the code as a stand-alone application. It works in the Spark shell (dse spark). Any ideas?

Update:

When I create the object in the REPL, getClassLoader returns:

    scala> new org.jblas.DoubleMatrix().getClass().getClassLoader()
    res3: ClassLoader = ModuleClassLoader:Analytics

But when I run it in stand-alone mode (with spark class), it returns:

 new org.jblas.DoubleMatrix().getClass().getClassLoader(): class= SystemClassLoader 

Perhaps this is a clue.
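To follow up on this clue, it can help to print, for a given class, the chain of class loaders that defined it and the jar or directory it came from, and then compare the output between the shell and the stand-alone run. The following is only a minimal diagnostic sketch: `LoaderInfo`, `loaderChain`, `codeSource` and `describe` are hypothetical helper names, not part of Spark, jblas or DSE.

```scala
object LoaderInfo {
  // Walk from the class's defining loader up through its parents.
  // A null loader means the JVM bootstrap loader, so the chain is empty.
  def loaderChain(cls: Class[_]): List[ClassLoader] =
    Iterator.iterate(cls.getClassLoader)(_.getParent).takeWhile(_ != null).toList

  // Jar or directory the class was loaded from, when the JVM exposes it.
  def codeSource(cls: Class[_]): Option[String] =
    for {
      domain   <- Option(cls.getProtectionDomain)
      source   <- Option(domain.getCodeSource)
      location <- Option(source.getLocation)
    } yield location.toString

  // One-line summary suitable for logging.
  def describe(cls: Class[_]): String = {
    val chain   = loaderChain(cls).map(_.toString)
    val loaders = if (chain.isEmpty) "<bootstrap>" else chain.mkString(" -> ")
    val source  = codeSource(cls).getOrElse("<unknown>")
    cls.getName + ": loaders = " + loaders + "; source = " + source
  }

  def main(args: Array[String]): Unit = {
    // java.lang.String is used here only so the sketch runs without extra jars.
    println(describe(classOf[String]))
  }
}
```

In the application itself, the call would be `LoaderInfo.describe(classOf[org.jblas.DoubleMatrix])`. If the two environments report different loaders or different jar locations for the same class name, that is consistent with the class being loaded twice.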

I use SBT to build a jar and submit it using spark class. Here is the configuration:

    name := "analytics"

    version := "1.0"

    scalaVersion := "2.10.3"

    unmanagedJars in Compile ++= Attributed.blankSeq((file("./dse/lib/") * "*.jar").get)

    unmanagedJars in Compile ++= Attributed.blankSeq((file("./dse/resources/spark/lib/") * "*.jar").get)

    unmanagedJars in Compile ++= Attributed.blankSeq((file("./dse/resources/cassandra/lib/") * "*.jar").get)

    unmanagedJars in Runtime ++= Attributed.blankSeq((file("./dse/resources/hadoop/") * "*.jar").get)

    unmanagedJars in Runtime ++= Attributed.blankSeq((file("./dse/resources/hadoop/lib/") * "*.jar").get)

    unmanagedJars in Compile ++= Attributed.blankSeq((file("./dse/resources/driver/lib/") * "*.jar").get)

Update 2: I used the configuration from the DSE demos to build and deploy with ant, but I still get the same error.

1 answer

This is definitely a class loading problem. Specifically, I believe you are hitting a bug that was fixed in 1.0.

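You cannot cast an object whose class was loaded by one class loader to a class of the same name loaded by a different class loader. The JVM treats them as two distinct types, which is why the message says org.jblas.DoubleMatrix cannot be cast to org.jblas.DoubleMatrix.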

There is a small chance that you can work around it by manually changing the context class loader. This requires that you can get a reference to the appropriate class loader, which may or may not be possible in your case. Something like:

    Thread.currentThread().setContextClassLoader(...)
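A slightly fuller sketch of that idea is shown here. It swaps the context class loader only for the duration of a block and restores the previous one afterwards, so other code on the same thread is not affected. `ContextLoader` and `withContextClassLoader` are hypothetical names, and using the loader that defined the application's own classes is an assumption; whether that is the right loader under DSE is not something this sketch can confirm.

```scala
object ContextLoader {
  // Run `body` with `loader` as the current thread's context class loader,
  // restoring the previous loader even if `body` throws.
  def withContextClassLoader[T](loader: ClassLoader)(body: => T): T = {
    val thread   = Thread.currentThread()
    val previous = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body
    finally thread.setContextClassLoader(previous)
  }
}

object ContextLoaderExample {
  def main(args: Array[String]): Unit = {
    // Assumption: the loader that defined this application's classes
    // is the one the library code should resolve classes through.
    val target = ContextLoaderExample.getClass.getClassLoader
    val seen = ContextLoader.withContextClassLoader(target) {
      // The code that triggers the error (for example the train call) would go here.
      Thread.currentThread().getContextClassLoader
    }
    println(seen eq target)
  }
}
```

Note that `setContextClassLoader` only affects the thread it is called on, so code that runs on other threads will not see the change.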

But since I don't know anything about DSE, I'll have to refer you to this article: http://www.datastax.com/dev/blog/classloading-in-dse-analytics


Source: https://habr.com/ru/post/1203033/

