Spark-core 1.6.1 & lift-json 2.6.3 java.lang.NoClassDefFoundError

I have a Spark application with the build.sbt shown below.
It works on my local machine, but when I submit it to EMR running Spark 1.6.1, it fails with the following error:

java.lang.NoClassDefFoundError: net/liftweb/json/JsonAST$JValue 

I build the jar with "sbt package".

build.sbt:

organization := "com.foo"
name := "FooReport"
version := "1.0"
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  "net.liftweb" % "lift-json_2.10" % "2.6.3",
  "joda-time" % "joda-time" % "2.9.4"
)

Do you have any idea what is going on?

1 answer

I found a solution and it works!

The problem was sbt package, which does not include the dependency jars in the output jar. To overcome this I tried sbt-assembly, but when I ran it I got a lot of "deduplicate" errors, because several dependency jars contain files at the same paths (META-INF entries, for example) and sbt-assembly has to be told which copy to keep.

In the end I came across this blog post, which explains everything clearly: http://queirozf.com/entries/creating-scala-fat-jars-for-spark-on-sbt-with-sbt-assembly-plugin

To submit Spark jobs to a Spark cluster (via spark-submit), you need to include all the dependencies (except Spark itself) in the jar; otherwise the classes will not be available to your job at runtime.
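For example, a job that parses JSON with lift-json (a hypothetical snippet, not my actual code) compiles and runs fine locally, but the jar built by sbt package contains only my own classes, so on the cluster the first use of lift-json blows up with exactly the error above:

import net.liftweb.json._

object FooReport {
  def main(args: Array[String]): Unit = {
    // Compiles against lift-json on the local classpath, but lift-json is not
    // inside the jar built by `sbt package`, so on the cluster this line
    // throws java.lang.NoClassDefFoundError: net/liftweb/json/JsonAST$JValue.
    val parsed: JValue = parse("""{"foo": 1}""")
    println(parsed \ "foo")
  }
}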

  • Create an "assembly.sbt" file in the project/ folder.
  • Add this line to it: addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
  • Then paste the assemblyMergeStrategy code below into your build.sbt file.

// When several jars provide a file at the same path, keep the last copy;
// fall back to the default strategy for everything else.
assemblyMergeStrategy in assembly := {
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case PathList("org", "apache", xs @ _*) => MergeStrategy.last
  case PathList("com", "google", xs @ _*) => MergeStrategy.last
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
  case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
  case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
  case "about.html" => MergeStrategy.rename
  case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
  case "META-INF/mailcap" => MergeStrategy.last
  case "META-INF/mimetypes.default" => MergeStrategy.last
  case "plugin.properties" => MergeStrategy.last
  case "log4j.properties" => MergeStrategy.last
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

Then run sbt assembly.
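With the default sbt-assembly settings (an assumption; I did not change the output name), the fat jar lands under target/ with an -assembly suffix, and that is the jar you hand to spark-submit. Something like the following, where the main class com.foo.FooReport and the yarn master are placeholders for your own setup:

sbt assembly
# default output: target/scala-2.10/FooReport-assembly-1.0.jar
spark-submit --class com.foo.FooReport --master yarn \
  target/scala-2.10/FooReport-assembly-1.0.jar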

Now you have a fat jar that contains all of your dependencies. It can run to hundreds of MB, depending on the libraries you pull in. In my case I run on AWS EMR, which already has Spark 1.6.1 installed, so there is no need to ship the Spark classes. To exclude spark-core from the jar, mark it as "provided":

 "org.apache.spark" %% "spark-core" % "1.6.1" % "provided" 

Here is the final build.sbt file:

organization := "com.foo"
name := "FooReport"
version := "1.0"
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1" % "provided",
  "net.liftweb" % "lift-json_2.10" % "2.6.3",
  "joda-time" % "joda-time" % "2.9.4"
)

assemblyMergeStrategy in assembly := {
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case PathList("org", "apache", xs @ _*) => MergeStrategy.last
  case PathList("com", "google", xs @ _*) => MergeStrategy.last
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
  case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
  case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
  case "about.html" => MergeStrategy.rename
  case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
  case "META-INF/mailcap" => MergeStrategy.last
  case "META-INF/mimetypes.default" => MergeStrategy.last
  case "plugin.properties" => MergeStrategy.last
  case "log4j.properties" => MergeStrategy.last
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}