I created a simple Spark application using sbt. Here is my code:
import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()
    import spark.implicits._                // brings in the encoders needed by toDS()
    val ds = Seq(1, 2, 3).toDS()            // Dataset[Int]
    ds.map(_ + 1).foreach(x => println(x))  // should print 2, 3, 4
  }
}
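(For completeness: the program never stops the session explicitly. A variant that shuts it down at the end, using the standard SparkSession.stop() API, would look like the sketch below; I have not verified whether this changes the behaviour described later.)

import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()
    try {
      import spark.implicits._
      val ds = Seq(1, 2, 3).toDS()
      ds.map(_ + 1).foreach(x => println(x))
    } finally {
      spark.stop() // releases the SparkContext and its background threads
    }
  }
}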
Below is my build.sbt:
name := """sbt-sample-app""" version := "1.0" scalaVersion := "2.11.7" libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test" libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1"
Now when I try to do sbt run, it gives me the following error:
$ sbt run
[info] Loading global plugins from /home/user/.sbt/0.13/plugins
[info] Loading project definition from /home/user/Projects/sample-app/project
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/)
[info] Running HelloWorld
Using Spark default log4j profile: org/apache/spark/log4j-defaults.properties
17/06/01 10:09:10 INFO SparkContext: Running Spark version 2.1.1
17/06/01 10:09:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/01 10:09:11 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0)
17/06/01 10:09:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/06/01 10:09:11 INFO SecurityManager: Changing view acls to: user
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls to: user
17/06/01 10:09:11 INFO SecurityManager: Changing view acls groups to:
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls groups to:
17/06/01 10:09:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user); groups with view permissions: Set(); users with modify permissions: Set(user); groups with modify permissions: Set()
17/06/01 10:09:12 INFO Utils: Successfully started service 'sparkDriver' on port 39662.
17/06/01 10:09:12 INFO SparkEnv: Registering MapOutputTracker
17/06/01 10:09:12 INFO SparkEnv: Registering BlockManagerMaster
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/06/01 10:09:12 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-c6db1535-6a00-4760-93dc-968722e3d596
17/06/01 10:09:12 INFO MemoryStore: MemoryStore started with capacity 408.9 MB
17/06/01 10:09:13 INFO SparkEnv: Registering OutputCommitCoordinator
17/06/01 10:09:13 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/06/01 10:09:13 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040
17/06/01 10:09:13 INFO Executor: Starting executor ID driver on host localhost
17/06/01 10:09:13 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34488.
17/06/01 10:09:13 INFO NettyBlockTransferService: Server created on 127.0.0.1:34488
17/06/01 10:09:13 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/06/01 10:09:13 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:34488 with 408.9 MB RAM, BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:14 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse'.
[error] (run-main-0) scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
[error]   parent = URLClassLoader with NativeCopyLoader with RawResources(
[error]     urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
[error]     parent = java.net.URLClassLoader@7c4113ce,
[error]     resourceMap = Set(app.class.path, boot.class.path),
[error]     nativeTemp = /tmp/sbt_c2afce
[error]   )
[error]   root = sun.misc.Launcher$AppClassLoader@677327b6
[error]   cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar)
[error] ) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
[error]   urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
[error]   parent = java.net.URLClassLoader@7c4113ce,
[error]   resourceMap = Set(app.class.path, boot.class.path),
[error]   nativeTemp = /tmp/sbt_c2afce
[error] ) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,...openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found.
scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
  parent = URLClassLoader with NativeCopyLoader with RawResources(
    urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
    parent = java.net.URLClassLoader@7c4113ce,
    resourceMap = Set(app.class.path, boot.class.path),
    nativeTemp = /tmp/sbt_c2afce
  )
  root = sun.misc.Launcher$AppClassLoader@677327b6
  cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar)
) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
  urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
  parent = java.net.URLClassLoader@7c4113ce,
  resourceMap = Set(app.class.path, boot.class.path),
  nativeTemp = /tmp/sbt_c2afce
) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,.../jre/lib/charsets.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found.
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:123)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:22)
    at org.apache.spark.sql.catalyst.ScalaReflection$$typecreator42$1.apply(ScalaReflection.scala:614)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:782)
    at org.apache.spark.sql.catalyst.ScalaReflection$.localTypeOf(ScalaReflection.scala:39)
    at org.apache.spark.sql.catalyst.ScalaReflection$.optionOfProductType(ScalaReflection.scala:614)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:51)
    at org.apache.spark.sql.Encoders$.scalaInt(Encoders.scala:281)
    at org.apache.spark.sql.SQLImplicits.newIntEncoder(SQLImplicits.scala:54)
    at HelloWorld$.main(HelloWorld.scala:9)
    at HelloWorld.main(HelloWorld.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
[trace] Stack trace suppressed: run last compile:run for the full output.
17/06/01 10:09:15 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:181)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:178)
    at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:73)
17/06/01 10:09:15 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
17/06/01 10:09:15 ERROR Utils: throw uncaught fatal error in thread SparkListenerBus
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
17/06/01 10:09:15 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040
java.lang.RuntimeException: Nonzero exit code: 1
    at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code: 1
[error] Total time: 7 s, completed 1 Jun, 2017 10:09:15 AM
But when I add fork in run := true to build.sbt, the application works fine.
New build.sbt:
name := """sbt-sample-app""" version := "1.0" scalaVersion := "2.11.7" libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test" libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1" fork in run := true
Here's the output:
$ sbt run
[info] Loading global plugins from /home/user/.sbt/0.13/plugins
[info] Loading project definition from /home/user/Projects/sample-app/project
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/)
[success] Total time: 0 s, completed 1 Jun, 2017 10:15:43 AM
[info] Updating {file:/home/user/Projects/sample-app/}sample-app...
[info] Resolving jline#jline;2.12.1 ...
[info] Done updating.
[warn] Scala version was updated by one of library dependencies:
[warn]  * org.scala-lang:scala-library:(2.11.7, 2.11.0) -> 2.11.8
[warn] To force scalaVersion, add the following:
[warn]  ivyScala := ivyScala.value map { _.copy(overrideScalaVersion = true) }
[warn] Run 'evicted' to see detailed eviction warnings
[info] Compiling 1 Scala source to /home/user/Projects/sample-app/target/scala-2.11/classes...
[info] Running HelloWorld
[error] Using Spark default log4j profile: org/apache/spark/log4j-defaults.properties
[error] 17/06/01 10:16:13 INFO SparkContext: Running Spark version 2.1.1
[error] 17/06/01 10:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[error] 17/06/01 10:16:14 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0)
[error] 17/06/01 10:16:14 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls to: user
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls to: user
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls groups to:
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls groups to:
[error] 17/06/01 10:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user); groups with view permissions: Set(); users with modify permissions: Set(user); groups with modify permissions: Set()
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'sparkDriver' on port 37747.
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering MapOutputTracker
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering BlockManagerMaster
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
[error] 17/06/01 10:16:14 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-edf40c39-a13e-4930-8e9a-64135bfa9770
[error] 17/06/01 10:16:14 INFO MemoryStore: MemoryStore started with capacity 1405.2 MB
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering OutputCommitCoordinator
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'SparkUI' on port 4040.
[error] 17/06/01 10:16:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040
[error] 17/06/01 10:16:15 INFO Executor: Starting executor ID driver on host localhost
[error] 17/06/01 10:16:15 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39113.
[error] 17/06/01 10:16:15 INFO NettyBlockTransferService: Server created on 127.0.0.1:39113
[error] 17/06/01 10:16:15 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:39113 with 1405.2 MB RAM, BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse/'.
[error] 17/06/01 10:16:18 INFO CodeGenerator: Code generated in 395.134683 ms
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 9.077969 ms
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 23.652705 ms
[error] 17/06/01 10:16:19 INFO SparkContext: Starting job: foreach at HelloWorld.scala:10
[error] 17/06/01 10:16:19 INFO DAGScheduler: Got job 0 (foreach at HelloWorld.scala:10) with 1 output partitions
[error] 17/06/01 10:16:19 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at HelloWorld.scala:10)
[error] 17/06/01 10:16:19 INFO DAGScheduler: Parents of final stage: List()
[error] 17/06/01 10:16:19 INFO DAGScheduler: Missing parents: List()
[error] 17/06/01 10:16:19 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10), which has no missing parents
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 1405.2 MB)
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 1405.2 MB)
[error] 17/06/01 10:16:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 127.0.0.1:39113 (size: 3.3 KB, free: 1405.2 MB)
[error] 17/06/01 10:16:20 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996
[error] 17/06/01 10:16:20 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10)
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
[error] 17/06/01 10:16:20 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 6227 bytes)
[error] 17/06/01 10:16:20 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
[info] 2
[info] 3
[info] 4
[error] 17/06/01 10:16:20 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1231 bytes result sent to driver
[error] 17/06/01 10:16:20 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 152 ms on localhost (executor driver) (1/1)
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
[error] 17/06/01 10:16:20 INFO DAGScheduler: ResultStage 0 (foreach at HelloWorld.scala:10) finished in 0.181 s
[error] 17/06/01 10:16:20 INFO DAGScheduler: Job 0 finished: foreach at HelloWorld.scala:10, took 0.596960 s
[error] 17/06/01 10:16:20 INFO SparkContext: Invoking stop() from shutdown hook
[error] 17/06/01 10:16:20 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040
[error] 17/06/01 10:16:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
[error] 17/06/01 10:16:20 INFO MemoryStore: MemoryStore cleared
[error] 17/06/01 10:16:20 INFO BlockManager: BlockManager stopped
[error] 17/06/01 10:16:20 INFO BlockManagerMaster: BlockManagerMaster stopped
[error] 17/06/01 10:16:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
[error] 17/06/01 10:16:20 INFO SparkContext: Successfully stopped SparkContext
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Shutdown hook called
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Deleting directory /tmp/spark-77d00e78-9f76-4ab2-bc40-0b99940661ac
[success] Total time: 37 s, completed 1 Jun, 2017 10:16:20 AM
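A side note on the output above: the [warn] block says a library dependency bumped scala-library from my declared 2.11.7 to 2.11.8. Following the suggestion printed in the warning itself, the version can be pinned in build.sbt:

ivyScala := ivyScala.value map { _.copy(overrideScalaVersion = true) }

That warning looks unrelated to the ClasspathFilter error, but I mention it for completeness.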
Can someone help me understand the reason for this?