Can Spark code run on a cluster without spark-submit?

I would like to develop a Scala application that connects to the master and runs the Spark part of the code itself. I would like to achieve this without using spark-submit. Is it possible? In particular, I would like to know whether the following code can run from my machine and connect to the cluster:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf()
  .setAppName("Meisam")
  .setMaster("yarn-client")

val sc = new SparkContext(conf)

val sqlContext = new SQLContext(sc)
val df = sqlContext.sql("SELECT * FROM myTable")

...
+4
3 answers

Add this setting to your conf:

val conf = new SparkConf()
  .setAppName("Meisam")
  .setMaster("yarn-client")
  .set("spark.driver.host", "127.0.0.1")
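For completeness, a minimal self-contained sketch of how the question's snippet and this setting fit together. The object name is made up, and it assumes the cluster's Hadoop/YARN configuration is visible to the JVM (HADOOP_CONF_DIR/YARN_CONF_DIR) and that myTable exists. Note that spark.driver.host must be an address of your machine that the cluster nodes can actually reach, so 127.0.0.1 only makes sense when everything runs on one host:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object MeisamApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("Meisam")
      .setMaster("yarn-client")
      // Address of this machine as seen from the cluster nodes
      .set("spark.driver.host", "127.0.0.1")

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // The query runs on the cluster; the driver stays in this JVM (client deploy mode)
    val df = sqlContext.sql("SELECT * FROM myTable")
    df.show()

    sc.stop()
  }
}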

+6

Yes, it’s possible, and what you have there is basically all that is needed to have the tasks executed on a YARN cluster in client deploy mode (where the driver runs on the machine the application is started from).

What spark-submit does for you is set up the SparkConf, e.g. the master URL, outside your code. If you leave such settings to spark-submit instead of hard-coding them, the same Spark application can run unchanged on any of the supported cluster managers - YARN, Mesos, Spark Standalone - or in local mode.
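To illustrate that portability point, a sketch (the object name is hypothetical) where the application does not call setMaster at all and the master URL is supplied from outside, either by spark-submit or as a JVM system property:

import org.apache.spark.{SparkConf, SparkContext}

object PortableApp {
  def main(args: Array[String]): Unit = {
    // No setMaster here: the master URL comes from spark-submit --master,
    // or from -Dspark.master=... when launching the JVM directly.
    val conf = new SparkConf().setAppName("PortableApp")
    val sc = new SparkContext(conf)

    println(sc.parallelize(1 to 100).sum())

    sc.stop()
  }
}

The same jar can then be launched with spark-submit --master yarn --deploy-mode client, or with java -Dspark.master=local[*] for a local run, without touching the code.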

+3

A word of caution, though: even when you can create a SparkContext like this, many things will not work, namely anything that requires shipping your compiled code to the executors, most notably UDFs (user-defined functions, AKA closures/lambdas). See https://issues.apache.org/jira/browse/SPARK-18075 for details. The workaround is to package your application into a jar and register that jar with the SparkContext so that the executors can load your classes; this is the typical pitfall when running Spark directly from an IDE such as Eclipse against a cluster.
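A minimal sketch of that workaround, assuming the application has already been packaged (e.g. with sbt package); the jar path and names below are hypothetical. Registering the jar via setJars (or sc.addJar) is what lets the executors load the classes behind your UDFs:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object UdfFromIdeApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("UdfFromIdeApp")
      .setMaster("yarn-client")
      // Ship this application's compiled classes to the executors; without
      // this, UDFs defined here fail with ClassNotFoundException on the cluster.
      .setJars(Seq("target/scala-2.11/myapp_2.11-0.1.jar"))

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // This UDF only works on the cluster if the executors can load its class.
    sqlContext.udf.register("plusOne", (x: Long) => x + 1)

    val df = sqlContext.range(0, 5).toDF("x")
    df.registerTempTable("nums")
    sqlContext.sql("SELECT plusOne(x) FROM nums").show()

    sc.stop()
  }
}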

0
