How to run a script in PySpark

I am trying to run a script in a PySpark environment, but so far I have not been able to. How can I run a script the way I would with python script.py, but in PySpark? Thanks

+5
3 answers

You can do: ./bin/spark-submit mypythonfile.py

Running Python applications through the pyspark command is not supported as of Spark 2.0.
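For illustration, here is a minimal sketch of what mypythonfile.py might contain. The file contents, the app name, the square helper and the count argument are all assumptions made for this example and are not from the question. Unlike the interactive shell, a submitted script has to create its own SparkSession.

```python
# mypythonfile.py: hypothetical contents, for illustration only
import sys


def square(v):
    """Plain-Python helper that Spark applies to every element."""
    return v * v


def main(n):
    # Imported here so the helper above can be imported without Spark installed.
    from pyspark.sql import SparkSession

    # A submitted script builds its own session; the shell's predefined
    # `spark` and `sc` variables are not available here.
    spark = SparkSession.builder.appName("mypythonfile").getOrCreate()
    try:
        rdd = spark.sparkContext.parallelize(range(n))
        print(rdd.map(square).collect())
    finally:
        spark.stop()


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: spark-submit mypythonfile.py <count>")
    else:
        main(int(sys.argv[1]))
```

With this sketch, the command would be ./bin/spark-submit mypythonfile.py 5, where anything after the file name is passed to the script as sys.argv.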

+10

pyspark 2.0 and later executes the script file named in the PYTHONSTARTUP environment variable, so you can run:

 PYTHONSTARTUP=code.py pyspark 

Compared to the spark-submit answer, this is useful for running initialization code before you start working in the pyspark interactive shell.
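As a rough sketch, a startup file like code.py could define helpers for the interactive session. The preview function is hypothetical. The example also assumes the shell has already created its spark session by the time the startup file runs, so that check is guarded in case it has not.

```python
# code.py: hypothetical startup file, for illustration only


def preview(df, n=5):
    """Print the schema and the first n rows of a DataFrame."""
    df.printSchema()
    df.show(n)


# `spark` is the session the pyspark shell normally predefines; this is
# guarded because it is an assumption that it exists at this point.
if "spark" in globals():
    print("startup helpers loaded; Spark version", spark.version)
else:
    print("startup helpers loaded; no `spark` session found")
```

Once the shell is up, preview(some_dataframe) would then be available at the prompt without any import.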

+4

Just spark-submit mypythonfile.py should be enough.

0

Source: https://habr.com/ru/post/1258174/

