findspark.init() IndexError: list index out of range

When running the following in a Python 3.5 Jupyter environment, I get the error below. Any ideas what causes this?

 import findspark
 findspark.init()

Error:

 IndexError                                Traceback (most recent call last)
 <ipython-input-20-2ad2c7679ebc> in <module>()
       1 import findspark
 ----> 2 findspark.init()
       3
       4 import pyspark

 /.../anaconda/envs/pyspark/lib/python3.5/site-packages/findspark.py in init(spark_home, python_path, edit_rc, edit_profile)
     132     # add pyspark to sys.path
     133     spark_python = os.path.join(spark_home, 'python')
 --> 134     py4j = glob(os.path.join(spark_python, 'lib', 'py4j-*.zip'))[0]
     135     sys.path[:0] = [spark_python, py4j]
     136

 IndexError: list index out of range
3 answers

This is most likely because the SPARK_HOME environment variable is not set correctly on your system. Alternatively, you can simply specify the path when you initialize findspark, for example:

 import findspark
 findspark.init('/path/to/spark/home')

After that, everything should work!
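If you prefer to fix the environment variable itself rather than hard-code the path in every notebook, here is a minimal sketch (the path below is a placeholder, not your actual installation):

 import os
 import findspark

 # Placeholder path; point this at your real Spark installation
 os.environ["SPARK_HOME"] = "/path/to/spark/home"
 findspark.init()  # with no argument, init() falls back to SPARK_HOME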


I was getting the same error and was able to get it working by entering the exact installation directory:

 import findspark

 # Use the exact directory Spark was extracted to; a raw string avoids
 # backslash-escape problems in Windows paths
 findspark.init(r"C:\Users\PolestarEmployee\spark-1.6.3-bin-hadoop2.6")

 # Test
 from pyspark import SparkContext, SparkConf

Basically, this is the directory Spark was extracted to. In the future, wherever you see spark_home, enter the same installation directory. I also tried using Toree to create a kernel instead, but it somehow doesn't work. A kernel would be a cleaner solution.
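If you're unsure whether a directory is a valid Spark home, you can check for the py4j zip that findspark looks for; this is the same glob pattern shown in the traceback above, and the path here is just the example from this answer:

 import os
 from glob import glob

 # Example path from this answer; substitute your own extraction directory
 spark_home = r"C:\Users\PolestarEmployee\spark-1.6.3-bin-hadoop2.6"

 # findspark.init() indexes the first match of this glob, so an empty
 # list here reproduces the IndexError
 print(glob(os.path.join(spark_home, 'python', 'lib', 'py4j-*.zip')))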


You need to update the SPARK_HOME variable in your .bash_profile. For me, the following command worked (in the terminal):

export SPARK_HOME="/usr/local/Cellar/apache-spark/2.2.0/libexec/"

After that, you can use the following commands:

 import findspark
 findspark.init('/usr/local/Cellar/apache-spark/2.2.0/libexec')
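As a quick sanity check that the new variable is actually visible to Jupyter (an illustrative check, not part of the original answer):

 import os

 # Should print the path exported in .bash_profile; None means the
 # notebook server was started before the variable was set, so restart
 # the terminal and Jupyter
 print(os.environ.get('SPARK_HOME'))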

Source: https://habr.com/ru/post/1264198/

