PySpark ImportError: cannot import name accumulators

Goal: I am trying to get pyspark (apache-spark) to be interpreted correctly in my PyCharm development environment.

Problem: I am currently getting the following error:

ImportError: cannot import name accumulators 

I followed this blog post to help me through the process: http://renien.imtqy.com/blog/accessing-pyspark-pycharm/

Because my code was hitting the except path, I removed the try:/except: block just to find out what the exact error was.

Before that I got the following error:

 ImportError: No module named py4j.java_gateway 

This was fixed simply by entering '$ sudo pip install py4j' in bash.

Currently my code is as follows:

 import os
 import sys

 # Path for spark source folder
 os.environ['SPARK_HOME'] = "[MY_HOME_DIR]/spark-1.2.0"

 # Append pyspark to Python Path
 sys.path.append("[MY_HOME_DIR]/spark-1.2.0/python/")

 try:
     from pyspark import SparkContext
     print ("Successfully imported Spark Modules")
 except ImportError as e:
     print ("Can not import Spark Modules", e)
     sys.exit(1)

My questions:
1. What is the cause of this error?
2. How can I fix the problem so that I can run pyspark in my PyCharm editor?

NOTE: The current interpreter I'm using in PyCharm is Python 2.7.8 (~/anaconda/bin/python).

Thanks in advance!

Don

+5
11 answers

First, set up your environment variables:

 export SPARK_HOME=/home/.../Spark/spark-2.0.1-bin-hadoop2.7
 export PYTHONPATH=$SPARK_HOME/python/:$SPARK_HOME/python/lib/py4j-0.10.3-src.zip:$PYTHONPATH
 PATH="$PATH:$JAVA_HOME/bin:$SPARK_HOME/bin:$PYTHONPATH"

Make sure you substitute your own version numbers and paths.

Then restart your shell or IDE so the new variables take effect. It is important to verify the installation.
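For example, a quick sanity check after restarting (assuming the variables above are exported for the session that PyCharm or python runs in) is to retry the import that originally failed:

 import sys

 # With PYTHONPATH set as above, both of these imports should succeed.
 from pyspark import SparkContext, accumulators

 # Show which sys.path entries actually point at the Spark install.
 print([p for p in sys.path if "spark" in p.lower()])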

+1

It revolves around the PYTHONPATH variable, which defines the Python module search path.

The reason pyspark works well from the shell is that the pyspark shell script sets up PYTHONPATH itself; if you look inside that script, the PYTHONPATH it uses looks like the one shown below.

PYTHONPATH=/usr/lib/spark/python/lib/py4j-0.8.2.1-src.zip:/usr/lib/spark/python

My environment is the Cloudera Quickstart VM 5.3.
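To reproduce this inside PyCharm without relying on the shell environment, a minimal sketch (assuming the same Cloudera Quickstart VM layout shown above; adjust the paths for your own install) would be:

 import sys

 # Mirror what the pyspark shell script does on the Quickstart VM.
 sys.path.insert(0, "/usr/lib/spark/python")
 sys.path.insert(0, "/usr/lib/spark/python/lib/py4j-0.8.2.1-src.zip")

 from pyspark import SparkContext  # should now resolve without the ImportError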

Hope this helps.

+7

It looks like a circular dependency error.

In [MY_HOME_DIR]/spark-1.2.0/python/pyspark/context.py, delete or comment out the line

from pyspark import accumulators

It is about 6 lines of code from the top of the file.

I have filed an issue about this with the Spark project here:

https://issues.apache.org/jira/browse/SPARK-4974

+4

I came across the same error. I just installed py4j.

 sudo pip install py4j 

There is no need to set anything in bashrc.

+2

I ran into the same problem using CDH 5.3.

In the end, it turned out to be pretty easy to solve. I noticed that the script /usr/lib/spark/bin/pyspark has variables defined for IPython.

I installed Anaconda in /opt/anaconda:

 export PATH=/opt/anaconda/bin:$PATH
 # note that the default port 8888 is already in use, so I used a different port
 export IPYTHON_OPTS="notebook --notebook-dir=/home/cloudera/ipython-notebook --pylab inline --ip=* --port=9999"

then, finally, I ran

 /usr/bin/pyspark 

which now functions properly.

+1

I ran into this problem too. To solve it, I commented out line 28 in ~/spark/spark/python/pyspark/context.py, the file that was causing the error:

 # from pyspark import accumulators
 from pyspark.accumulators import Accumulator

Since the accumulator import seems to be covered by the next line (29), the problem does not arise. Spark is working fine now (after pip install py4j).

+1

In PyCharm, before running the script, make sure you have unzipped the py4j*.zip file and added its location to your script with sys.path.append("spark path*/python/lib").

It worked for me.
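A rough sketch of that idea (the SPARK_HOME fallback below is only a placeholder, and it assumes py4j has already been unzipped into python/lib as described above):

 import os
 import sys

 # Placeholder path: point this at your own Spark installation.
 spark_home = os.environ.get("SPARK_HOME", "/path/to/spark")

 # Add the pyspark sources and the folder containing the unzipped py4j package.
 sys.path.append(os.path.join(spark_home, "python"))
 sys.path.append(os.path.join(spark_home, "python", "lib"))

 from pyspark import SparkContext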

+1
To get rid of ImportError: No module named py4j.java_gateway, you need to add the following lines:

 import os
 import sys

 os.environ['SPARK_HOME'] = "D:\python\spark-1.4.1-bin-hadoop2.4"
 sys.path.append("D:\python\spark-1.4.1-bin-hadoop2.4\python")
 sys.path.append("D:\python\spark-1.4.1-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip")

 try:
     from pyspark import SparkContext
     from pyspark import SparkConf
     print ("success")
 except ImportError as e:
     print ("error importing spark modules", e)
     sys.exit(1)
+1

I was able to find a fix for this on Windows, but am not quite sure of its root cause.

If you open the accumulators.py file, you will see that there is first a header comment, then the module docstring, and then the import statements. Move one or more of the import statements to immediately after the comment block and before the docstring. This worked on my system and I was able to import pyspark without any problems.
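Schematically, the rearranged top of the file looks something like the sketch below; the exact contents of accumulators.py differ by Spark version, so treat this as an illustration rather than the literal source:

 # ... Apache license header comment ...

 import sys  # one of the imports, moved up before the module docstring

 """
 Module docstring / doctest examples that previously sat between the
 header comment and the import statements.
 """

 # ... remaining imports and the rest of the module ...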

0

If you have just upgraded to a new Spark version, make sure the new py4j version is on your path, as each new Spark version comes with a new py4j version.

In my case it is "$SPARK_HOME/python/lib/py4j-0.10.3-src.zip" for Spark 2.0.1 instead of the old "$SPARK_HOME/python/lib/py4j-0.10.1-src.zip" for Spark 2.0.0.
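One way to avoid hardcoding the py4j version at all is to glob for whatever py4j-*-src.zip the current Spark ships with; a sketch, assuming SPARK_HOME is set:

 import glob
 import os
 import sys

 spark_home = os.environ["SPARK_HOME"]

 # Pick up whichever py4j source zip this Spark version bundles.
 py4j_zip = glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*-src.zip"))[0]

 sys.path.append(os.path.join(spark_home, "python"))
 sys.path.append(py4j_zip)

 from pyspark import SparkContext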

0

The only thing that worked for me was to go to Spark's base folder and open the accumulators.py file.

At the beginning of the file there was a malformed multi-line command; delete all of it.

Then you are good to go!

0

Source: https://habr.com/ru/post/1209737/

