I am running a script, boo.py, on AWS EMR using spark-submit (Spark 2.0).
The script completes successfully when I run
python boo.py
However, it fails when I run
spark-submit --verbose --deploy-mode cluster --master yarn boo.py
The output of yarn logs -applicationId ID_number shows:
Traceback (most recent call last):
  File "boo.py", line 17, in <module>
    import boto3
ImportError: No module named boto3
I use python and the boto3 module:
$ which python
/usr/bin/python
$ pip install boto3
Requirement already satisfied (use --upgrade to upgrade): boto3 in /usr/local/lib/python2.7/site-packages
How can I add this path to the library search path so that spark-submit can find the boto3 module?
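To compare the two environments, a small check along these lines could be placed at the top of boo.py, before the failing import. It is only a diagnostic sketch: the helper name describe_environment is hypothetical, and it assumes the driver running in the YARN container may use a different interpreter or search path than /usr/bin/python on the machine where pip was run.

```python
from __future__ import print_function  # keeps print() working on Python 2.7

import importlib
import sys


def describe_environment(module_name="boto3"):
    """Report which interpreter is running, its search path,
    and whether module_name can be imported from it.

    describe_environment is a hypothetical helper for illustration only.
    """
    try:
        importlib.import_module(module_name)
        importable = True
    except ImportError:
        importable = False
    return {
        "executable": sys.executable,   # interpreter actually in use
        "sys_path": list(sys.path),     # directories searched on import
        "importable": importable,
    }


if __name__ == "__main__":
    info = describe_environment()
    print("interpreter:", info["executable"])
    print("boto3 importable:", info["importable"])
    for entry in info["sys_path"]:
        print("  ", entry)
```

Running this once with python boo.py and once through spark-submit in cluster mode, then reading the second output from the YARN logs, should show whether /usr/local/lib/python2.7/site-packages is on the search path in both cases.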