Using the Spark kernel in Jupyter

So I'm just starting out with Jupyter and the idea of notebooks.

I usually program in Vim and the terminal, so I'm still figuring things out.

I am trying to use the Toree kernel.

I am trying to set up a kernel that can run Spark, namely Toree. I installed Toree, and it shows up when I list the kernels. Here is the result:

 $ jupyter kernelspec list
 Available kernels:
   python3    C:\Users\UserName\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\resources
   bash       C:\Users\UserName\AppData\Roaming\jupyter\kernels\bash
   toree      C:\ProgramData\jupyter\kernels\toree


However, when I open a notebook with the Toree kernel, the kernel dies and does not restart. Closing the notebook and reopening it replaces the kernel with Python 3.

A large error message is printed to the host terminal, and the notebook pops up an error message as well; they are the same error.

I followed this page for the installation: https://github.com/apache/incubator-toree

The instructions appear to be aimed mainly at Linux/Mac.

Any thoughts on how to get a Spark notebook working in Jupyter?

I realize this isn't much information; let me know if more is required.

+5
source share
2 answers

I posted a similar question to the Toree Gitter channel, and they answered (to paraphrase):

Toree is the future of Spark programming on Jupyter. It seems to install correctly on a Windows machine, but the .jar and .sh files will not run correctly on Windows.

Knowing this, I tried it on my Linux (Fedora) machine and on a borrowed Mac. Once Jupyter was installed (via Anaconda), I entered the following commands:

 $ SparkHome="~/spark/spark1.5.5-bin.hadoop2.6"
 $ sudo pip install toree
 Password: **********
 $ sudo jupyter toree install --spark_home=$SparkHome

Jupyter launched the notebook on both machines. I believe a VM would work too. I want to see whether the Windows 10 Bash shell will work with this, since I'm running Windows 7.

Thanks for the other answers!

+3
source

The answer from @user3025281 solved the problem for me. I had to do the following setup for my environment (Ubuntu 16.04 Linux distribution, running Spark 2.2.0 and Hadoop 2.7). The downloads are direct file downloads from the project sites or a mirror.

Basically, you configure the environment variables and then invoke jupyter, assuming it was installed via Anaconda. That's pretty much it:

 SPARK_HOME="~/spark/spark-2.2.0-bin-hadoop2.7" 

Write this to your ~/.bashrc file, and then source .bashrc:

 # reload environment variables
 source ~/.bashrc
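One gotcha worth flagging (my observation, not part of the original answer): inside double quotes, bash does not perform tilde expansion, so the SPARK_HOME value above can reach Spark as a literal ~/... path that nothing resolves. A minimal sketch of the difference:

```shell
# Inside double quotes, "~" is NOT expanded by bash, so SPARK_HOME
# ends up holding a literal tilde.
SPARK_HOME="~/spark/spark-2.2.0-bin-hadoop2.7"
echo "$SPARK_HOME"    # prints: ~/spark/spark-2.2.0-bin-hadoop2.7

# Using $HOME (or an unquoted leading tilde) expands to the real path.
SPARK_HOME="$HOME/spark/spark-2.2.0-bin-hadoop2.7"
echo "$SPARK_HOME"    # prints e.g.: /home/youruser/spark/spark-2.2.0-bin-hadoop2.7
```

If the Toree kernel dies on startup even though everything installed cleanly, a literal tilde in SPARK_HOME is one cheap thing to rule out.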

Install:

 sudo pip install toree
 sudo jupyter toree install --spark_home=$SPARK_HOME
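To confirm the install actually registered the kernel, you can also inspect the kernelspec by hand: a kernelspec is just a directory containing a kernel.json file. This is a minimal sketch; the helper name is mine, and the exact install path depends on how `jupyter toree install` was run:

```python
import json
import os

def read_kernelspec(kernel_dir):
    """Load the kernel.json that defines a Jupyter kernelspec.

    A kernelspec is a directory (for Toree, typically somewhere like
    .../jupyter/kernels/toree -- the exact location varies by platform
    and install flags) containing a kernel.json with the launch command
    ("argv") and a human-readable "display_name".
    """
    with open(os.path.join(kernel_dir, "kernel.json")) as f:
        spec = json.load(f)
    # Every valid kernelspec defines at least these two keys.
    assert "argv" in spec and "display_name" in spec
    return spec
```

Printing `read_kernelspec(...)["display_name"]` for the Toree directory shown by `jupyter kernelspec list` should give something like "Apache Toree - Scala"; if kernel.json is missing or malformed, the kernel will never start, which matches the "kernel dies and does not restart" symptom.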

And now... we're gucci.

Addendum: On Windows 10, you can use "Bash on Ubuntu on Windows" to set up Jupyter in a Linux environment.

0
source

Source: https://habr.com/ru/post/1246012/

