Psutil in Apache Spark

I am using PySpark 1.5.2. After issuing a `.collect()` command, I got this `UserWarning`: "Please install psutil to have better support with spilling".

Why was this warning shown?

How do I install psutil?

+5
2 answers
```
pip install psutil
```

If you need to install it specifically for Python 2 or Python 3, use pip2 or pip3 respectively; psutil works with both major versions. Here is the PyPI package for psutil.
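A quick way to confirm that psutil landed in the interpreter PySpark actually uses is to import it there. This is a hedged sketch; the printed messages are illustrative, not PySpark's own output:

```python
# Sketch: verify psutil is importable by the current interpreter
# (run this with the same python that drives your PySpark job).
try:
    import psutil
    # Report total physical memory in MB (>> 20 divides bytes by 1024*1024),
    # similar to the memory accounting Spark uses psutil for.
    print("psutil OK, total RAM (MB):", psutil.virtual_memory().total >> 20)
except ImportError:
    print("psutil missing: run 'pip install psutil' for this interpreter")
```

If the import fails here, it will also fail inside Spark's `shuffle.py`, and you will keep seeing the warning.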

+9

You can clone or download the psutil project from the following link: https://github.com/giampaolo/psutil.git

then run `python setup.py install` to install psutil.

In `spark/python/pyspark/shuffle.py` you can see the following code:

```python
def get_used_memory():
    """ Return the used memory in MB """
    if platform.system() == 'Linux':
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1]) >> 10
    else:
        warnings.warn("Please install psutil to have better "
                      "support with spilling")
        if platform.system() == "Darwin":
            import resource
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            return rss >> 20
        # TODO: support windows

    return 0
```

Therefore, I assume your OS is not Linux, which is why installing psutil is suggested.
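The Linux branch above can be illustrated in isolation. This hedged sketch reproduces the VmRSS parsing from `shuffle.py` against a fabricated `/proc/self/status` excerpt (`sample_status` is made up for illustration):

```python
# Hedged sketch: how shuffle.py's Linux branch reads resident memory.
# sample_status mimics a /proc/self/status excerpt; values are fabricated.
sample_status = (
    "Name:\tpython\n"
    "VmPeak:\t  204800 kB\n"
    "VmRSS:\t  102400 kB\n"
)

def used_memory_mb(status_text):
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The value is reported in kB; >> 10 divides by 1024, giving MB.
            return int(line.split()[1]) >> 10
    return 0

print(used_memory_mb(sample_status))  # 100
```

On non-Linux systems there is no `/proc/self/status`, so Spark falls back to `resource.getrusage` on macOS or to psutil if installed, which is exactly when the warning fires.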

+1

Source: https://habr.com/ru/post/1239384/
