I use Python's multiprocessing functions to run my code in parallel on a machine with approximately 500 GB of RAM. To share some arrays between the different workers, I create an Array object:
import ctypes
import multiprocessing

import numpy as np

N = 150
ndata = 10000
sigma = 3
ddim = 3
shared_data_base = multiprocessing.Array(ctypes.c_double, ndata*N*N*ddim*sigma*sigma)
shared_data = np.ctypeslib.as_array(shared_data_base.get_obj())
shared_data = shared_data.reshape(-1, N, N, ddim*sigma*sigma)
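For reference, a minimal sketch of how the workers then use this array, building on the snippet above (the worker function and pool setup here are simplified placeholders; on Linux the child processes inherit shared_data_base via fork):

def worker(idx):
    # Re-wrap the inherited shared buffer as a NumPy view; no data is copied.
    view = np.ctypeslib.as_array(shared_data_base.get_obj())
    view = view.reshape(-1, N, N, ddim*sigma*sigma)
    view[idx] = idx  # writes go directly into the shared buffer

with multiprocessing.Pool(processes=4) as pool:
    pool.map(worker, range(8))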
This works fine for sigma=1, but for sigma=3 one of the machine's hard drives slowly fills up until there is no free space left, and then the process dies with this exception:
OSError: [Errno 28] No space left on device
Now I have 2 questions:
- Why does this code even write anything to disk? Why is all this not stored in memory?
- How can I solve this problem? Can I force Python to store it entirely in RAM, without writing it to the hard drive? Or can I change the hard drive to which this array is written?
This depends on the OS, since what counts as "shared memory" is implemented differently on each system. On Linux, multiprocessing backs the shared array with a temporary file that is mapped into memory with mmap, so where the data ends up depends on where that file is created: /dev/shm is a RAM-backed tmpfs, while a temp directory on /dev/sda1 is an ordinary disk partition.
- A quick (non-Pythonic) way to check what is actually happening is to run the process under strace and watch which files it opens and writes.
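You can also see this from Python itself; here is a small sketch (multiprocessing.util.get_temp_dir() is the internal helper the arena allocator uses, so treat it as an implementation detail):

import multiprocessing.util

# Prints the directory where multiprocessing will create the mmap backing
# file, e.g. /tmp/pymp-xxxxxx. If this path is on a disk partition rather
# than a tmpfs such as /dev/shm, the shared array ends up on that disk.
print(multiprocessing.util.get_temp_dir())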
# 2. In the meantime I found a solution myself: multiprocessing uses a temporary directory for these backing files, which can be queried with
process.current_process()._config.get('tempdir')
Setting this value at the beginning of the script
from multiprocessing import process
process.current_process()._config['tempdir'] = '/data/tmp/'
seems to solve the problem, at least in my case. Note that this has to be done before any shared objects are created, since the value is only read when the backing file is made. One question remains open: is there a less hacky way to do this?
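One possibly less hacky alternative, sketched below under the assumption that /dev/shm is mounted as a RAM-backed tmpfs (as it usually is on Linux): since multiprocessing creates its temporary directory through the standard tempfile module, you can redirect tempfile before any shared objects are created.

import ctypes
import multiprocessing
import tempfile

# Redirect the standard temp-file machinery to a RAM-backed tmpfs *before*
# creating any shared objects; multiprocessing's backing files then live in RAM.
tempfile.tempdir = '/dev/shm'

shared = multiprocessing.Array(ctypes.c_double, 10**6)

Setting the TMPDIR environment variable before starting Python has the same effect without touching module state. Keep in mind that /dev/shm is typically capped at half of the machine's RAM, so check that the array fits.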