I use Python 2.7 and NumPy 1.11.2, as well as the latest version of dill (I just ran pip install dill), on Ubuntu 16.04.
When storing a NumPy array using pickle, I find that pickle is very slow and stores the array at almost three times the “required” size.
For example, in the following code, pickle is about 50 times slower than dill (50 s versus 1 s) and creates a 2.2 GB file instead of 800 MB.
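import numpy
import pickle
import dill

# 10000 x 10000 float64 values: 800 MB of raw data
B = numpy.random.rand(10000, 10000)

# Fast case: about 1 s, 800 MB file
with open('dill', 'wb') as fp:
    dill.dump(B, fp)

# Slow case: about 50 s, 2.2 GB file
with open('pickle', 'wb') as fp:
    pickle.dump(B, fp)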
I thought dill was just a wrapper around pickle. If that is true, is there a way I can improve pickle's performance myself? Is it generally inadvisable to use pickle for NumPy arrays?
EDIT: Using Python 3, I get the same performance for pickle and dill.
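One thing that may be worth checking, given the Python 3 observation, is the pickle protocol. Python 2's pickle defaults to protocol 0, a text-oriented format, while Python 3 defaults to a binary one. My assumption, which I have not verified, is that dill picks a binary protocol by default and that this explains the gap. The sketch below compares the two on a small array; the helper name pickled_size and the 200 x 200 array size are only illustrative choices, not part of my original code.

```python
import pickle

import numpy


def pickled_size(obj, protocol):
    """Return the number of bytes pickle produces for obj at the given protocol."""
    return len(pickle.dumps(obj, protocol=protocol))


# Small array so the comparison runs quickly; the question uses 10000 x 10000.
B = numpy.random.rand(200, 200)

# Protocol 0 is the default in Python 2; it has to escape the raw array bytes.
size_ascii = pickled_size(B, 0)

# The highest available protocol is binary and stores the bytes almost as-is.
size_binary = pickled_size(B, pickle.HIGHEST_PROTOCOL)

print("raw data bytes:        ", B.nbytes)
print("protocol 0 bytes:      ", size_ascii)
print("highest protocol bytes:", size_binary)

# Round trip with the binary protocol to confirm the array survives unchanged.
restored = pickle.loads(pickle.dumps(B, protocol=pickle.HIGHEST_PROTOCOL))
print("round trip ok:", numpy.array_equal(B, restored))
```

If the protocol is indeed the cause, then in the original code pickle.dump(B, fp, protocol=pickle.HIGHEST_PROTOCOL) should bring both the time and the file size close to what dill gives.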
PS: I am aware of numpy.save.