When I save a very large array (20,000 x 20,000 elements) to a file and read it back, I get all zeros:
    In [2]: shape = (2e4,)*2

    In [3]: r = np.random.randint(0, 10, shape)

    In [4]: r.tofile('r.data')

    In [5]: ls -lh r.data
    -rw-r--r--  1 whg  staff   3.0G 23 Jul 16:18 r.data

    In [6]: r[:6,:6]
    Out[6]:
    array([[6, 9, 8, 7, 4, 4],
           [5, 9, 5, 0, 9, 4],
           [6, 0, 9, 5, 7, 6],
           [4, 0, 8, 8, 4, 7],
           [8, 3, 3, 8, 7, 9],
           [5, 6, 1, 3, 1, 4]])

    In [7]: r = np.fromfile('r.data', dtype=np.int64)

    In [8]: r = r.reshape(shape)

    In [9]: r[:6,:6]
    Out[9]:
    array([[0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0]])
np.save() does similarly strange things.
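For reference, this is roughly the np.save()/np.load() variant I tried (the file name is just an example); the array comes back zero-filled in the same way:

    np.save('r.npy', r)       # writes a ~3 GB .npy file without complaint
    r2 = np.load('r.npy')     # loads without error, but ...
    r2[:6, :6]                # ... the data is zeroed, just like the tofile/fromfile case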
After searching online, I found that this appears to be a known bug on OS X:
https://github.com/numpy/numpy/issues/2806
When I write the data out with tostring() and then try to read the whole file back with Python's read(), I get a MemoryError.
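Roughly what I attempted there (again, the file name is just illustrative):

    # write the raw bytes out
    with open('r.bytes', 'wb') as f:
        f.write(r.tostring())

    # reading everything back in one call is where the MemoryError occurs
    with open('r.bytes', 'rb') as f:
        buf = f.read()

    r2 = np.frombuffer(buf, dtype=np.int64).reshape(20000, 20000)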
Is there a better way to do this? Can anyone recommend a pragmatic workaround?
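For context, one direction I have been experimenting with is splitting the write and the read into row chunks so that no single call has to move the whole 3 GB at once, though I have not confirmed whether this actually sidesteps the OS X issue (the chunk size below is arbitrary):

    CHUNK_ROWS = 1000  # arbitrary chunk size, tune as needed

    # write the array in row chunks
    with open('r.data', 'wb') as f:
        for i in range(0, r.shape[0], CHUNK_ROWS):
            r[i:i + CHUNK_ROWS].tofile(f)

    # read it back in the same chunks
    out = np.empty((20000, 20000), dtype=np.int64)
    with open('r.data', 'rb') as f:
        for i in range(0, out.shape[0], CHUNK_ROWS):
            n = min(CHUNK_ROWS, out.shape[0] - i)
            out[i:i + n] = np.fromfile(
                f, dtype=np.int64, count=n * out.shape[1]
            ).reshape(n, out.shape[1])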