Operations on huge dense matrices in numpy

To train a neural network, at one point I have a huge matrix phi of size 212,243 × 2,500, plus vectors y (212,243) and w (2,500), all stored as numpy arrays of doubles. What I'm trying to compute is:

 w = dot(pinv(phi), y)       # serialize w...
 r = dot(w, transpose(phi))  # serialize r...

My machine has 6 GB of RAM and 16 GB of swap, running 64-bit Ubuntu. I have started the computation a couple of times, and each run ended with swap-related errors from the system (not from Python) after about an hour.

Is there a way to do this computation on my machine? It doesn't have to be done in Python.

+4
source share
2 answers

If you don't need the pseudo-inverse for anything other than computing w, replace that line with:

 w = np.linalg.lstsq(phi, y)[0] 

On my system, this runs about twice as fast and uses about half the intermediate storage.
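A minimal sketch of the substitution, using a small random stand-in for the real phi (212,243 × 2,500 won't fit in a demo); `rcond=None` is passed only to select lstsq's current default cutoff behavior:

```python
import numpy as np

# Hypothetical small stand-ins for the real phi and y from the question.
rng = np.random.default_rng(0)
phi = rng.standard_normal((50, 10))
y = rng.standard_normal(50)

# Original approach: materialize the full pseudo-inverse, then multiply.
w_pinv = np.dot(np.linalg.pinv(phi), y)

# Suggested approach: solve the least-squares problem directly,
# without ever forming pinv(phi) as an explicit matrix.
w_lstsq = np.linalg.lstsq(phi, y, rcond=None)[0]

# Both compute the same minimum-norm least-squares solution.
print(np.allclose(w_pinv, w_lstsq))
```

The two are mathematically equivalent for this use: `pinv(phi) @ y` is exactly the least-squares solution that `lstsq` returns, so skipping the explicit pseudo-inverse avoids one 2,500 × 212,243 intermediate.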

+3
source

Let's do the arithmetic:

 212,243 rows * 2,500 cols * 8 bytes/value = 4,244,860,000 bytes ≈ 4 GB 

Just to hold phi in memory you need about 4 GB, and pinv(phi) is another matrix of the same size, so the intermediates alone exceed your 6 GB of RAM and push you into swap.
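The budget above can be checked with a few lines; the ~7.9 GiB figure counts only phi and its pseudo-inverse, not the extra SVD workspace that `pinv` allocates internally:

```python
# Rough memory budget for the arrays in the question (float64 = 8 bytes).
rows, cols = 212_243, 2_500

phi_bytes = rows * cols * 8    # the matrix itself
pinv_bytes = cols * rows * 8   # pinv(phi) has the transposed shape, same size

print(phi_bytes)                          # 4,244,860,000 bytes, ~4 GB
print((phi_bytes + pinv_bytes) / 2**30)   # ~7.9 GiB before any SVD workspace
```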

If this were Java, I would recommend increasing the maximum heap size of your JVM. I don't know what the analogue is in Python.

0
source

Source: https://habr.com/ru/post/1479985/
