Million-by-million square matrix for quick access

I have very large matrices (say, on the order of millions of rows) that I cannot keep in memory, and I need to access a subsample of such a matrix in a decent time (less than a minute). I started looking at HDF5 and Blaze in combination with NumPy and pandas.

But I found this a bit complicated, and I'm not sure if this is the best solution.

Are there other solutions?

Thanks.

EDIT

Here are a few more details about the kind of data I'm dealing with.

  • The matrices are usually sparse (fewer than 10%, or at most 25%, of the cells are non-zero).
  • The matrices are symmetric.
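To make the two properties above concrete, here is a minimal sketch (not a tested solution for million-row inputs) of how sparsity and symmetry could be exploited with SciPy: only the upper triangle is kept in CSR format, and any rectangular block of the full matrix is rebuilt from it. The helper names `upper_triangle_csr` and `sub_block` are made up for this illustration, and the small random matrix is just demo data.

```python
import numpy as np
from scipy import sparse


def upper_triangle_csr(sym):
    """Keep only the upper triangle (diagonal included) of a symmetric sparse matrix."""
    return sparse.triu(sym, k=0, format="csr")


def sub_block(upper, rows, cols):
    """Rebuild the dense block full[rows, cols] from the stored upper triangle.

    `rows` and `cols` are slices with explicit start/stop.
    Uses the identity full = U + U.T - diag(U).
    """
    block = upper[rows, cols].toarray() + upper[cols, rows].toarray().T
    # Diagonal entries that fall inside the block were added twice: remove one copy.
    r = np.arange(rows.start, rows.stop)
    c = np.arange(cols.start, cols.stop)
    common, ri, ci = np.intersect1d(r, c, return_indices=True)
    block[ri, ci] -= upper.diagonal()[common]
    return block


# Small demo matrix (hypothetical data): roughly 10% non-zero before symmetrising.
rng = np.random.default_rng(0)
dense = rng.random((200, 200))
dense[dense > 0.1] = 0.0
sym = sparse.csr_matrix(dense + dense.T)   # symmetric by construction
full = sym.toarray()                       # only feasible because the demo is tiny

upper = upper_triangle_csr(sym)
block = sub_block(upper, slice(10, 60), slice(40, 120))
```

Storing only the upper triangle roughly halves the number of stored values, at the cost of two slices per block query instead of one.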

And what I need to do:

  • Read-only access
  • ( , )
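Since access is read-only, one option that needs nothing beyond NumPy is to memory-map the three CSR arrays from disk and read only the rows that are requested. The sketch below assumes the matrix has already been written as CSR with column indices sorted within each row; the file layout and the helper names (`save_csr`, `open_csr`, `dense_block`) are invented for this example and are not part of any library.

```python
import os
import tempfile

import numpy as np


def save_csr(prefix, data, indices, indptr):
    """Write the three CSR arrays as separate .npy files."""
    np.save(prefix + "_data.npy", data)
    np.save(prefix + "_indices.npy", indices)
    np.save(prefix + "_indptr.npy", indptr)


def open_csr(prefix):
    """Memory-map the CSR arrays read-only; nothing is loaded until it is sliced."""
    return tuple(
        np.load(f"{prefix}_{name}.npy", mmap_mode="r")
        for name in ("data", "indices", "indptr")
    )


def dense_block(csr, r0, r1, c0, c1):
    """Return rows r0:r1 and columns c0:c1 as a dense array."""
    data, indices, indptr = csr
    out = np.zeros((r1 - r0, c1 - c0), dtype=data.dtype)
    for i in range(r0, r1):
        lo, hi = indptr[i], indptr[i + 1]
        cols = indices[lo:hi]                    # sorted column indices of row i
        a, b = np.searchsorted(cols, [c0, c1])   # entries with c0 <= column < c1
        out[i - r0, cols[a:b] - c0] = data[lo + a:lo + b]
    return out


# Tiny demo (hypothetical data): build CSR arrays by hand from a dense matrix.
rng = np.random.default_rng(0)
dense = rng.random((50, 50))
dense[dense > 0.1] = 0.0
rows, cols = np.nonzero(dense)                   # row-major order, so columns are sorted per row
indptr = np.concatenate(([0], np.cumsum(np.count_nonzero(dense, axis=1))))

prefix = os.path.join(tempfile.mkdtemp(), "demo")
save_csr(prefix, dense[rows, cols], cols, indptr)

csr = open_csr(prefix)
block = dense_block(csr, 5, 20, 10, 45)
```

Only the rows that intersect the requested block are touched on disk, so the cost of a query depends on the block size rather than on the size of the whole matrix. The Python-level loop over rows is the obvious bottleneck for very tall blocks.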

PyTables? . SO .




Source: https://habr.com/ru/post/1629888/
