Java: how to efficiently store sparse data

I have over 1 billion rows with approximately 1000 columns each (a matrix). For 95% of the columns, fewer than 1% of the values are unique, so this data can be classified as sparse.

What is an efficient, ready-made solution for storing such data in Java?

+5
2 answers

I'm not sure this fully answers your question. If you really have billions of rows, then even with an efficient mechanism for storing your sparse matrix you may still have trouble holding that much data in memory.

However, you can use a simple map whose key is a Pair holding the row and column indices of the cell.
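
To put the memory concern into numbers, here is a rough back-of-the-envelope sketch. It assumes a dense layout at one byte per cell with no object overhead, which is optimistic for real Java objects; the class and method names are hypothetical and exist only for this illustration.

```java
// Rough estimate only: assumes a dense layout with a fixed number of bytes
// per cell and ignores all JVM object overhead.
final class DenseSizeEstimate {

    /** Bytes needed for rows x cols cells; throws ArithmeticException on long overflow. */
    static long denseBytes(long rows, long cols, long bytesPerCell) {
        return Math.multiplyExact(Math.multiplyExact(rows, cols), bytesPerCell);
    }

    public static void main(String[] args) {
        // Figures from the question: about 1 billion rows and about 1000 columns.
        long bytes = denseBytes(1_000_000_000L, 1_000L, 1L);
        System.out.println(bytes / 1_000_000_000_000L + " TB"); // prints "1 TB"
    }
}
```

Even under these generous assumptions the dense form needs about a terabyte, so whatever structure is chosen has to exploit the repetition in the data or keep most of it off the heap.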

    import java.util.HashMap;
    import java.util.Map;

    public class Pair<P, Q> {
        public final P p;
        public final Q q;

        public Pair(P p, Q q) {
            this.p = p;
            this.q = q;
        }

        // TODO: Implement equals and hashCode.
    }

    class Datum {
    }

    // My sparse database.
    Map<Pair<Integer, Integer>, Datum> data = new HashMap<>();

This uses close to minimal storage, since only the populated cells are kept, but it does not necessarily solve your problem.
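
The TODO in the Pair class matters: without equals and hashCode, two Pair objects with the same row and column are different keys, and lookups in the HashMap will miss. Below is a minimal sketch of one way to fill it in. The SparseMatrix wrapper and its default-value convention are assumptions added for illustration and are not part of the answer above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Pair from the answer, with value-based equals and hashCode filled in.
final class Pair<P, Q> {
    final P p;
    final Q q;

    Pair(P p, Q q) {
        this.p = p;
        this.q = q;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Pair)) return false;
        Pair<?, ?> other = (Pair<?, ?>) o;
        return Objects.equals(p, other.p) && Objects.equals(q, other.q);
    }

    @Override
    public int hashCode() {
        return Objects.hash(p, q);
    }
}

// Hypothetical wrapper: stores only cells that differ from a default value.
final class SparseMatrix<V> {
    private final Map<Pair<Integer, Integer>, V> cells = new HashMap<>();
    private final V defaultValue;

    SparseMatrix(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    void put(int row, int col, V value) {
        Pair<Integer, Integer> key = new Pair<>(row, col);
        if (Objects.equals(value, defaultValue)) {
            cells.remove(key); // default cells take no space
        } else {
            cells.put(key, value);
        }
    }

    V get(int row, int col) {
        return cells.getOrDefault(new Pair<>(row, col), defaultValue);
    }

    int storedCells() {
        return cells.size();
    }
}
```

A possible usage: `SparseMatrix<String> m = new SparseMatrix<>(""); m.put(0, 5, "x");` after which `m.get(0, 5)` returns `"x"` and `m.get(1, 1)` returns the default. Each stored cell still costs a map entry plus a key object with two boxed integers, so this only pays off when stored cells are a small fraction of the matrix.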

+1

Well, I think a Hashtable would be a better option for this. A key-value structure is efficient when the same value repeats, i.e. a single key can stand for multiple values.
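
One possible reading of this suggestion is a per-column index that maps each distinct value to the rows where it occurs, so a repeated value is stored once. The sketch below is an interpretation, not the answerer's code; ColumnIndex and its methods are hypothetical names. It uses HashMap, and java.util.Hashtable could be substituted with the same calls at the cost of synchronizing every operation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index for one column: distinct value -> rows holding that value.
final class ColumnIndex<V> {
    private final Map<V, List<Integer>> rowsByValue = new HashMap<>();

    /** Records that the given row holds the given value in this column. */
    void add(int row, V value) {
        rowsByValue.computeIfAbsent(value, k -> new ArrayList<>()).add(row);
    }

    /** Rows holding the value, in insertion order; empty if the value never occurs. */
    List<Integer> rowsWith(V value) {
        return rowsByValue.getOrDefault(value, Collections.emptyList());
    }

    /** Number of distinct values seen in this column. */
    int distinctValues() {
        return rowsByValue.size();
    }
}
```

With a billion rows, a list of boxed integers per value is itself expensive, so a real implementation would likely need primitive arrays or a compressed bitmap for the row sets.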

0

Source: https://habr.com/ru/post/1207573/

