From the question, it seems the structure is a grid in which each user is connected to every other user (500K x (500K - 1) pairs). That sounds very complicated. With a few heuristic assumptions, some optimizations may be possible.
Assumed case 1: not every pair of users necessarily has a weight, which would make the matrix sparse. So why not store only the non-zero weights?
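A minimal sketch of case 1, assuming weights are keyed by an ordered pair of integer user ids and that a missing entry means a weight of zero. The class name `SparseWeights` and its methods are hypothetical, chosen only for illustration:

```python
from typing import Dict, Tuple


class SparseWeights:
    """Keeps only the non-zero weights of ordered user pairs (hypothetical helper)."""

    def __init__(self) -> None:
        self._weights: Dict[Tuple[int, int], float] = {}

    def set(self, a: int, b: int, weight: float) -> None:
        if weight == 0:
            # A zero weight is represented by the absence of an entry.
            self._weights.pop((a, b), None)
        else:
            self._weights[(a, b)] = weight

    def get(self, a: int, b: int) -> float:
        # Pairs that were never stored are treated as weight 0.
        return self._weights.get((a, b), 0.0)

    def __len__(self) -> int:
        return len(self._weights)
```

Memory use then grows with the number of non-zero pairs rather than with the full 500K x (500K - 1) grid.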
Assumed case 2: I have a strong feeling that the range of weights is limited. I don't think there will be 500K different weights, perhaps only 500. If so, create 500 groups, one per weight, and store the user pairs under them. That does not save much space, but it is a way of partitioning the data.
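The partitioning in case 2 could look roughly like this, assuming the pair weights are already available as a dictionary. The function name `group_by_weight` is hypothetical:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def group_by_weight(pair_weights: Dict[Pair, float]) -> Dict[float, List[Pair]]:
    """Partition user pairs into one group per distinct weight."""
    groups: Dict[float, List[Pair]] = defaultdict(list)
    for pair, weight in pair_weights.items():
        groups[weight].append(pair)
    return dict(groups)
```

With only about 500 distinct weights there would be about 500 groups, but every pair is still stored once, which is why this alone saves little space.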
To save space under case 2, eliminate the need to store the users in these groups. Instead, store for each group only the characteristics of interest (a lower bound and an upper bound). To get a match for a given user, follow these steps:
- Go through the 500 weight groups and pick the most suitable lower and upper bounds. You will not know the exact user yet, but you now know what he/she looks like.
- Search the user table for users who fall within these bounds.
- Perform a more detailed analysis on the actual set of users returned by the previous step.
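The three steps above can be sketched as follows. This is only an illustration under several assumptions that are not in the question: each user is described by a single numeric characteristic, each group stores one weight plus a lower and upper bound on that characteristic, and the caller supplies both the rule for "most suitable" (`suitability`) and the detailed analysis (`detailed_score`). All names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class WeightGroup:
    weight: float
    lower: float  # lower bound of the characteristic of interest
    upper: float  # upper bound of the characteristic of interest


def find_matches(
    user_id: int,
    users: Dict[int, float],          # user id -> characteristic value (assumed)
    groups: List[WeightGroup],
    suitability: Callable[[float, WeightGroup], float],
    detailed_score: Callable[[int, int], float],
    top_n: int = 10,
) -> List[int]:
    # Step 1: scan the groups and keep the one whose bounds suit this user best.
    best = max(groups, key=lambda g: suitability(users[user_id], g))

    # Step 2: search the user table for users that fall within those bounds.
    candidates = [
        uid
        for uid, value in users.items()
        if uid != user_id and best.lower <= value <= best.upper
    ]

    # Step 3: run the more detailed analysis on the candidates only.
    candidates.sort(key=lambda uid: detailed_score(user_id, uid), reverse=True)
    return candidates[:top_n]
```

Step 2 is a linear scan here for brevity; with 500K users, an index on the characteristic (or a sorted list searched with `bisect`) would likely be needed.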
My assumptions may be wrong. In that case, I just gave it a try.