Suppose we have a set of M words and know in advance how similar in meaning each pair of words is (an M x M similarity matrix). Which algorithm can we use to create one k-dimensional bit vector per word, so that any pair of words can be compared just by comparing their vectors (for example, by taking the absolute difference of the vectors)?
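To make the comparison step concrete, here is a minimal sketch of what "comparing by absolute difference" could look like, and of how one might score a candidate set of vectors against the given matrix. It assumes similarities are scaled to [0, 1] and that vector similarity is defined as one minus the fraction of differing bits; the names `hamming_similarity` and `mean_squared_error` are hypothetical, and the squared-error measure is only one possible choice. It does not answer the question of how to find the vectors.

```python
from typing import Sequence


def hamming_similarity(u: Sequence[int], v: Sequence[int]) -> float:
    """Similarity in [0, 1]: one minus the fraction of differing bits."""
    if len(u) != len(v):
        raise ValueError("both vectors must have the same length k")
    # For 0/1 entries, the summed absolute difference is the Hamming distance.
    differing = sum(abs(a - b) for a, b in zip(u, v))
    return 1.0 - differing / len(u)


def mean_squared_error(codes: Sequence[Sequence[int]],
                       target: Sequence[Sequence[float]]) -> float:
    """Average squared gap between vector similarity and the given matrix.

    Assumption: `target` is a symmetric M x M matrix with entries in [0, 1].
    Only pairs i < j are counted.
    """
    m = len(codes)
    if m < 2:
        raise ValueError("need at least two words")
    total = 0.0
    pairs = 0
    for i in range(m):
        for j in range(i + 1, m):
            gap = hamming_similarity(codes[i], codes[j]) - target[i][j]
            total += gap * gap
            pairs += 1
    return total / pairs
```

For example, with k = 2 the vectors `[0, 0]`, `[0, 1]`, `[1, 1]` reproduce a matrix whose off-diagonal similarities are 0.5, 0, and 0.5 exactly, so the error is zero. An algorithm for this problem would search for the vectors that make this kind of error small.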
I do not know what this problem is called. If I did, it would be much easier to pick out the right approach among the many algorithms with similar descriptions that do something else.
Additional observation:
I think such an algorithm necessarily has a side effect. If, according to the matrix, word A is similar to word B and B is similar to C, but the matrix records a low similarity for [A, C], the vectors the algorithm produces will still yield a high [A, C] similarity. In this way the algorithm would fill in gaps in the original matrix, in effect smoothing the similarities. Apart from this smoothing, though, the goal is for the vector-based similarities to stay as close as possible to the original values in the matrix.
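The smoothing effect can be made precise if the vectors are compared by Hamming distance (the number of differing bits), which is an assumption here. Hamming distance obeys the triangle inequality, d(A, C) <= d(A, B) + d(B, C), so with similarity defined as 1 - d/k, the similarity of [A, C] can never fall below s(A, B) + s(B, C) - 1. The sketch below checks the inequality exhaustively for small k and computes that lower bound; the function names are hypothetical.

```python
from itertools import product
from typing import Sequence


def hamming_distance(u: Sequence[int], v: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def triangle_inequality_holds(k: int) -> bool:
    """Brute-force check over all triples of k-bit vectors (small k only)."""
    vectors = list(product((0, 1), repeat=k))
    return all(
        hamming_distance(a, c) <= hamming_distance(a, b) + hamming_distance(b, c)
        for a in vectors
        for b in vectors
        for c in vectors
    )


def min_implied_similarity(s_ab: float, s_bc: float) -> float:
    """Lower bound on s(A, C) forced by s(A, B) and s(B, C).

    Assumes similarity = 1 - hamming_distance / k, so values lie in [0, 1].
    """
    return max(0.0, s_ab + s_bc - 1.0)
```

So if A and B are 0.9 similar and B and C are 0.9 similar, any bit-vector encoding under this definition forces [A, C] to be at least about 0.8 similar, whatever the original matrix said.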