Cosine similarity optimization

I am trying to understand this optimized code, which computes the cosine similarity from a user-item ratings matrix.

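import numpy as np

def fast_similarity(ratings, epsilon=1e-9):
    # epsilon -> small number for handling divide-by-zero errors
    sim = ratings.T.dot(ratings) + epsilon
    # norms is a 1 x n row vector: the square root of each diagonal entry of sim
    norms = np.array([np.sqrt(np.diagonal(sim))])
    # dividing by norms scales the columns, dividing by norms.T scales the rows
    return (sim / norms / norms.T)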

If ratings =

           items           
     u  [
     s    [1,2,3]
     e    [4,5,6]
     r    [7,8,9] 
     s  ]

norms will be equal to [1^2 + 5^2 + 9^2]

But why do we write sim / norms / norms.T to calculate the cosine similarity? Any help is appreciated.

1 answer

Looking through the code, we have the following:

    sim = ratings.T.dot(ratings) + epsilon

This means that the diagonal of the matrix sim holds the dot product of each column of ratings with itself, i.e. the sum of that column's squared entries (plus the tiny epsilon).

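You can check this yourself with a simple matrix. For the 3x3 ratings example from the question, the diagonal of ratings.T.dot(ratings) is [1^2 + 4^2 + 7^2, 2^2 + 5^2 + 8^2, 3^2 + 6^2 + 9^2] = [66, 93, 126].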
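As a quick check of the claim about the diagonal, here is a minimal sketch using the 3x3 matrix from the question. The variable names diag and col_sq_sums are illustrative, not part of the original code, and epsilon is left out to keep the numbers exact.

```python
import numpy as np

# The 3x3 example from the question: rows are users, columns are items.
ratings = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]])

# sim[i, j] is the dot product of column i and column j of ratings.
sim = ratings.T.dot(ratings)

# The diagonal holds each column dotted with itself.
diag = np.diagonal(sim)

# The same numbers computed directly as sums of squared entries per column.
col_sq_sums = (ratings ** 2).sum(axis=0)

print(diag.tolist())         # [66, 93, 126]
print(col_sq_sums.tolist())  # [66, 93, 126]
```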

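In other words, each diagonal entry sim[i, i] is the squared length (squared Euclidean norm) of column i of ratings.

norms, then, is simply the square root of the diagonal of the Gram matrix sim:

    norms[0, i] = sqrt(sim[i, i]) = ||c_i||

where c_i is the i-th column of ratings. Because of the extra pair of brackets, norms is a 1 x n row vector, so norms.T is an n x 1 column vector.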
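To make the shapes concrete, here is a small sketch on the same 3x3 example. It assumes that np.linalg.norm along axis 0 is an acceptable reference for the column lengths; the name direct is illustrative.

```python
import numpy as np

ratings = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]], dtype=float)

sim = ratings.T.dot(ratings)

# Same expression as in fast_similarity: note the extra pair of brackets.
norms = np.array([np.sqrt(np.diagonal(sim))])

# Reference value: Euclidean length of each column, computed directly.
direct = np.linalg.norm(ratings, axis=0)

print(norms.shape)                    # (1, 3)
print(norms.T.shape)                  # (3, 1)
print(np.allclose(norms[0], direct))  # True
```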

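Now, the cosine similarity between two columns c_i and c_j is their dot product divided by the product of their lengths (Euclidean norms):

    cos(c_i, c_j) = (c_i . c_j) / (||c_i|| * ||c_j||)

The numerator is already there, since sim[i, j] = c_i . c_j. Dividing sim by norms, a 1 x n row vector holding the column lengths, broadcasts down the rows, so every entry in column j is divided by ||c_j||:

    (sim / norms)[i, j] = sim[i, j] / ||c_j||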
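The broadcasting behaviour can be seen on a tiny example. The 2x2 matrix below is a made-up Gram matrix chosen so the arithmetic is exact; it is not taken from the question.

```python
import numpy as np

# Hypothetical 2x2 Gram matrix with diagonal 4 and 16, so the norms are 2 and 4.
sim = np.array([[4.0, 6.0],
                [6.0, 16.0]])
norms = np.array([np.sqrt(np.diagonal(sim))])  # shape (1, 2): [[2., 4.]]

# Row vector: column j of sim is divided by norms[0, j].
by_cols = sim / norms

# Column vector: row i of sim is divided by norms[0, i].
by_rows = sim / norms.T

# Both divisions together: entry (i, j) is divided by norms[0, i] * norms[0, j].
both = sim / norms / norms.T

print(by_cols.tolist())  # [[2.0, 1.5], [3.0, 4.0]]
print(by_rows.tolist())  # [[2.0, 3.0], [1.5, 4.0]]
print(both.tolist())     # [[1.0, 0.75], [0.75, 1.0]]
```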

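Dividing sim / norms by norms.T, an n x 1 column vector, broadcasts across the columns, so every entry in row i is also divided by ||c_i||, the length of column i of ratings:

    (sim / norms / norms.T)[i, j] = sim[i, j] / (||c_i|| * ||c_j||)

which is exactly the cosine similarity between columns i and j. That is why the function ends with:

    return sim / norms / norms.T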
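One way to convince yourself is to compare the vectorized function with a plain double loop that applies the cosine formula to every pair of columns. This is only a sketch: slow_similarity is a hypothetical helper written for the comparison, not part of the original code.

```python
import numpy as np

def fast_similarity(ratings, epsilon=1e-9):
    # Vectorized version from the question.
    sim = ratings.T.dot(ratings) + epsilon
    norms = np.array([np.sqrt(np.diagonal(sim))])
    return sim / norms / norms.T

def slow_similarity(ratings):
    # Hypothetical reference: cosine similarity for each pair of columns.
    n = ratings.shape[1]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ci, cj = ratings[:, i], ratings[:, j]
            out[i, j] = ci.dot(cj) / (np.linalg.norm(ci) * np.linalg.norm(cj))
    return out

ratings = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]], dtype=float)

fast = fast_similarity(ratings)
slow = slow_similarity(ratings)

# The two agree up to the tiny epsilon added in the fast version.
print(np.allclose(fast, slow))              # True
# Every column has similarity 1 with itself.
print(np.allclose(np.diagonal(fast), 1.0))  # True
```

Note that with ratings.T.dot(ratings) the result compares columns (items in the question's layout); comparing rows instead would need ratings.dot(ratings.T).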

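EDIT: to avoid confusion, every product of two vectors mentioned here is a DOT PRODUCT, not an element-wise product.

That is, for column vectors A and B, A * B stands for A.T * B.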
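In NumPy terms the distinction looks like this. The sketch uses the first two columns of the question's matrix, and the names a, b, A and B are illustrative.

```python
import numpy as np

# First two columns of the ratings matrix from the question.
a = np.array([1, 4, 7])
b = np.array([2, 5, 8])

# Element-wise product: one number per position.
elementwise = a * b

# Dot product: the sum of the element-wise products, a single number.
dot = a.dot(b)

# The same thing written with explicit column vectors, as A.T times B.
A = a.reshape(-1, 1)
B = b.reshape(-1, 1)
as_matrix = A.T.dot(B)

print(elementwise.tolist())  # [2, 20, 56]
print(int(dot))              # 78
print(as_matrix.tolist())    # [[78]]
```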


Source: https://habr.com/ru/post/1673474/

