How should cosine similarity ( http://en.wikipedia.org/wiki/Cosine_similarity ) be defined
when one of the vectors is all zeros?
v1 = [1, 1, 1, 1, 1]
v2 = [0, 0, 0, 0, 0]
Calculating with the standard formula leads to a division by zero:
Let d1 = 0 0 0 0 0 0
Let d2 = 1 1 1 1 1 1
Cosine Similarity (d1, d2) = dot(d1, d2) / (||d1|| * ||d2||)
dot(d1, d2) = (0)*(1) + (0)*(1) + (0)*(1) + (0)*(1) + (0)*(1) + (0)*(1) = 0
||d1|| = sqrt((0)^2 + (0)^2 + (0)^2 + (0)^2 + (0)^2 + (0)^2) = 0
||d2|| = sqrt((1)^2 + (1)^2 + (1)^2 + (1)^2 + (1)^2 + (1)^2) = 2.44948974278
Cosine Similarity (d1, d2) = 0 / ((0) * (2.44948974278)) = 0 / 0
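For reference, the calculation above can be reproduced with a minimal Python sketch. The function name cosine_similarity is just an illustrative choice, and this is the plain textbook formula with no special handling, so the zero vector makes the denominator zero:

```python
import math

def cosine_similarity(a, b):
    """Textbook cosine similarity with no handling for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # If either norm is 0, this division fails: the value is undefined.
    return dot / (norm_a * norm_b)

d1 = [0, 0, 0, 0, 0, 0]
d2 = [1, 1, 1, 1, 1, 1]

try:
    cosine_similarity(d1, d2)
except ZeroDivisionError:
    print("cosine similarity is undefined for a zero vector")
```

In plain Python the 0 / 0 case surfaces as a ZeroDivisionError, which is why some explicit convention is needed before using the measure for clustering.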
I want to use this similarity measure in a clustering application, where I often have to compare such vectors, including [0, 0, 0, 0, 0] versus [0, 0, 0, 0, 0].
Does anyone have experience with this? Since this is a measure of similarity (not distance), I assume I have to add special cases such as:
d ([1, 1, 1, 1, 1]; [0, 0, 0, 0, 0]) = 0
d ([0, 0, 0, 0, 0]; [0, 0, 0, 0, 0]) = 1
What about other cases, such as
d ([1, 1, 1, 0, 0]; [0, 0, 0, 0, 0]) = ?
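To make the special-case idea concrete, here is a rough Python sketch of the convention proposed in this question: two zero vectors are treated as identical (similarity 1), and a zero vector compared with any non-zero vector gets similarity 0. The function name cosine_similarity_zero_safe is hypothetical, and the convention itself is an assumption taken from the two cases listed above, not an established standard.

```python
import math

def cosine_similarity_zero_safe(a, b):
    """Cosine similarity with an assumed convention for zero vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 and norm_b == 0:
        return 1.0  # assumed: two all-zero vectors count as identical
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed: zero vs. non-zero counts as dissimilar
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)

zero = [0, 0, 0, 0, 0]
print(cosine_similarity_zero_safe([1, 1, 1, 1, 1], zero))  # 0.0
print(cosine_similarity_zero_safe(zero, zero))             # 1.0
print(cosine_similarity_zero_safe([1, 1, 1, 0, 0], zero))  # 0.0
```

Under this assumed rule, every non-zero vector, including [1, 1, 1, 0, 0], gets similarity 0 against the zero vector. Whether that is appropriate depends on what an all-zero vector means in the data being clustered.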