Affinity propagation (sklearn) - strange behavior

I am trying to use affinity propagation for a simple clustering task:

    from sklearn.cluster import AffinityPropagation

    c = [[0], [0], [0], [0], [0], [0], [0], [0]]
    af = AffinityPropagation(affinity='euclidean').fit(c)
    print(af.labels_)

I get this strange result: [0 1 0 1 2 1 1 0]

I would expect all the samples to be in the same cluster, as in this case:

    c = [[0], [0], [0]]
    af = AffinityPropagation(affinity='euclidean').fit(c)
    print(af.labels_)

which does put all the samples in the same cluster: [0 0 0]

What am I missing?

thanks

1 answer

I believe this happens because your problem is essentially ill-posed: you are passing many copies of the same point to an algorithm that tries to find similarities between different points. AffinityPropagation does matrix math under the hood, and your similarity matrix (which is all zeros) is completely degenerate. To avoid failing on such input, the implementation adds a small random matrix to the similarity matrix, which keeps the algorithm from breaking down when it encounters identical points. With no real differences between the samples, that random noise is the only thing separating them, so the resulting labels are essentially arbitrary.
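To make the degeneracy concrete, here is a minimal sketch in plain Python (no sklearn) of what the similarity matrix looks like for this input. It assumes the 'euclidean' affinity means negative squared Euclidean distance, as the scikit-learn documentation describes. The helper `neg_sq_euclidean` and the noise scale `1e-12` are made up for illustration and are not what the library actually uses.

```python
import random

def neg_sq_euclidean(points):
    """Return S with S[i][j] = -(squared Euclidean distance between i and j)."""
    return [
        [-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]

# The eight identical one-dimensional samples from the question.
points = [[0] for _ in range(8)]
S = neg_sq_euclidean(points)

# Every pair of samples is equally similar: the whole matrix is zero.
all_zero = all(v == 0 for row in S for v in row)

# Illustrative jitter (hypothetical scale): once tiny random values are added,
# the entries are no longer tied, and the ordering between them is pure noise.
rng = random.Random(0)
noisy = [[v + 1e-12 * rng.gauss(0, 1) for v in row] for row in S]
distinct_values = len({v for row in noisy for v in row})

print(all_zero)         # the unperturbed matrix carries no information
print(distinct_values)  # the perturbed one has many distinct entries
```

Because the unperturbed matrix is constant, nothing in the data favors one exemplar over another; whatever grouping comes out is decided by the jitter alone.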


Source: https://habr.com/ru/post/989090/

