I think the problem here is that you are fitting your model with the wrong data.
The actual result:
    ward.labels_
    >> array([1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 2, 0, 0, 2, 1])
This is the label assigned to each element of the vector X, which does not make sense here.
If I understand your problem correctly, you need to cluster your users by the distance between them (their similarity). In that case, I suggest using spectral clustering as follows:
    import numpy as np
    from sklearn.cluster import SpectralClustering

    lena = np.matrix('1 1 0 0;1 1 0 0;0 0 1 0.2;0 0 0.2 1')
    n_clusters = 3
    SpectralClustering(n_clusters).fit_predict(lena)
    >> array([1, 1, 0, 2], dtype=int32)
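One thing worth checking: by default, SpectralClustering uses affinity='rbf', so each row of the input is treated as a feature vector rather than as a row of similarities. If your matrix already holds pairwise similarities between users, a variant along these lines may be closer to what you want. This is only a sketch: the matrix values, the choice of two clusters, and the random_state are made up for illustration and are not taken from your data.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical pairwise similarities between four users (illustrative values only).
# The matrix is symmetric and non-negative: users 0 and 1 are similar to each other,
# users 2 and 3 are similar to each other, and the two pairs are only weakly related.
similarity = np.array([
    [1.00, 0.90, 0.05, 0.05],
    [0.90, 1.00, 0.05, 0.05],
    [0.05, 0.05, 1.00, 0.80],
    [0.05, 0.05, 0.80, 1.00],
])

# affinity="precomputed" tells scikit-learn to use the matrix as similarities
# directly instead of computing an RBF kernel on its rows.
model = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0)
labels = model.fit_predict(similarity)
print(labels)
```

The cluster numbers themselves are arbitrary; what matters is which users end up sharing a label. If you have distances instead of similarities, you would need to convert them first (for example with a Gaussian kernel), since a precomputed affinity is expected to be larger for more similar pairs.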