Sklearn kmeans equivalent of the elbow method

Say I'm examining up to 10 clusters. With scipy I usually generate the “elbow” plot as follows:

from matplotlib import pyplot
from scipy import cluster
cluster_array = [cluster.vq.kmeans(my_matrix, i) for i in range(1,10)]

pyplot.plot([var for (cent,var) in cluster_array])
pyplot.show()

I have since been motivated to use sklearn for clustering, but I'm not sure how to create the array needed for the plot, as in the scipy case. My best guess was:

from sklearn.cluster import KMeans

km = [KMeans(n_clusters=i) for i range(1,10)]
cluster_array = [km[i].fit(my_matrix)]

Unfortunately, this led to an invalid command error. What is the best sklearn way to do this?

Thanks.

+4
2 answers

You had some syntax problems in your code. Now they should be fixed:

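from sklearn.cluster import KMeans

Ks = range(1, 10)
km = [KMeans(n_clusters=i) for i in Ks]
# score() returns the negative of the K-means objective, so values are <= 0
score = [km[i].fit(my_matrix).score(my_matrix) for i in range(len(km))]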

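The fit method returns self, the fitted estimator object. In this line from the original code

cluster_array = [km[i].fit(my_matrix)]

cluster_array would end up with the same contents as km.

You can use the score method to estimate how well the clustering fits. To see the score for each number of clusters, simply run plot(Ks, score).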
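
To make the relationship between score and the usual elbow quantity concrete, here is a minimal sketch. It assumes synthetic data from make_blobs as a stand-in for my_matrix; the sample size, n_init and random_state values are illustrative choices, not part of the original question. On the training data, score() should match the negative of the inertia_ attribute up to floating-point error, so plotting either gives the same curve, mirrored.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical stand-in for my_matrix: 300 points around 4 centers.
my_matrix, _ = make_blobs(n_samples=300, centers=4, random_state=0)

Ks = range(1, 10)
models = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(my_matrix) for k in Ks]

# score() returns the negative K-means objective; on the data the model
# was fitted on, this matches -inertia_ up to floating-point error.
scores = [m.score(my_matrix) for m in models]
inertias = [m.inertia_ for m in models]

for k, s, w in zip(Ks, scores, inertias):
    print(k, round(s, 2), round(w, 2))
```

Because the scores are negative, the curve from plot(Ks, score) rises toward zero as the number of clusters grows; the elbow is where it flattens out.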

+5

You can use the inertia_ attribute of the KMeans class.

Assuming your data is stored in X:

from sklearn.cluster import KMeans
from matplotlib import pyplot as plt

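X = ...  # <your_data>: array of shape (n_samples, n_features)
distortions = []
for k in range(2, 20):
    kmeans = KMeans(n_clusters=k)
    kmeans.fit(X)
    # inertia_ is the sum of squared distances of samples to their closest center
    distortions.append(kmeans.inertia_)

fig = plt.figure(figsize=(15, 5))
plt.plot(range(2, 20), distortions)
plt.grid(True)
plt.title('Elbow curve')
plt.show()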
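
For a version that runs without your own data, here is a rough sketch of the same inertia-based approach. It assumes synthetic data from make_blobs with three well-separated centers; the centers, n_init and random_state are made up for illustration. Instead of plotting, it prints the relative drop in inertia for each extra cluster, which is one informal way to read the elbow off the numbers.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical data: 500 points drawn around 3 well-separated centers.
X, _ = make_blobs(n_samples=500, centers=[[0, 0], [10, 10], [-10, 10]],
                  cluster_std=1.0, random_state=42)

ks = list(range(1, 8))
inertias = []
for k in ks:
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(kmeans.inertia_)

# Fraction of the inertia removed by going from k - 1 to k clusters.
# A large drop followed by small ones suggests an elbow at that k.
drops = [(inertias[i] - inertias[i + 1]) / inertias[i] for i in range(len(ks) - 1)]
for k, d in zip(ks[1:], drops):
    print(f"k={k}: inertia reduced by {d:.1%}")
```

With data like this, the drop should be large up to k=3 and small afterwards. On real data the elbow is often less sharp, so treat it as a heuristic rather than a definitive answer.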
+5

Source: https://habr.com/ru/post/1666204/

