What is the time complexity of k-means?

I was looking at the k-means Wikipedia page . Based on the algorithm, I think the complexity is O(n*k*i) ( n = total number of elements, k = number of cluster iterations)

So can someone explain this statement from Wikipedia to me and how complicated is this NP?

If k and d (dimension) are fixed, the problem can be exactly solved in time O(n dk+1 log n) , where n is the number of objects to be clustered.

+7
source share
4 answers

It depends on what you call k-means.

The problem of finding the global optimum of the objective function of k-means

enter image description here

NP-hard, where S i is the cluster i (and there are k clusters), x j is the d dimensional point in the cluster S i and ΞΌ i is the centroid (the middle of the points) of the cluster S i .

However, the execution of a fixed number of iterations t standard algorithm takes only O(t*k*n*d) for n ( d -dimensional) points, where k is the number of centroids (or clusters). This is what practical implementations do (often with random restarts between iterations).

The standard algorithm only approximates the local optimum of the above function, like all k-means algorithms that I have seen.

+17
source

In this answer, note that i used k-means in the goal formula, and i used k-means in the time complexity analysis (that is, the number of iterations needed before convergence) are different.

+1
source

The problem is NP-Hard, because there is another well-known complex NP-problem that can be reduced to the (flat) k-means problem. Take a look at the article β€œThe Planar Problem of k-Means” is NP-complicated (by Mahajan et al.) For more information.

+1
source

Finding minima in the objective function of k-means is an NP-Hard problem. Thus, we do not solve this problem, we try to bring it closer, and then solve it using the concept of approximation.

0
source

Source: https://habr.com/ru/post/1500705/


All Articles