What is the time complexity of k-means?

Question

What is the time complexity of k-means?

I was looking at the k-means Wikipedia page . Based on the algorithm, I think the complexity is O(n*k*i) ( n = total number of elements, k = number of cluster iterations)

So can someone explain this statement from Wikipedia to me and how complicated is this NP?

If k and d (dimension) are fixed, the problem can be exactly solved in time O(n ^dk+1 log n) , where n is the number of objects to be clustered.

+7

algorithm time-complexity k-means

parallel Sep 05 '13 at 10:41

source share

4 answers

In this answer, note that i used k-means in the goal formula, and i used k-means in the time complexity analysis (that is, the number of iterations needed before convergence) are different.

+1

masec Nov 22 '15 at 2:27

source share

The problem is NP-Hard, because there is another well-known complex NP-problem that can be reduced to the (flat) k-means problem. Take a look at the article “The Planar Problem of k-Means” is NP-complicated (by Mahajan et al.) For more information.

+1

Dheeraj m pai Oct 29 '18 at 13:58

source share

Finding minima in the objective function of k-means is an NP-Hard problem. Thus, we do not solve this problem, we try to bring it closer, and then solve it using the concept of approximation.

0

Aditya Jun 04 '19 at 8:57

source share

Fred foo · Accepted Answer · 2013-09-05T11:00:39+0000

It depends on what you call k-means.

The problem of finding the global optimum of the objective function of k-means

enter image description here

NP-hard, where S _i is the cluster i (and there are k clusters), x _j is the d dimensional point in the cluster S _i and μ _i is the centroid (the middle of the points) of the cluster S _i .

However, the execution of a fixed number of iterations t standard algorithm takes only O(t*k*n*d) for n ( d -dimensional) points, where k is the number of centroids (or clusters). This is what practical implementations do (often with random restarts between iterations).

The standard algorithm only approximates the local optimum of the above function, like all k-means algorithms that I have seen.

What is the time complexity of k-means?

More articles: