Is K-Medoids really better at dealing with outliers than K-Means? (with an example showing the opposite)

K-Medoids and K-Means are two popular partitioning clustering methods. From what I have read, K-Medoids clusters data better when there are outliers (source). This is because it picks actual data points as cluster centers (and uses Manhattan distance), while K-Means picks whatever center minimizes the sum of squares, so it is more affected by outliers.
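To make the distinction concrete, here is a minimal sketch in base R of the two kinds of center for a single fixed cluster. The helper names `centroid` and `medoid` are made up for this illustration and are not part of any package. The seven points are taken from the demo data in this question, and treating them as one cluster is an assumption for the example only.

```r
# Seven points from the demo data, treated as one cluster (illustrative assumption)
pts <- data.frame(
  x = c(7.2, 7.5, 8, 7.2, 7.8, 7.3, 6.4),
  y = c(7.5, 7.25, 7, 7.8, 7.5, 8.1, 7)
)

# Mean center: free to sit anywhere in the plane (what K-Means uses)
centroid <- function(d) colMeans(d)

# Medoid: the observed point with the smallest total distance to all others
# (hypothetical helper; Manhattan distance, as described above)
medoid <- function(d, method = "manhattan") {
  dm <- as.matrix(dist(d, method = method))
  d[which.min(rowSums(dm)), ]
}

# Same cluster with one far-away point added
with_outlier <- rbind(pts, data.frame(x = 3, y = 7.5))

centroid(pts)
centroid(with_outlier)
medoid(pts)
medoid(with_outlier)
```

On these points the mean moves toward the added point while the medoid stays on the same observation. This only looks at the center of one fixed cluster; it says nothing about how points get reassigned between clusters during fitting.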

That makes sense. However, when I run both methods on some made-up data as a simple test, the results do not suggest that K-Medoids handles outliers better; in fact it is sometimes worse. My question is: where did I go wrong in the test below? Perhaps I have a fundamental misunderstanding of these methods.

Demo (see here for the figures): first, some made-up data (called "comp") that forms 3 obvious clusters.

x <- c(2, 3, 2.4, 1.9, 1.6, 2.3, 1.8, 5, 6, 5, 5.8, 6.1, 5.5, 7.2, 7.5, 8, 7.2, 7.8, 7.3, 6.4)
y <- c(3, 2, 3.1, 2.6, 2.7, 2.9, 2.5, 7, 7, 6.5, 6.4, 6.9, 6.5, 7.5, 7.25, 7, 7.8, 7.5, 8.1, 7)

comp <- data.frame(x, y)

library(ggplot2)
ggplot(comp, aes(x, y)) + geom_point(alpha=.5, size=3, pch = 16)


I cluster it with the vegclust package, which can do both K-Means and K-Medoids.

library(vegclust)
# Fit both models under separate names so the second fit does not overwrite the first
km   <- vegclust(x=comp, mobileCenters=3, method="KM", nstart=100, iter.max=1000)   # K-Means
kmdd <- vegclust(x=comp, mobileCenters=3, method="KMdd", nstart=100, iter.max=1000) # K-Medoids

When plotted, both K-Means and K-Medoids recover the 3 obvious clusters.

# Turn the 0/1 membership matrix into a cluster index (1, 2 or 3) so each cluster gets its own color
cluster_color <- function(fit) fit$memb[,1] + fit$memb[,2]*2 + fit$memb[,3]*3

# K-Means scatterplot
ggplot(comp, aes(x, y)) + geom_point(alpha=.5, color=cluster_color(km), pch = 16, size=3)

# K-Medoids scatterplot
ggplot(comp, aes(x, y)) + geom_point(alpha=.5, color=cluster_color(kmdd), size=3, pch = 16)

See Figure 2 in the link.

Now I add an outlier:

comp[21,1] <- 3
comp[21,2] <- 7.5

This outlier shifts the center of the blue cluster to the left of the graph.
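One way to check whether the behaviour is specific to vegclust is to rerun the comparison with different implementations. The sketch below swaps in `kmeans()` from base R and `pam()` from the `cluster` package; these are not the functions used above, and the seed and object names (`km_fit`, `pam_fit`) are arbitrary choices for this illustration.

```r
library(cluster)  # provides pam(); a recommended package in standard R installs

# The demo data with the outlier (3, 7.5) appended as the 21st point
x <- c(2, 3, 2.4, 1.9, 1.6, 2.3, 1.8, 5, 6, 5, 5.8, 6.1, 5.5,
       7.2, 7.5, 8, 7.2, 7.8, 7.3, 6.4, 3)
y <- c(3, 2, 3.1, 2.6, 2.7, 2.9, 2.5, 7, 7, 6.5, 6.4, 6.9, 6.5,
       7.5, 7.25, 7, 7.8, 7.5, 8.1, 7, 7.5)
comp <- data.frame(x, y)

set.seed(1)  # kmeans() uses random starts; the seed value is arbitrary
km_fit  <- kmeans(comp, centers = 3, nstart = 100, iter.max = 1000)
pam_fit <- pam(comp, k = 3, metric = "manhattan")

km_fit$centers    # cluster means, not necessarily data points
pam_fit$medoids   # medoids, always rows of comp

# Cross-tabulate the two partitions (cluster labels are arbitrary in both)
table(kmeans = km_fit$cluster, pam = pam_fit$clustering)
```

If the two partitions differ in the same way as the vegclust results, the effect comes from the methods rather than from one package; if not, the package settings are worth a closer look.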

K- .

See Figure 3 in the link.

, K- ( ) ( , ), K-Medoids .

See Figure 4 in the link.

, K-Means , K-Medoids ( , ..). - , ?


Source: https://habr.com/ru/post/1616996/

