K-means does not minimize distances.
It minimizes the sum of squared errors, which is a very different thing. The difference is roughly that between the mean and the median in one-dimensional data, and the error can be massive.
Here is a counterexample. Assume we have the coordinates:
(-1, 0), (+1, 0), (0, -1), (0, 101)
The center chosen by k-means will be (0, 25). The optimal location is (0, 0). The sum of distances to the k-means center is > 152, while the optimal location has a total distance of 104. So the centroid is almost 50% worse than the optimum! But the centroid (= multivariate mean) is what k-means uses!
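The arithmetic is easy to check. The snippet below is just a quick sanity check using the standard library (Python 3.8+ for `math.dist`); the helper name `total_distance` is made up for illustration.

```python
import math

# The four points of the counterexample.
points = [(-1, 0), (1, 0), (0, -1), (0, 101)]

def total_distance(center, pts):
    """Sum of Euclidean distances from `center` to every point."""
    return sum(math.dist(center, p) for p in pts)

# Centroid = coordinate-wise mean, i.e. what k-means uses as the cluster center.
centroid = (
    sum(x for x, _ in points) / len(points),
    sum(y for _, y in points) / len(points),
)

centroid_cost = total_distance(centroid, points)  # a bit over 152
origin_cost = total_distance((0, 0), points)      # 1 + 1 + 1 + 101 = 104

print("centroid:", centroid)
print("sum of distances at centroid:", centroid_cost)
print("sum of distances at (0, 0):  ", origin_cost)
print("ratio:", centroid_cost / origin_cost)
```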
k-means does not minimize Euclidean distance!
This is one variant of how "k-means is sensitive to outliers."
It does not get any better if you try to constrain it to place "centers" only on the coast ...
In addition, you should at least use the Haversine distance, because in California 1 degree north != 1 degree east, since it is not at the equator.
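To see how much this matters, here is a minimal Haversine sketch. The function name `haversine_km`, the mean Earth radius of 6371 km, and the sample point at 37° N, 120° W (somewhere in central California) are my own assumptions for illustration, not something from the question.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical sample point, roughly in central California.
lat, lon = 37.0, -120.0
north = haversine_km(lat, lon, lat + 1, lon)  # one degree of latitude
east = haversine_km(lat, lon, lat, lon + 1)   # one degree of longitude

print("1 degree north:", north, "km")
print("1 degree east: ", east, "km")
```

At that latitude one degree north comes out at about 111 km and one degree east at about 89 km, so treating raw degrees as Euclidean coordinates distorts east-west distances by roughly 20%.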
In addition, you probably should not assume that each location requires its own pipe; rather, they will be connected like a tree. This significantly reduces the cost.
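As a rough illustration of the tree argument, the sketch below compares "one pipe per location" against a minimum spanning tree built with Prim's algorithm. The source and site coordinates are invented, and a minimum spanning tree is only one simple way to model a tree-shaped network (a Steiner tree with extra junction points could be shorter still).

```python
import math

def mst_length(nodes):
    """Total edge length of a minimum spanning tree over `nodes` (Prim's algorithm)."""
    in_tree = {0}
    # best[i] = shortest distance from node i to any node already in the tree
    best = [math.dist(nodes[0], n) for n in nodes]
    total = 0.0
    while len(in_tree) < len(nodes):
        # Pick the cheapest node that is not connected yet.
        nxt = min((i for i in range(len(nodes)) if i not in in_tree),
                  key=lambda i: best[i])
        total += best[nxt]
        in_tree.add(nxt)
        # The new node may offer a shorter connection to the remaining nodes.
        for i in range(len(nodes)):
            if i not in in_tree:
                best[i] = min(best[i], math.dist(nodes[nxt], nodes[i]))
    return total

# Hypothetical layout: one source and four sites (arbitrary planar units).
source = (0, 0)
sites = [(10, 0), (20, 0), (30, 0), (30, 10)]

star_cost = sum(math.dist(source, s) for s in sites)  # separate pipe to each site
tree_cost = mst_length([source] + sites)              # sites chained like a tree

print("separate pipes:", star_cost)
print("tree network:  ", tree_cost)
```

In this made-up layout the separate pipes need more than twice the total length of the tree.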
I highly recommend treating this as a general optimization problem, not as k-means. K-means is also an optimization, but it may optimize the wrong function for your problem ...
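For example, if the objective really is the sum of Euclidean distances to a single center, you can minimize that directly. The sketch below uses Weiszfeld's iteration for the geometric median; this is a technique I am swapping in for illustration, it ignores the coast constraint and the tree structure, and the function name and iteration count are arbitrary choices.

```python
import math

def geometric_median(points, iterations=200, eps=1e-12):
    """Approximate the point minimizing the sum of Euclidean distances (Weiszfeld)."""
    # Start from the centroid, i.e. the k-means answer.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        wsum = xsum = ysum = 0.0
        for px, py in points:
            d = math.dist((x, y), (px, py))
            if d < eps:
                # Simplification: stop if the estimate lands on a data point,
                # to avoid dividing by zero.
                return (x, y)
            w = 1.0 / d  # nearby points get more weight
            wsum += w
            xsum += w * px
            ysum += w * py
        x, y = xsum / wsum, ysum / wsum
    return (x, y)

# Same four points as in the counterexample.
points = [(-1, 0), (1, 0), (0, -1), (0, 101)]
median = geometric_median(points)
median_cost = sum(math.dist(median, p) for p in points)

print("geometric median:", median)
print("sum of distances:", median_cost)
```

Starting from the centroid (0, 25), the iteration moves toward (0, 0), where the sum of distances is 104.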