Identifying clusters in a SOM (Self-Organizing Map)

Once I have collected and sorted the data in the SOM, how can I identify the clusters?

(Elements are aggregated and grouped using many attributes - more than 10)

In particular, I want to find the "center" of each cluster, i.e. the node(s) that sit at its center.

3 answers

You can use a relatively small map and treat each node as a cluster, but this is far from optimal. If you want an automatic cluster detection method, you should definitely read

Clustering of the Self-Organizing Map

and look for related literature.
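
The simple option mentioned at the start of this answer (a small map where every node is its own cluster) can be sketched as follows. This is only a minimal illustration: the `codebook` dictionary (grid position mapped to a trained weight vector) and the function names are assumptions, not the API of any particular SOM library. Each sample is assigned to its best-matching unit, and that node's weight vector serves as the cluster center.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(sample, codebook):
    """Return the grid position of the node whose weight vector is closest."""
    return min(codebook, key=lambda node: squared_distance(sample, codebook[node]))


def clusters_by_node(samples, codebook):
    """Group sample indices by their best-matching unit (one cluster per node)."""
    clusters = defaultdict(list)
    for index, sample in enumerate(samples):
        clusters[best_matching_unit(sample, codebook)].append(index)
    return dict(clusters)


# Hypothetical, already-trained 1x2 map with 2-D weight vectors.
codebook = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 1.0]}
samples = [[0.1, 0.0], [0.9, 1.2], [0.2, 0.1]]
node_clusters = clusters_by_node(samples, codebook)
print(node_clusters)
```

With this approach the number of clusters is fixed by the map size, which is why it only makes sense for small maps.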

You can also use more sophisticated variants of the SOM algorithm (hierarchical, growing, etc.).

In any case, keep in mind that the problem of finding the "right" number of clusters has no definitive solution.


As far as I can tell, a SOM is basically a method for dimensionality reduction and data compression. As such, it will not cluster the data for you; it may actually tend to spread clusters out in the projection (i.e., split them across several cells).

However, for some datasets this may work well:

  • Instead of processing the complete data set, work only on the SOM nodes (weighted by the number of elements assigned to each), of which there should be significantly fewer
  • Instead of working in the original space, work in the lower-dimensional space that the SOM represents

Then run a regular clustering algorithm on the transformed data.
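
As a rough sketch of the first bullet above, the node weight vectors can be clustered with each node weighted by its hit count. Using k-means here is an assumption (the answer does not name an algorithm), as are the function names, the deterministic farthest-point initialization, and the toy data. The second function addresses the "center" part of the question by picking, for each cluster, the member node closest to the weighted centroid.

```python
import math


def weighted_kmeans(vectors, weights, k, iterations=50):
    """Lloyd-style k-means in which vector i counts weights[i] times."""
    # Assumption: farthest-point initialization, chosen only to be deterministic.
    centers = [list(vectors[0])]
    while len(centers) < k:
        farthest = max(vectors, key=lambda v: min(math.dist(v, c) for c in centers))
        centers.append(list(farthest))

    dim = len(vectors[0])
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: math.dist(v, centers[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total = sum(weights[i] for i in members)
            if total == 0:
                continue  # no hits in this cluster: keep the previous center
            centers[j] = [sum(weights[i] * vectors[i][d] for i in members) / total
                          for d in range(dim)]
    return labels, centers


def center_nodes(vectors, labels, centers):
    """For each cluster, the index of the member node nearest its centroid."""
    result = {}
    for j, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            result[j] = min(members, key=lambda i: math.dist(vectors[i], center))
    return result


# Hypothetical SOM: five node weight vectors and the number of samples per node.
node_vectors = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0]]
node_hits = [5, 1, 1, 4, 2]
labels, centers = weighted_kmeans(node_vectors, node_hits, k=2)
print(labels, center_nodes(node_vectors, labels, centers))
```

The number of clusters `k` still has to be chosen by hand in this sketch. For the second bullet, the same functions could be run on the nodes' grid coordinates instead of their weight vectors.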


Although this is an old question, I ran into the same problem and had some success implementing "Estimation of the number of clusters in multidimensional data using self-organizing maps", so I thought I would share it.

The linked algorithm uses a U-matrix to highlight the boundaries between individual clusters and then applies an image-processing algorithm called watershed to identify the components. For this to work correctly, the regions in the U-matrix must be concave within the resolution of your quantization (in which case, once the U-matrix is converted to a binary image, identifying the regions simply comes down to a flood fill).
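
The simplified case described in the parenthesis (threshold the U-matrix into a binary image, then flood-fill the regions) might look roughly like the sketch below. It is not the watershed procedure from the linked paper. The rectangular grid with 4-connected neighbors, the per-node mean-distance U-matrix, the function names, and the hand-picked `threshold` are all assumptions made for illustration.

```python
import math
from collections import deque


def grid_neighbors(r, c, rows, cols):
    """Yield the 4-connected neighbors of cell (r, c) that lie inside the grid."""
    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def u_matrix(weights):
    """Mean distance from each node's weight vector to its grid neighbors."""
    rows, cols = len(weights), len(weights[0])
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = [math.dist(weights[r][c], weights[nr][nc])
                     for nr, nc in grid_neighbors(r, c, rows, cols)]
            u[r][c] = sum(dists) / len(dists) if dists else 0.0
    return u


def label_regions(u, threshold):
    """Flood-fill connected cells with u <= threshold; boundary cells get -1."""
    rows, cols = len(u), len(u[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if u[r][c] > threshold or labels[r][c] != -1:
                continue  # boundary cell, or already part of a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in grid_neighbors(cr, cc, rows, cols):
                    if labels[nr][nc] == -1 and u[nr][nc] <= threshold:
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels


# Hypothetical 2x4 map with 1-D weights: two left columns near 0, two right near 10.
som_weights = [[[0.0], [0.1], [10.0], [10.1]],
               [[0.0], [0.1], [10.0], [10.1]]]
regions = label_regions(u_matrix(som_weights), threshold=1.0)
print(regions)
```

Nodes labeled -1 sit on a cluster boundary. In this sketch they are left unassigned, although they could be attached to the nearest labeled region afterwards.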


Source: https://habr.com/ru/post/1442141/

