Can I use an autoencoder for clustering?

In the code below, an autoencoder is used for supervised clustering/classification, because the data has labels: http://amunategui.imtqy.com/anomaly-detection-h2o/ But can I use an autoencoder to cluster data if I do not have labels?

2 answers

A deep-learning autoencoder is always unsupervised learning. The "supervised" part of the article you link to is only there to evaluate how well it did.

The following example (taken from chapter 7 of my book, Practical Machine Learning with H2O, where I try all the unsupervised H2O algorithms on the same data set, so please excuse the plug) takes 563 features and tries to encode them into just two hidden nodes.

library(h2o)                                # assumes h2o.init() has been run
m <- h2o.deeplearning(2:564,                # the 563 feature columns
                      training_frame = tfidf,
                      hidden = c(2),        # a single hidden layer of two nodes
                      autoencoder = TRUE,
                      activation = "Tanh")
f <- h2o.deepfeatures(m, tfidf, layer = 1)  # hidden-layer values, one row per input row

The second command extracts the hidden-node values. f is a data frame with two numeric columns, and one row for every row in the source tfidf data. I chose just two hidden nodes so that I could plot the clusters:

[Plot: simplest autoencoder output, each row placed by its two hidden-node values]

Results will vary on each run. You can (maybe) get better results with stacked autoencoders, or by using more hidden nodes (but then you cannot plot them). Here I felt the results were limited by the data.
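If you want explicit cluster assignments rather than a plot, one option (a minimal sketch of mine, not part of the original answer) is to run H2O's k-means on the two deep features in f:

km <- h2o.kmeans(training_frame = f,
                 x = 1:2,              # cluster on both deep-feature columns
                 k = 2)                # assumed number of clusters
clusters <- h2o.predict(km, f)         # one cluster id per row of the original data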

By the way, I made the above chart using this code:

d <- as.matrix(f[1:30, ])            # just the first 30, to avoid over-cluttering
labels <- as.vector(tfidf[1:30, 1])
plot(d, pch = 17)                    # pch = 17 is a triangle
text(d, labels, pos = 3)             # pos = 3 means above

(P.S. The source data came from Brandon Rose's excellent article on using NLTK.)


In some respects, encoding data and clustering data share overlapping theory. As a result, you can use autoencoders to cluster (encode) data.

A simple example to visualize is a training data set that you suspect contains two primary classes, such as voter history data for Republicans and Democrats. If you take an autoencoder, encode the data into two dimensions, and then plot it on a scatter plot, the clustering becomes much clearer. Below is a sample result from one of my models. You can see a noticeable split between the two classes, as well as a bit of expected overlap.

[Plot: voter data clustered in two encoded dimensions]

Code can be found here.
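As a rough illustration of the idea (a sketch of mine in H2O, not the linked code, assuming a hypothetical H2OFrame voters whose first column, "party", holds the labels and whose remaining columns hold the voting-history features):

voters_ae <- h2o.deeplearning(2:ncol(voters),         # train only on the features
                              training_frame = voters,
                              hidden = c(2),          # encode into two dimensions
                              autoencoder = TRUE,
                              activation = "Tanh")
enc <- h2o.deepfeatures(voters_ae, voters, layer = 1) # two encoded columns
d <- as.matrix(enc)
party <- as.vector(voters[, 1])
plot(d, col = ifelse(party == "Republican", "red", "blue"))  # color by known class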

This method is not limited to two binary classes; you can train on as many different classes as you wish. Two polarized classes are simply easier to visualize.

This method is also not limited to two output dimensions; that was just for plotting convenience. In fact, you may find it difficult to meaningfully map certain high-dimensional spaces to such a small space.

In cases where the encoded (clustered) layer is larger in dimension, it is not as straightforward to "visualize" feature clusters. It gets a bit more difficult here, as you will have to use some form of supervised learning to map the encoded (clustered) features to your training labels.

A couple of ways to determine which classes the features belong to are to pump the data into a k-nearest-neighbours algorithm, or, what I prefer to do, to take the encoded vectors and pass them to a standard back-propagation neural network. Note that, depending on your data, you may find that simply feeding the data straight into your back-propagation neural network is sufficient.
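For instance (again a sketch of mine, not the answerer's code, reusing the hypothetical voters frame and enc encoding from the sketch above), the encoded vectors can be passed, together with the known labels, to a supervised H2O deep-learning model:

enc <- h2o.cbind(enc, voters[, 1])      # attach the known labels (assumed column name "party")
clf <- h2o.deeplearning(x = 1:2,        # the two encoded features
                        y = "party",
                        training_frame = enc,
                        hidden = c(10, 10))   # a small supervised back-propagation net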

