In some respects, encoding and clustering share underlying theory. As a result, you can use autoencoders to cluster (encode) data.
A simple example to visualize: suppose you have a training set that you suspect contains two primary classes, such as voter-history data for Republicans and Democrats. If you take an autoencoder, encode the data into two dimensions, and then plot it on a scatter chart, the clustering becomes much clearer. Below is an example from one of my models. You can see a noticeable separation between the two classes, as well as a bit of expected overlap.
Code can be found here.
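Since the original code link isn't reproduced here, the following is a minimal sketch of the idea, assuming Keras and synthetic two-class data (all names and the toy dataset are illustrative, not the author's actual model):

```python
# Sketch: autoencoder with a 2-D bottleneck, then scatter-plot the encodings.
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-in data: two classes of 20-dimensional feature vectors.
rng = np.random.default_rng(0)
x_a = rng.normal(loc=-1.0, scale=1.0, size=(500, 20))
x_b = rng.normal(loc=1.0, scale=1.0, size=(500, 20))
x = np.vstack([x_a, x_b]).astype("float32")
labels = np.array([0] * 500 + [1] * 500)

# Autoencoder with a 2-dimensional bottleneck chosen for plotting.
inputs = keras.Input(shape=(20,))
encoded = layers.Dense(8, activation="relu")(inputs)
encoded = layers.Dense(2, activation="linear", name="bottleneck")(encoded)
decoded = layers.Dense(8, activation="relu")(encoded)
outputs = layers.Dense(20, activation="linear")(decoded)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x, x, epochs=20, batch_size=32, verbose=0)

# Encode to 2-D and scatter-plot; the two classes should separate visibly.
encoder = keras.Model(inputs, autoencoder.get_layer("bottleneck").output)
z = encoder.predict(x, verbose=0)
plt.scatter(z[:, 0], z[:, 1], c=labels, cmap="coolwarm", s=8)
plt.xlabel("encoded dim 1")
plt.ylabel("encoded dim 2")
plt.show()
```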
This method is not restricted to two classes; you can train with as many classes as you wish. Two polarized classes are simply easier to visualize.
Nor is the method limited to two encoded dimensions; those were chosen only for plotting convenience. In fact, you may find it difficult to map certain high-dimensional feature spaces into such a small space.
In cases where the encoded (clustering) layer has more dimensions, it is not as easy to "visualize" the feature clusters. This is where it gets a bit harder, as you will have to use some form of supervised learning to map the encoded (clustered) features to your training labels.
A couple of ways to determine which classes the features correspond to: feed the data into a knn algorithm, or, what I prefer to do, take the encoded vectors and pass them to a standard back-propagation neural network, as sketched below. Note that depending on your data, you may find that simply feeding the data straight into your back-propagation neural network is sufficient.
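Here is a sketch of both options, reusing `z`, `labels`, `keras`, and `layers` from the example above (reading "knn" as scikit-learn's `KNeighborsClassifier`, which is one reasonable interpretation, and the back-propagation network as a small dense classifier):

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hold out a test split of the encoded vectors and their labels.
z_train, z_test, y_train, y_test = train_test_split(
    z, labels, test_size=0.2, random_state=0
)

# Option 1: nearest-neighbour classification of the encoded features.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(z_train, y_train)
print("knn accuracy:", knn.score(z_test, y_test))

# Option 2: a standard back-propagation (dense) network on the same features.
clf = keras.Sequential([
    keras.Input(shape=(z.shape[1],)),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
clf.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
clf.fit(z_train, y_train, epochs=20, batch_size=32, verbose=0)
print("dense-net accuracy:", clf.evaluate(z_test, y_test, verbose=0)[1])
```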