I have a dataset with known labels. I want to try clustering and see whether I get the same clusters as the known labels. To measure accuracy, I need something like a confusion matrix.
I know that I can easily get a confusion matrix for the test set of a classification problem, and I have already tried that.
However, it cannot be used for clustering, because it expects the rows and the columns to share the same set of labels, which makes sense for a classification problem. For a clustering problem, what I expect is something like this:
Rows = actual labels
Columns = new cluster names (i.e. cluster-1, cluster-2, etc.)
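To make the target concrete, here is a hand-rolled sketch of the table I mean, built with the standard library only. The data is made up, and the helper name `contingency_table` is hypothetical, not part of scikit-learn.

```python
from collections import Counter


def contingency_table(y_true, y_pred):
    """Count how often each true label ends up in each cluster.

    Rows follow the sorted true labels, columns the sorted cluster ids.
    The two label sets do not need to share a type or any values.
    """
    counts = Counter(zip(y_true, y_pred))
    row_labels = sorted(set(y_true))
    col_labels = sorted(set(y_pred))
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


# Made-up example: string labels vs. numeric cluster ids (as KMeans returns).
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = [0, 0, 1, 1, 1, 2]

rows, cols, table = contingency_table(y_true, y_pred)

# Print with the true labels as rows and the cluster names as columns.
print("true label", *[f"cluster-{c}" for c in cols], sep="\t")
for label, row in zip(rows, table):
    print(label, *row, sep="\t")
```

Unlike a confusion matrix, this table does not have to be square, and the cluster numbering is arbitrary, so the diagonal carries no special meaning.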
Is there any way to do this?
Edit: more details.
sklearn.metrics.confusion_matrix expects y_test and y_pred to contain the same set of values, with labels being the labels for those values.
That is why it returns a matrix with the same labels on both the rows and the columns.

But in my case (KMeans clustering), the real values are strings and the predicted values are numbers (i.e. the cluster number).
Therefore, if I call confusion_matrix(y_true, y_pred), it raises the error below.
ValueError: Mix of label input types (string and number)
This is the real problem. For a classification problem, the restriction makes sense. But for a clustering task it should not apply, because the real label names and the new cluster names do not have to be the same.
From this, I understand that I am trying to use a tool meant for classification tasks on a clustering problem. So my question is: is there a way to get such a matrix for clustered data?
I hope this makes the issue clearer. Please let me know if it does not.