Text classification using autoencoders

As I understand it, autoencoders can be used to reduce the dimensionality of feature vectors. In text classification, the feature vector is usually built from a dictionary (vocabulary), which tends to be extremely large. I have no experience with autoencoders, so my questions are:

  • Can autoencoders be used to reduce dimensionality in text classification? (Why or why not?)
  • Has anyone already done this? A source would be good, if so.
1 answer

Existing work uses autoencoders to build sentence-level models. Basically, after training an autoencoder, you can obtain a vector for each sentence. Since every document consists of sentences, you get a set of vectors per document and can classify documents from those. In my experience, though, switching to a different vector representation (for example, one generated by an autoencoder) may give worse results than classification with a plain bag of words.
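
To make the idea concrete, here is a minimal sketch of the dimensionality-reduction step, not taken from any of the existing work mentioned above. It assumes a toy corpus, whitespace tokenization, a single tanh hidden layer, and plain SGD written in pure Python; the names `bag_of_words` and `TinyAutoencoder` are made up for illustration. In practice you would more likely use a library such as PyTorch or Keras and then feed the codes to an ordinary classifier.

```python
import math
import random


def bag_of_words(docs):
    """Map each document to a term-count vector over the corpus vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w in d.lower().split():
            v[index[w]] += 1.0
        vectors.append(v)
    return vocab, vectors


class TinyAutoencoder:
    """One hidden layer: code = tanh(W1 x + b1), reconstruction = W2 code + b2."""

    def __init__(self, n_in, n_code, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_code)]
        self.b1 = [0.0] * n_code
        self.W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_code)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sum(w * hi for w, hi in zip(row, h)) + b
                for row, b in zip(self.W2, self.b2)]

    def loss(self, data):
        """Mean squared reconstruction error per document."""
        total = 0.0
        for x in data:
            y = self.decode(self.encode(x))
            total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
        return total / len(data)

    def train(self, data, epochs=500, lr=0.01):
        for _ in range(epochs):
            for x in data:
                h = self.encode(x)
                y = self.decode(h)
                # Gradient of 0.5 * squared error with respect to the output.
                err = [yi - xi for yi, xi in zip(y, x)]
                # Backpropagate through the decoder and the tanh nonlinearity.
                dh = [sum(self.W2[i][j] * err[i] for i in range(len(err))) * (1 - h[j] ** 2)
                      for j in range(len(h))]
                for i, e in enumerate(err):
                    for j, hj in enumerate(h):
                        self.W2[i][j] -= lr * e * hj
                    self.b2[i] -= lr * e
                for j, d in enumerate(dh):
                    for k, xk in enumerate(x):
                        self.W1[j][k] -= lr * d * xk
                    self.b1[j] -= lr * d


# Toy corpus (an assumption for illustration only).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stocks fell on the market",
    "the market rallied as stocks rose",
]
vocab, X = bag_of_words(docs)

ae = TinyAutoencoder(n_in=len(vocab), n_code=2)
loss_before = ae.loss(X)
ae.train(X)
loss_after = ae.loss(X)

# Low-dimensional codes that would replace the bag-of-words features.
codes = [ae.encode(x) for x in X]
print(len(vocab), "->", len(codes[0]), "dimensions")
print("reconstruction loss:", loss_before, "->", loss_after)
```

The `codes` list is what you would hand to a classifier in place of the full vocabulary-sized vectors. Whether that beats a plain bag of words is an empirical question, so it is worth comparing both on your own data.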


Source: https://habr.com/ru/post/1544131/

