Well ... I have to say that document classification is different from what you guys are thinking about.
As a rule, when classifying documents after preprocessing, the data to classify is extremely large, for example, O(N^2) ... so it can be too computationally expensive.
Another typical classifier that comes to my mind is the discriminative classifier, which does not need a generative model of your dataset. After training, all you need to do is feed a single record into the algorithm, and it will be classified.
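To make the "train once, then feed in a single record" idea concrete, here is a minimal sketch of one discriminative classifier, a perceptron over bag-of-words counts. The toy documents, the labels (+1 / -1), and the function names `featurize`, `train_perceptron`, and `classify` are all made up for illustration; a real document classifier would need proper preprocessing and much more data.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words features: lowercase token -> count."""
    return Counter(text.lower().split())


def train_perceptron(docs, labels, epochs=10):
    """Learn one weight per word plus a bias; labels are +1 or -1."""
    weights = Counter()  # missing words read as weight 0
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(docs, labels):
            x = featurize(text)
            score = bias + sum(weights[w] * c for w, c in x.items())
            if y * score <= 0:
                # Misclassified (or on the boundary): shift the
                # decision boundary toward this example.
                for w, c in x.items():
                    weights[w] += y * c
                bias += y
    return weights, bias


def classify(text, weights, bias):
    """Classify a single record with the trained weights."""
    x = featurize(text)
    score = bias + sum(weights[w] * c for w, c in x.items())
    return 1 if score > 0 else -1


# Hypothetical toy data: +1 = spam-like, -1 = work-like.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes attached",
]
labels = [1, 1, -1, -1]

weights, bias = train_perceptron(docs, labels)
print(classify("buy cheap pills", weights, bias))         # prints 1
print(classify("monday project meeting", weights, bias))  # prints -1
```

Note that nothing here models how the documents were generated. The perceptron only learns a boundary between the two classes, which is the point of going discriminative.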
Good luck with that. For example, you can check E. Alpaydin's book "Introduction to Machine Learning."