I have just started learning about random forests, so I apologize if this sounds stupid.
I recently worked through the bag-of-words tutorial on Kaggle, and I want to clarify a few things about using vectorizer.fit_transform("*the list of cleaned reviews*").
When we prepared the bag-of-words array for the train reviews, we called fit_transform on the list of train reviews. As I understand it, fit_transform does two things: first it fits on the data and learns the vocabulary, and then it builds a vector for each review.
So when we then call vectorizer.transform("list of cleaned test reviews"), it simply converts the list of test reviews into a vector for each review, using the vocabulary it already learned.
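To make sure I am describing the workflow correctly, here is a minimal sketch of it. I am assuming the vectorizer is scikit-learn's CountVectorizer, and the two review lists are tiny made-up stand-ins, not the tutorial's data.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up stand-ins for the cleaned review lists (assumption, not the real data).
clean_train_reviews = ["great movie great cast", "terrible plot bad acting"]
clean_test_reviews = ["great plot", "bad movie boring"]

# Assumption: the tutorial's vectorizer is a CountVectorizer.
vectorizer = CountVectorizer()

# fit_transform: learn the vocabulary from the train reviews,
# then count those words in each train review.
train_data_features = vectorizer.fit_transform(clean_train_reviews)

# transform: reuse the vocabulary learned from the train reviews.
# Words that never appeared in training (here "boring") are ignored.
test_data_features = vectorizer.transform(clean_test_reviews)

vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
print(train_data_features.shape, test_data_features.shape)
```

Both arrays end up with one column per word of the train vocabulary, so the test reviews are described only in terms of words seen in the train reviews.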
My question is: why not use fit_transform on the test list too? The documentation says this leads to overfitting, but it still seems to make sense to me to use it anyway. Let me explain my perspective:
When we do not use fit_transform on the test set, we are essentially saying: build the feature vectors for the test reviews using the most frequent words of the train reviews. Why not build the test feature array using the most frequent words in the test set?
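To make the proposal concrete, this is roughly what I have in mind. Again, CountVectorizer and the made-up review lists are my assumptions for illustration, and the second vectorizer is hypothetical, not something the tutorial does.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up stand-ins for the cleaned review lists (assumption, not the real data).
clean_train_reviews = ["great movie great cast", "terrible plot bad acting"]
clean_test_reviews = ["great plot", "bad movie boring"]

# What the tutorial does for the train reviews.
train_vectorizer = CountVectorizer()
train_features = train_vectorizer.fit_transform(clean_train_reviews)

# My proposal: fit a separate vocabulary on the test reviews.
test_vectorizer = CountVectorizer()
test_features = test_vectorizer.fit_transform(clean_test_reviews)

train_vocab = sorted(train_vectorizer.vocabulary_)
test_vocab = sorted(test_vectorizer.vocabulary_)
print(train_vocab)
print(test_vocab)
print(train_features.shape, test_features.shape)
```

In this toy case the two arrays come out with different columns (different words and a different number of them), so I am not sure how a model trained on the first array would read the second one.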