How do you train scikit-learn's LinearSVC on a dataset that is too large or impractical to fit into memory? I am trying to use it to classify documents, and I have several thousand labelled example records, but when I load all of this text into memory and train LinearSVC, it consumes more than 65% of my RAM and I have to kill the process before my system becomes unresponsive. Roughly, my workflow looks like the sketch below.
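A minimal sketch of the in-memory approach I am using; `documents` and `labels` are placeholders standing in for the several thousand labelled records read from disk:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

documents = ["example record one ...", "example record two ..."]  # placeholder texts
labels = [0, 1]                                                    # placeholder classes

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(documents)  # entire corpus vectorized in memory
clf = LinearSVC()
clf.fit(X, labels)                       # memory usage peaks here
```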
Is it possible to format my training data as a single file on disk and pass the file name to LinearSVC, instead of building the whole matrix in memory and calling the fit() method on it?
I found this guide, but it only covers classification and assumes the training is incremental, which LinearSVC does not support.
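For context, if that guide is the scikit-learn out-of-core classification example, the incremental pattern it describes looks roughly like the sketch below (a stateless HashingVectorizer plus an estimator with partial_fit, such as SGDClassifier with hinge loss); LinearSVC exposes no equivalent. The mini-batches here are placeholders; in practice they would be streamed from disk:

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**18)   # stateless, no fitting needed
clf = SGDClassifier(loss="hinge")                  # hinge loss ~ linear SVM

all_classes = [0, 1]                               # must be known up front
batches = [(["some text", "more text"], [0, 1]),   # placeholder mini-batches
           (["other text", "again"],    [1, 0])]

for texts, labels in batches:
    X = vectorizer.transform(texts)                # only one batch held in memory
    clf.partial_fit(X, labels, classes=all_classes)
```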