As already noted, for SVM classifiers (with y == np.int* ) preprocessing is mandatory; otherwise the ML-estimator's prediction ability gets lost straight away, under the influence of skewed (unscaled) features on the decision function.
Regarding the processing time:
- try to better understand what your AI / ML model's Overfit / Generalisation [C,gamma] landscape looks like: try adding verbosity into the initial setup of the AI / ML process
- try adding n_jobs into the number crunching
- try adding grid computing into your computing approach, if the scale requires it
aGrid = aML_GS.GridSearchCV( aClassifierOBJECT, param_grid = aGrid_of_parameters, cv = cv, n_jobs = n_JobsOnMultiCpuCores, verbose = 5 )
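For concreteness, here is a runnable sketch of that call. The dataset (the iris toy set) and the small [C,gamma] grid are illustrative assumptions only, not the ones from the original problem:

```python
# A runnable sketch of the GridSearchCV call above -- the dataset and the
# (deliberately tiny) parameter grid are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

aGrid_of_parameters = {"C":     [1.0, 10.0, 100.0],
                       "gamma": [0.01, 0.1, 1.0]}

aGrid = GridSearchCV(SVC(kernel="rbf"),               # aClassifierOBJECT
                     param_grid=aGrid_of_parameters,
                     cv=3,                            # cv folds
                     n_jobs=-1,                       # use all CPU cores
                     verbose=1)                       # progress reporting
aGrid.fit(X, y)
print(aGrid.best_params_)
```

The `verbose` level controls how many per-fit progress lines (like the log below) get printed, while `n_jobs=-1` spreads the individual fits across all available cores.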
Sometimes GridSearchCV() can indeed consume a huge amount of CPU time / CPU pool of resources, even after applying all of the above tips.
So, keep calm and do not panic, if you are sure that the Feature-Engineering, data-sanity, and FeatureDOMAIN preprocessing was done correctly.
[GridSearchCV] ................ C=16777216.0, gamma=0.5, score=0.761619 -62.7min
[GridSearchCV] C=16777216.0, gamma=0.5 .........................................
[GridSearchCV] ................ C=16777216.0, gamma=0.5, score=0.792793 -64.4min
[GridSearchCV] C=16777216.0, gamma=1.0 .........................................
[GridSearchCV] ............... C=16777216.0, gamma=1.0, score=0.793103 -116.4min
[GridSearchCV] C=16777216.0, gamma=1.0 .........................................
[GridSearchCV] ............... C=16777216.0, gamma=1.0, score=0.794603 -205.4min
[GridSearchCV] C=16777216.0, gamma=1.0 .........................................
[GridSearchCV] ............... C=16777216.0, gamma=1.0, score=0.771772 -200.9min
[GridSearchCV] C=16777216.0, gamma=2.0 .........................................
[GridSearchCV] ............... C=16777216.0, gamma=2.0, score=0.713643 -446.0min
[GridSearchCV] C=16777216.0, gamma=2.0 .........................................
[GridSearchCV] ............... C=16777216.0, gamma=2.0, score=0.743628 -184.6min
[GridSearchCV] C=16777216.0, gamma=2.0 .........................................
[GridSearchCV] ............... C=16777216.0, gamma=2.0, score=0.761261 -281.2min
[GridSearchCV] C=16777216.0, gamma=4.0 .........................................
[GridSearchCV] ............... C=16777216.0, gamma=4.0, score=0.670165 -138.7min
[GridSearchCV] C=16777216.0, gamma=4.0 .........................................
[GridSearchCV] ................ C=16777216.0, gamma=4.0, score=0.760120 -97.3min
[GridSearchCV] C=16777216.0, gamma=4.0 .........................................
[GridSearchCV] ................ C=16777216.0, gamma=4.0, score=0.732733 -66.3min
[GridSearchCV] C=16777216.0, gamma=8.0 .........................................
[GridSearchCV] ................ C=16777216.0, gamma=8.0, score=0.755622 -13.6min
[GridSearchCV] C=16777216.0, gamma=8.0 .........................................
[GridSearchCV] ................ C=16777216.0, gamma=8.0, score=0.772114 - 4.6min
[GridSearchCV] C=16777216.0, gamma=8.0 .........................................
[GridSearchCV] ................ C=16777216.0, gamma=8.0, score=0.717718 -14.7min
[GridSearchCV] C=16777216.0, gamma=16.0 ........................................
[GridSearchCV] ............... C=16777216.0, gamma=16.0, score=0.763118 - 1.3min
[GridSearchCV] C=16777216.0, gamma=16.0 ........................................
[GridSearchCV] ............... C=16777216.0, gamma=16.0, score=0.746627 - 25.4s
[GridSearchCV] C=16777216.0, gamma=16.0 ........................................
[GridSearchCV] ............... C=16777216.0, gamma=16.0, score=0.738739 - 44.9s
[Parallel(n_jobs=1)]: Done 2700 out of 2700 | elapsed: 5670.8min finished
As already mentioned about the "... regular svm.SVC().fit ", kindly notice that it uses the default [C,gamma] values and is therefore of no relevance to the behaviour of your model / ProblemDOMAIN.
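It takes one line to see what those defaults actually are (values shown are for a recent scikit-learn, >= 0.22; older versions defaulted gamma to 'auto'):

```python
# A quick look at the "default values [C, gamma]" that a plain
# svm.SVC() uses -- defaults as of scikit-learn >= 0.22.
from sklearn.svm import SVC

clf = SVC()                  # a "regular" svm.SVC(), no tuning at all
print(clf.C, clf.gamma)      # 1.0 scale
```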
Re: Update
Yes, indeed, regularising / scaling the SVM inputs is a must for this AI / ML tool. scikit-learn provides good instrumentation for producing and reusing an aScalerOBJECT, both for a priori scaling (before aDataSET goes into .fit() ) and for ex-post, ad-hoc scaling, once you need to re-scale a new example and send it to the predictor to answer its magic, via a request to anSvmCLASSIFIER.predict( aScalerOBJECT.transform( aNewExampleX ) )
(Yes, aNewExampleX may be a matrix, so as to ask for a "vectorised" processing of several answers at once.)
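The a-priori / ex-post pattern above can be sketched in a few lines. The toy data is an assumption, and StandardScaler is just one common choice for aScalerOBJECT:

```python
# A minimal sketch of the scaling pattern: fit the scaler once (a priori),
# then reuse the SAME scaler, ex-post and ad-hoc, on any new example(s).
# Toy data is an illustrative assumption; StandardScaler is one possible
# aScalerOBJECT.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_train = np.array([[0., 100.], [1., 200.], [2., 300.], [3., 400.]])
y_train = np.array([0, 0, 1, 1])

aScalerOBJECT   = StandardScaler().fit(X_train)        # a priori, once
anSvmCLASSIFIER = SVC().fit(aScalerOBJECT.transform(X_train), y_train)

# ex-post: re-scale the new example with the very same scaler
aNewExampleX = np.array([[2.5, 350.]])                 # a matrix works too
pred = anSvmCLASSIFIER.predict(aScalerOBJECT.transform(aNewExampleX))
print(pred)
```

The key point is that the scaler's statistics (mean / variance) are learned from the training set only and then merely applied to every later example, never re-fitted.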
A performance relief from the O( M^2.N^1 ) computational complexity
Contrary to the guess that the "width" of the problem, measured as N == the number of SVM features in matrix X, is to be blamed for the overall computation time, the SVM classifier with an rbf kernel is, by design, an O( M^2.N^1 ) problem.
So, there is a quadratic dependence on the total number of observations (examples) moved into the training phase ( .fit() ) or into CrossValidation, and one can hardly state that the supervised-learning classifier will gain any better predictive power if one "reduces" the (only linear) "width" of the features, which themselves carry the inputs into the constructed predictive power of the SVM classifier, can one?
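One way to see where the quadratic M-dependence comes from: the rbf kernel's Gram matrix is M x M, independent of the feature "width" N. A shapes-only sketch (the sizes are arbitrary assumptions):

```python
# The O( M^2 ) term comes from the rbf Gram matrix, which is M x M no
# matter how "wide" (N features) the data is -- shapes only, no timing.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

M, N_wide, N_narrow = 500, 100, 10
X_wide   = np.random.rand(M, N_wide)
X_narrow = np.random.rand(M, N_narrow)

# Shrinking N tenfold leaves the kernel matrix size untouched ...
print(rbf_kernel(X_wide).shape, rbf_kernel(X_narrow).shape)

# ... whereas halving M quarters the number of kernel entries:
print(rbf_kernel(X_wide[: M // 2]).shape)
```

In other words, trimming features buys only a linear saving, while the number of training examples drives the cost quadratically.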