Save MinMaxScaler model in sklearn

I use the MinMaxScaler model in sklearn to normalize my data:

    training_set = np.random.rand(4, 4) * 10
    training_set
    [[ 6.01144787,  0.59753007,  2.0014852 ,  3.45433657],
     [ 6.03041646,  5.15589559,  6.64992437,  2.63440202],
     [ 2.27733136,  9.29927394,  0.03718093,  7.7679183 ],
     [ 9.86934288,  7.59003904,  6.02363739,  2.78294206]]

    scaler = MinMaxScaler()
    scaler.fit(training_set)
    scaler.transform(training_set)
    [[ 0.49184811,  0.        ,  0.29704831,  0.15972182],
     [ 0.4943466 ,  0.52384506,  1.        ,  0.        ],
     [ 0.        ,  1.        ,  0.        ,  1.        ],
     [ 1.        ,  0.80357559,  0.9052909 ,  0.02893534]]

Now I want to use the same scaler to normalize the test set:

    [[ 8.31263467,  7.99782295,  0.02031658,  9.43249727],
     [ 1.03761228,  9.53173021,  5.99539478,  4.81456067],
     [ 0.19715961,  5.97702519,  0.53347403,  5.58747666],
     [ 9.67505429,  2.76225253,  7.39944931,  8.46746594]]

But I do not want to call scaler.fit() with the training data every time. Is there a way to save the scaler and load it later from a different file?
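Roughly, this is the workflow I am after (just a sketch; I am assuming the test array above is stored in a variable called test_set):

    # fit once on the training data
    scaler = MinMaxScaler()
    scaler.fit(training_set)

    # ... save the fitted scaler somehow ...

    # later, in another file: load the saved scaler and apply it directly,
    # without calling fit() again
    test_scaled = scaler.transform(test_set)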

5 answers

So I'm not really an expert on this, but from a little research and a few useful links, I think pickle and sklearn.externals.joblib will be your friends here.

pickle lets you save a model, or "dump" it to a file.

I think this link is also useful. It talks about model persistence. What you want to try is something like:

    # could use: import pickle... however, let's do something else
    from sklearn.externals import joblib

    # joblib is more efficient than pickle for things like large numpy arrays,
    # which sklearn models often have

    # then just 'dump' your model
    joblib.dump(clf, 'my_dope_model.pkl')
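And to get the model back later, joblib.load reads it from the same file (a small sketch using the same file name as above):

    # 'load' your model back when you need it
    clf = joblib.load('my_dope_model.pkl')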

Here is where you can learn more about the sklearn externals.

Let me know if this doesn't help, or if I'm misunderstanding something about your model.


Even better than pickle (which creates much larger files than this method), you can use the built-in sklearn tool:

    from sklearn.externals import joblib

    scaler_filename = "scaler.save"
    joblib.dump(scaler, scaler_filename)

    # And now to load...
    scaler = joblib.load(scaler_filename)

You can use pickle to save the scaler:

    import pickle

    scalerfile = 'scaler.sav'
    pickle.dump(scaler, open(scalerfile, 'wb'))

Load it back:

    import pickle

    scalerfile = 'scaler.sav'
    scaler = pickle.load(open(scalerfile, 'rb'))
    test_scaled_set = scaler.transform(test_set)
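A small variation on the same idea: the bare open() calls above never close their file handles, so a with block is slightly safer (same file and variable names, just a sketch):

    import pickle

    scalerfile = 'scaler.sav'

    # write the fitted scaler to disk
    with open(scalerfile, 'wb') as f:
        pickle.dump(scaler, f)

    # read it back and reuse it without refitting
    with open(scalerfile, 'rb') as f:
        scaler = pickle.load(f)

    test_scaled_set = scaler.transform(test_set)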

Just a note that sklearn.externals.joblib is deprecated and has been replaced by the standalone joblib package, which can be installed with pip install joblib:

    import joblib

    joblib.dump(my_scaler, 'scaler.pkl')
    my_scaler = joblib.load('scaler.pkl')

Docs for joblib.dump() and joblib.load().


The best way to do this is to create an ML pipeline as follows:

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.externals import joblib

    pipeline = make_pipeline(MinMaxScaler(), YOUR_ML_MODEL())
    model = pipeline.fit(X_train, y_train)

Now you can save it to a file:

    joblib.dump(model, 'filename.mod')

Later you can load it back like this:

    model = joblib.load('filename.mod')
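The nice part of this design is that the loaded pipeline applies the scaler and the model together, so on new data you only call predict (a sketch; X_test is assumed to hold your raw, unscaled test data):

    # the pipeline scales X_test with the saved MinMaxScaler, then predicts
    predictions = model.predict(X_test)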

Source: https://habr.com/ru/post/1014637/

