Workaround for the 32-/64-bit serialization incompatibility in the sklearn RandomForest model

If we serialize a RandomForest model with joblib on a 64-bit machine and then unpack it on a 32-bit machine, an exception occurs:

ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'long long'
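The mismatch arises because sklearn's Cython tree code defines `SIZE_t` as a platform-sized integer (NumPy's `np.intp`): 8 bytes on a 64-bit interpreter, 4 bytes on a 32-bit one, so buffers dumped on one architecture do not match the dtype expected on the other. A quick check of the platform-dependent width (plain NumPy, no sklearn internals needed):

```python
import numpy as np

# np.intp mirrors the C pointer-sized integer that sklearn's
# SIZE_t is built on; its width depends on the interpreter.
width = np.dtype(np.intp).itemsize
print(width)  # 8 on a 64-bit Python, 4 on a 32-bit Python
```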

This question has been asked before: Scikits-Learn RandomForrest trained on a 64-bit python will not open on a 32-bit python. But it has gone unanswered since 2014.

Sample code to train the model (on a 64-bit machine):

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
import joblib  # on older sklearn: from sklearn.externals import joblib

modelPath = "../"
featureVec = ...
labelVec = ...
param_dict = ...
forest = RandomForestClassifier()
randomSearch = RandomizedSearchCV(forest, param_distributions=param_dict, cv=10,
                                  scoring='accuracy', n_iter=100, refit=True)
randomSearch.fit(X=featureVec, y=labelVec)
model = randomSearch.best_estimator_
joblib.dump(model, modelPath)

Sample code for unpacking on a 32-bit machine:

modelPath = "../"
model = joblib.load(modelPath)  # ValueError thrown here

My question is: is there a general way to solve this problem if we need to train on a 64-bit machine and transfer the model to a 32-bit machine for prediction?

Answer: the error does not come from joblib itself. It is raised while unpickling the tree's internal buffers, so the traceback is the same whether you use joblib or plain pickle:

  File "/usr/lib/python2.7/pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "/usr/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1133, in load_reduce
    value = func(*args)
  File "sklearn/tree/_tree.pyx", line 585, in sklearn.tree._tree.Tree.__cinit__ (sklearn/tree/_tree.c:7286)
ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'long long'
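One possible workaround, sketched here rather than taken from the original thread: copy each fitted tree into plain NumPy arrays with fixed-width dtypes and re-implement prediction on top of them. A dump of such arrays (via pickle, np.savez, etc.) no longer depends on the platform's pointer size. The dataset and all names below are illustrative; the `tree_` attributes used are sklearn's documented tree-structure arrays.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# Illustrative model; in practice this is the trained best_estimator_.
X, y = make_classification(n_samples=100, random_state=0)
clf = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)

# Export each tree's arrays with fixed-width dtypes so the dump
# does not depend on the platform's SIZE_t / np.intp width.
portable = []
for est in clf.estimators_:
    t = est.tree_
    portable.append({
        "children_left": t.children_left.astype(np.int64),
        "children_right": t.children_right.astype(np.int64),
        "feature": t.feature.astype(np.int64),
        "threshold": t.threshold.astype(np.float64),
        "value": t.value.astype(np.float64),
    })

def tree_predict_proba(tree, x):
    """Walk one exported tree down to a leaf and return class probabilities."""
    node = 0
    while tree["children_left"][node] != -1:  # -1 marks a leaf
        if x[tree["feature"][node]] <= tree["threshold"][node]:
            node = tree["children_left"][node]
        else:
            node = tree["children_right"][node]
    counts = tree["value"][node][0]
    return counts / counts.sum()

def forest_predict(trees, x):
    """Average per-tree probabilities and return the majority class index."""
    probs = np.mean([tree_predict_proba(t, x) for t in trees], axis=0)
    return int(np.argmax(probs))
```

The exported dictionaries can then be pickled on the 64-bit machine and loaded on the 32-bit one; only `forest_predict` is needed there, not the original sklearn tree objects.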

Source: https://habr.com/ru/post/1653068/
