I have developed a spam classifier using pandas and scikit-learn, to the point where it is ready for integration into our Hadoop-based system. To do this, I need to export my classifier to a more common format than pickling.
Predictive Model Markup Language (PMML) is my preferred export format. It plays very well with Cascading, which we already use. However, I can't seem to find any Python libraries that export scikit-learn models to PMML.
Does anyone have experience with this use case? Is there an alternative to PMML that would provide interoperability between scikit-learn and Hadoop? Or is there a solid PMML export library?