Feature Union of heterogeneous features

I have 3 different sets of features for a given set of audio files. Each set is a feature matrix stored as an array, with the following dimensions:

  • feature set 1: (978 * 153)
  • feature set 2: (978 * 800)
  • feature set 3: (978 * 12)

Each of these feature sets was extracted from the audio files using a different technique.

What I would like to do is train a single classifier on all of them together (using a pipeline). I have read this, this, and the blog post linked from the second link, but they deal with applying different extraction methods and then using classifiers. Since I already have the extracted data, as indicated above, I would like to know what to do next, that is, how to combine the feature sets in a pipeline.

I know that one cannot ask for direct code here. I just need pointers: how do I combine data extracted by different methods (possibly using a pipeline) in order to classify it, for example, with an SVM?

1 answer

I will write the answer below assuming that you want to handle each feature set in an independent model and then combine their results. However, if you just want to use the features from all three extraction methods in one model, simply concatenate them column-wise into a single data set and train on that.
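
For the simpler single-model case, a minimal sketch could look like the following. The arrays `feat_1`, `feat_2`, `feat_3` and the labels `y` are random stand-ins (assumptions, not the asker's data); only the shapes come from the question.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples = 978

# Random stand-ins for the three pre-extracted feature matrices.
feat_1 = rng.normal(size=(n_samples, 153))
feat_2 = rng.normal(size=(n_samples, 800))
feat_3 = rng.normal(size=(n_samples, 12))
y = np.arange(n_samples) % 2  # hypothetical binary labels

# Column-wise concatenation: one row per audio file, 153 + 800 + 12 columns.
X = np.hstack([feat_1, feat_2, feat_3])

# Scale all columns, then train one SVM on the combined matrix.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
```

This only works because all three matrices have the same number of rows, in the same order of audio files.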

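To handle several models in a single Pipeline, first combine the three data sets into one (978 * 965) pandas DataFrame. Each model then needs a transformer that selects its own subset of columns from that DataFrame, for example: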

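from sklearn.base import BaseEstimator, TransformerMixin

# Selects the columns listed in `keys` from a DataFrame and returns a NumPy array.
# The mixin is listed first, as scikit-learn recommends for custom estimators.
class VarSelect(TransformerMixin, BaseEstimator):
    def __init__(self, keys):
        self.keys = keys
    def fit(self, x, y=None):
        # Nothing to learn; column selection is stateless.
        return self
    def transform(self, df):
        return df[self.keys].values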
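
As a sketch of how the combined DataFrame and the column lists might be built, the following uses random stand-in arrays and made-up column names (`a_0`, `b_0`, ... are assumptions; any unique names would do). `VarSelect` is repeated so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

class VarSelect(TransformerMixin, BaseEstimator):
    def __init__(self, keys):
        self.keys = keys
    def fit(self, x, y=None):
        return self
    def transform(self, df):
        return df[self.keys].values

rng = np.random.default_rng(0)
# Random stand-ins with the shapes given in the question.
feat_1 = rng.normal(size=(978, 153))
feat_2 = rng.normal(size=(978, 800))
feat_3 = rng.normal(size=(978, 12))

# Hypothetical column names, one list per feature set.
vars_a = [f"a_{i}" for i in range(feat_1.shape[1])]
vars_b = [f"b_{i}" for i in range(feat_2.shape[1])]
vars_c = [f"c_{i}" for i in range(feat_3.shape[1])]

# One DataFrame holding all 965 columns side by side.
df = pd.concat(
    [
        pd.DataFrame(feat_1, columns=vars_a),
        pd.DataFrame(feat_2, columns=vars_b),
        pd.DataFrame(feat_3, columns=vars_c),
    ],
    axis=1,
)

# Selecting by name recovers the third feature set unchanged.
block_c = VarSelect(keys=vars_c).fit_transform(df)
```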

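Next, the output of each model has to be passed on as a feature, so the model must behave like a transformer (a plain estimator has no transform method). A wrapper for classifiers that expose class probabilities, and one for regressors (or any model where only predict is wanted), can look like this: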

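from pandas import DataFrame
from sklearn.base import BaseEstimator, TransformerMixin

# Wraps a classifier so that transform() returns its class probabilities.
# The wrapped model must implement predict_proba (for SVC: probability=True).
class ModelClassTransformer(TransformerMixin, BaseEstimator):
    def __init__(self, model):
        self.model = model
    def fit(self, *args, **kwargs):
        self.model.fit(*args, **kwargs)
        return self
    def transform(self, X, **transform_params):
        return DataFrame(self.model.predict_proba(X))

# Wraps any model so that transform() returns its predictions as one column.
class ModelRegTransformer(TransformerMixin, BaseEstimator):
    def __init__(self, model):
        self.model = model
    def fit(self, *args, **kwargs):
        self.model.fit(*args, **kwargs)
        return self
    def transform(self, X, **transform_params):
        return DataFrame(self.model.predict(X))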
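
To illustrate the difference between the two wrappers, here is a small sketch of the probability-based one on synthetic data (`X` and `y` are made up). It assumes the SVM is created with `probability=True`, since `SVC` only offers `predict_proba` in that mode.

```python
import numpy as np
from pandas import DataFrame
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.svm import SVC

class ModelClassTransformer(TransformerMixin, BaseEstimator):
    def __init__(self, model):
        self.model = model
    def fit(self, *args, **kwargs):
        self.model.fit(*args, **kwargs)
        return self
    def transform(self, X, **transform_params):
        return DataFrame(self.model.predict_proba(X))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))   # synthetic features
y = np.arange(200) % 3           # synthetic labels, three classes

wrapped = ModelClassTransformer(SVC(kernel="rbf", probability=True))
# One output column per class instead of a single column of labels.
proba_features = wrapped.fit_transform(X, y)
```

With `ModelRegTransformer` the same model would contribute a single column of predicted labels instead.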

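Finally, each feature set gets its own pipeline, the three pipelines are combined with FeatureUnion, and their outputs are fed to a final model. For example, with an SVM for every feature set (and another SVM on top), it looks like this: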

from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# vars_a, vars_b, vars_c: lists of column names, one per feature set.
Pipeline([
    ('union', FeatureUnion([
        ('modelA', Pipeline([
            ('var', VarSelect(keys=vars_a)),
            ('scl', StandardScaler(copy=True, with_mean=True, with_std=True)),
            ('svm', ModelRegTransformer(SVC(kernel='rbf'))),
        ])),
        ('modelB', Pipeline([
            ('var', VarSelect(keys=vars_b)),
            ('scl', StandardScaler(copy=True, with_mean=True, with_std=True)),
            ('svm', ModelRegTransformer(SVC(kernel='rbf'))),
        ])),
        ('modelC', Pipeline([
            ('var', VarSelect(keys=vars_c)),
            ('scl', StandardScaler(copy=True, with_mean=True, with_std=True)),
            ('svm', ModelRegTransformer(SVC(kernel='rbf'))),
        ]))
    ])),
    # The final model is trained on the three columns of per-model predictions.
    ('scl', StandardScaler(copy=True, with_mean=True, with_std=True)),
    ('svm', SVC(kernel='rbf'))
])
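
Putting the pieces together, an end-to-end run might look like the sketch below. The data is synthetic and deliberately small, the column names are invented, and the helper `branch` is a hypothetical convenience that just avoids writing the same three steps three times.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

class VarSelect(TransformerMixin, BaseEstimator):
    def __init__(self, keys):
        self.keys = keys
    def fit(self, x, y=None):
        return self
    def transform(self, df):
        return df[self.keys].values

class ModelRegTransformer(TransformerMixin, BaseEstimator):
    def __init__(self, model):
        self.model = model
    def fit(self, *args, **kwargs):
        self.model.fit(*args, **kwargs)
        return self
    def transform(self, X, **transform_params):
        return pd.DataFrame(self.model.predict(X))

def branch(keys):
    # Hypothetical helper: select columns, scale, emit SVM predictions.
    return Pipeline([
        ("var", VarSelect(keys=keys)),
        ("scl", StandardScaler()),
        ("svm", ModelRegTransformer(SVC(kernel="rbf"))),
    ])

rng = np.random.default_rng(0)
n = 120
y = np.arange(n) % 2  # synthetic binary labels

# Three synthetic feature sets; class 1 is shifted so there is some signal.
frames = []
for prefix, width in (("a", 15), ("b", 80), ("c", 12)):
    cols = [f"{prefix}_{i}" for i in range(width)]
    data = rng.normal(size=(n, width)) + y[:, None]
    frames.append(pd.DataFrame(data, columns=cols))
df = pd.concat(frames, axis=1)
vars_a, vars_b, vars_c = (list(f.columns) for f in frames)

stacked = Pipeline([
    ("union", FeatureUnion([
        ("modelA", branch(vars_a)),
        ("modelB", branch(vars_b)),
        ("modelC", branch(vars_c)),
    ])),
    ("scl", StandardScaler()),
    ("svm", SVC(kernel="rbf")),
])

stacked.fit(df, y)
pred = stacked.predict(df)
# The union step yields one prediction column per branch.
union_out = stacked.named_steps["union"].transform(df)
```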

Source: https://habr.com/ru/post/1610177/

