Python - how to add a NumPy array to a pandas DataFrame

I trained a logistic regression classifier to predict whether a review is positive or negative. Now I want to add the predicted probabilities returned by the predict_proba function to my pandas DataFrame containing the reviews. I tried something like:

test_data['prediction'] = sentiment_model.predict_proba(test_matrix)

This obviously does not work, because predict_proba returns a 2D NumPy array. What is the most efficient way to do this? I created test_matrix using scikit-learn's CountVectorizer:

vectorizer = CountVectorizer(token_pattern=r'\b\w+\b')
train_matrix = vectorizer.fit_transform(train_data['review_clean'].values.astype('U'))
test_matrix = vectorizer.transform(test_data['review_clean'].values.astype('U'))

Sample data is as follows:

| Review                                      | Prediction |
| ------------------------------------------- | ---------- |
| "Toy was great! Our six-year old loved it!" | 0.986      |
1 answer

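Assign the predictions to a variable, then extract its columns and assign them to columns of the pandas DataFrame. If x is the 2D NumPy array of predictions,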

x = sentiment_model.predict_proba(test_matrix)

then you can do:

test_data['prediction0'] = x[:,0]
test_data['prediction1'] = x[:,1]
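
If you would rather not hard-code one line per class, a possible variant is to wrap the whole array in a DataFrame and join it on. This is only a sketch under stated assumptions: the proba array below is made-up data standing in for sentiment_model.predict_proba(test_matrix), the classes array stands in for sentiment_model.classes_ (which in scikit-learn gives the column order of predict_proba), and the -1/+1 labels and proba_ column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Stand-in for sentiment_model.predict_proba(test_matrix): one row per review,
# one column per class. The numbers are made up for illustration.
proba = np.array([[0.014, 0.986],
                  [0.730, 0.270]])

# Stand-in for sentiment_model.classes_ (hypothetical -1/+1 labels).
classes = np.array([-1, 1])

# Small stand-in for test_data, with a non-default index on purpose.
test_data = pd.DataFrame(
    {"review_clean": ["toy was great", "broke after a day"]},
    index=[10, 20],
)

# Reuse test_data's index so the rows line up by label, not by position,
# then attach all probability columns in one step.
proba_df = pd.DataFrame(
    proba,
    columns=[f"proba_{c}" for c in classes],
    index=test_data.index,
)
test_data = test_data.join(proba_df)
print(test_data)
```

Passing index=test_data.index matters when test_data came out of a train/test split and no longer has a 0..n-1 index; without it the join would align on mismatched labels and produce NaN values.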

Source: https://habr.com/ru/post/1670162/

