Python: using multinomial logistic regression with sklearn

Question

I have a training data set and a test data set, as shown below. I have provided sample data with only a few records, but my actual data has more than 1000 records. Here E is my target variable, which I need to predict using an algorithm. It has only four categories (1, 2, 3, 4) and can take only one of these values.

Training Dataset:

    A    B    C    D    E
    1    20   30   1    1
    2    22   12   33   2
    3    45   65   77   3
    12   43   55   65   4
    11   25   30   1    1
    22   23   19   31   2
    31   41   11   70   3
    1    48   23   60   4

Test Data Set:

    A    B    C    D    E
    11   21   12   11
    1    2    3    4
    5    6    7    8
    99   87   65   34
    11   21   24   12

Since E has only 4 categories, I was thinking of predicting it using multinomial logistic regression (one-vs-rest logic). I am trying to implement it in Python.

I know the logic: we need to store these targets in a variable and use an algorithm to predict one of these values:

 output = [1,2,3,4]

But I'm stuck on how to do this in Python (sklearn): how do I iterate over these values, and which algorithm should I use to predict the output values? Any help would be greatly appreciated.

python scikit-learn data-analysis logistic-regression

Sriram chandramouli Apr 21 '16 at 4:56

2 answers

dukebody · Answer 1 · 2016-04-23T18:06:42+0000

LogisticRegression can handle multiple classes out of the box.

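    from sklearn.linear_model import LogisticRegression

    # df is the training DataFrame with columns A, B, C, D, E
    X = df[['A', 'B', 'C', 'D']]
    y = df['E']

    lr = LogisticRegression()
    lr.fit(X, y)
    preds = lr.predict(X)  # array of integer class labels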
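For a self-contained version of the same idea, here is a minimal sketch that uses the sample rows from the question. The DataFrame names `train` and `test` are assumptions made for illustration, and `max_iter=1000` is only there to give the solver more room to converge on these unscaled features. There is no need to iterate over the values 1 to 4 yourself: the classes are inferred from `y` during `fit`.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Sample training rows from the question (E is the target).
train = pd.DataFrame(
    [
        [1, 20, 30, 1, 1],
        [2, 22, 12, 33, 2],
        [3, 45, 65, 77, 3],
        [12, 43, 55, 65, 4],
        [11, 25, 30, 1, 1],
        [22, 23, 19, 31, 2],
        [31, 41, 11, 70, 3],
        [1, 48, 23, 60, 4],
    ],
    columns=["A", "B", "C", "D", "E"],
)

# Sample test rows from the question (E is unknown and must be predicted).
test = pd.DataFrame(
    [
        [11, 21, 12, 11],
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [99, 87, 65, 34],
        [11, 21, 24, 12],
    ],
    columns=["A", "B", "C", "D"],
)

features = ["A", "B", "C", "D"]

# max_iter is raised from the default as a precaution (an assumption, not a tuned value).
lr = LogisticRegression(max_iter=1000)
lr.fit(train[features], train["E"])

# The classes are learned from the target column; no manual loop is needed.
print(lr.classes_)

# One predicted label per test row, each drawn from the learned classes.
test_preds = lr.predict(test[features])
print(test_preds)
```

With only eight training rows the predictions themselves are not meaningful; the point is the shape of the workflow: fit on the training features and target, then call `predict` on the test features.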

Daisy qin · Answer 2 · 2017-03-18T18:55:26+0000

You can try

    LogisticRegression(multi_class='multinomial', solver='newton-cg').fit(X_train, y_train)
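The question asks about one-vs-rest specifically. If that scheme is wanted explicitly rather than a single multinomial model, one option is to wrap the estimator in `OneVsRestClassifier`, which fits one binary classifier per class. The following is a minimal sketch under that assumption; `X_train`, `y_train`, and `X_test` are built here from the question's sample rows purely for illustration, and `max_iter=1000` is a precaution rather than a tuned value.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Features A, B, C, D from the question's sample training rows.
X_train = np.array(
    [
        [1, 20, 30, 1],
        [2, 22, 12, 33],
        [3, 45, 65, 77],
        [12, 43, 55, 65],
        [11, 25, 30, 1],
        [22, 23, 19, 31],
        [31, 41, 11, 70],
        [1, 48, 23, 60],
    ]
)
# Target E for the same rows.
y_train = np.array([1, 2, 3, 4, 1, 2, 3, 4])

# Features from the question's sample test rows.
X_test = np.array(
    [
        [11, 21, 12, 11],
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [99, 87, 65, 34],
        [11, 21, 24, 12],
    ]
)

# One binary logistic regression is trained per class (class k vs. all others).
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X_train, y_train)

# Number of underlying binary models, one per class.
print(len(ovr.estimators_))

# Predicted label per test row, and per-class scores with one column per class.
ovr_preds = ovr.predict(X_test)
ovr_proba = ovr.predict_proba(X_test)
print(ovr_preds)
print(ovr_proba.shape)
```

Which of the two schemes works better depends on the data, so it is worth comparing both with cross-validation on the full data set.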
