Python: multiclass logistic regression using sklearn

I have a training data set and a test data set, as shown below. I have provided sample data with only a few records, but my real data has more than 1000 records. Here E is my target variable, which I need to predict using an algorithm. It has only four categories: 1, 2, 3, 4. It can take only one of these values.

Training Dataset:

     A   B   C   D   E
     1  20  30   1   1
     2  22  12  33   2
     3  45  65  77   3
    12  43  55  65   4
    11  25  30   1   1
    22  23  19  31   2
    31  41  11  70   3
     1  48  23  60   4

Test Data Set:

     A   B   C   D
    11  21  12  11
     1   2   3   4
     5   6   7   8
    99  87  65  34
    11  21  24  12

Since E has only 4 categories, I was thinking of predicting it using multinomial logistic regression (one-vs-rest). I am trying to implement it in Python.

I know the logic: we need to put these target classes in a variable and use an algorithm to predict one of these values:

    output = [1, 2, 3, 4]

But I'm stuck on how to do this in Python (sklearn): how do I iterate over these values, and which algorithm should I use to predict the output? Any help would be greatly appreciated.
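For illustration, the one-vs-rest idea described above can be written as an explicit loop over the classes. This is only a minimal sketch; X_train, X_test (dataframes with columns A-D) and y_train (the E column) are assumed placeholder names:

    from sklearn.linear_model import LogisticRegression
    import numpy as np

    classes = [1, 2, 3, 4]
    scores = []
    for c in classes:
        clf = LogisticRegression()
        clf.fit(X_train, (y_train == c).astype(int))    # binary target: class c vs. the rest
        scores.append(clf.predict_proba(X_test)[:, 1])  # probability of class c for each test row
    preds = np.array(classes)[np.argmax(scores, axis=0)]  # pick the highest-scoring class per row

In practice sklearn can do this for you, as the answers below show.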

2 answers

LogisticRegression can handle multiple classes out of the box.

    from sklearn.linear_model import LogisticRegression

    X = df[['A', 'B', 'C', 'D']]
    y = df['E']

    lr = LogisticRegression()
    lr.fit(X, y)
    preds = lr.predict(X)  # outputs an array of integer class labels (1-4)
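If the test rows are in a separate dataframe (say test_df with columns A-D; the name is just an assumption here), predictions for them would look like:

    X_test = test_df[['A', 'B', 'C', 'D']]
    test_preds = lr.predict(X_test)        # predicted class (1-4) for each test row
    test_probs = lr.predict_proba(X_test)  # one probability column per class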

You can try

    LogisticRegression(multi_class='multinomial', solver='newton-cg').fit(X_train, y_train)
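A fuller version of the same call, with a held-out split for a quick accuracy check (a sketch only; X and y as defined in the first answer):

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = LogisticRegression(multi_class='multinomial', solver='newton-cg').fit(X_train, y_train)
    print(accuracy_score(y_val, clf.predict(X_val)))  # accuracy on the held-out rows

Note that in recent scikit-learn versions multinomial handling is the default for LogisticRegression and the multi_class argument is deprecated, so simply calling LogisticRegression().fit(X_train, y_train) gives equivalent behavior.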

Source: https://habr.com/ru/post/1247578/

