I have a training data set and a test data set, as shown below. I have provided only a few sample records here, but my actual data has more than 1000 records. E is my target variable, which I need to predict with an algorithm. It has only four categories and can take only the values 1, 2, 3, or 4.
Training Dataset:
A    B    C    D    E
1    20   30   1    1
2    22   12   33   2
3    45   65   77   3
12   43   55   65   4
11   25   30   1    1
22   23   19   31   2
31   41   11   70   3
1    48   23   60   4
Test Data Set:
A    B    C    D    E
11   21   12   11
1    2    3    4
5    6    7    8
99   87   65   34
11   21   24   12
Since E has only 4 categories, I was thinking of predicting it with multinomial logistic regression (using one-vs-rest logic). I am trying to implement it in Python.
As I understand it, I need to store these target classes in a variable and use an algorithm to predict one of them:
output = [1,2,3,4]
But I'm stuck on how to do this in Python (sklearn): how do I iterate over these values, and which algorithm should I use to predict the output? Any help would be greatly appreciated.
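For reference, here is a minimal sketch of the one-vs-rest logistic regression idea mentioned above, using the sample rows from this post. It assumes A to D are numeric features and E is the class label. The variable names (`X_train`, `y_train`, `X_test`, `clf`) and the `max_iter=1000` setting are illustrative choices, not requirements. No explicit loop over `[1,2,3,4]` is written here: `OneVsRestClassifier` fits one binary model per class found in `y_train`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Training rows from the post: columns A, B, C, D, then the target E.
train = np.array([
    [1, 20, 30, 1, 1],
    [2, 22, 12, 33, 2],
    [3, 45, 65, 77, 3],
    [12, 43, 55, 65, 4],
    [11, 25, 30, 1, 1],
    [22, 23, 19, 31, 2],
    [31, 41, 11, 70, 3],
    [1, 48, 23, 60, 4],
])
X_train = train[:, :4]  # features A..D
y_train = train[:, 4]   # target E

# Test rows from the post: only A..D are known.
X_test = np.array([
    [11, 21, 12, 11],
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [99, 87, 65, 34],
    [11, 21, 24, 12],
])

# One binary logistic regression per class (one-vs-rest).
# max_iter is raised (an assumed value) because the features are unscaled.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)  # one predicted E value per test row
print(clf.classes_)                # the classes learned from y_train
print(predictions)
```

With only eight training rows the predictions are not meaningful; the sketch only shows the fit/predict flow, which should carry over to the full data set.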