I built a decision tree model in R. The target variable is salary (the INCOME column): the goal is to predict whether a person's salary is above or below 50 thousand based on the other input variables.
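    library(rpart)

    df <- salary.data

    # Use 20% of the rows for training and the rest for testing
    train <- sample(1:nrow(df), size = floor(0.2 * nrow(df)))
    test <- -train
    training_data <- df[train, ]
    testing_data <- df[test, ]

    # Name the target by column so that "." does not also pull in INCOME as a predictor
    fit <- rpart(INCOME ~ ., method = "class", data = training_data)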
After that I tried to create a gain chart as follows:
    # Gain Chart
    pred <- prediction(testing_data$predictionsOutput, testing_data$INCOME)
    gain <- performance(pred, "tpr", "fpr")
    plot(gain, col = "orange", lwd = 2)
Having studied the link, I still can't work out how to use the ROCR package's prediction function to build the chart. Does it only work for binary target variables? I get the error "Prediction format is invalid".
Any help building a gain chart for the above model would be greatly appreciated. Thanks!
      AGE         EMPLOYER    DEGREE            MSTATUS           JOBTYPE  SEX C.GAIN C.LOSS HOURS
    1  39        State-gov Bachelors      Never-married      Adm-clerical Male   2174      0    40
    2  50 Self-emp-not-inc Bachelors Married-civ-spouse   Exec-managerial Male      0      0    13
    3  38          Private   HS-grad           Divorced Handlers-cleaners Male      0      0    40
            COUNTRY INCOME
    1 United-States  <=50K
    2 United-States  <=50K
    3 United-States  <=50K
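For reference, here is a minimal sketch of one way the ROCR calls could fit together. It is a guess at the cause, not a confirmed diagnosis. As far as I know, ROCR's `prediction()` expects numeric scores (for example predicted class probabilities) together with two-class labels, and the package only handles binary targets. The sketch therefore takes the scores from `predict(fit, type = "prob")` instead of reading them from a column of the test data. The toy data frame, its two input columns, and the 50/50 split are made up for illustration and only stand in for `salary.data`. Note that `"tpr"` against `"fpr"` gives a ROC curve, while a cumulative gain chart plots `"tpr"` against `"rpp"` (rate of positive predictions).

```r
library(rpart)  # decision trees
library(ROCR)   # prediction() / performance()

set.seed(42)

# Hypothetical toy data standing in for salary.data (columns are assumptions)
n  <- 1000
df <- data.frame(
  AGE   = sample(18:70, n, replace = TRUE),
  HOURS = sample(10:60, n, replace = TRUE)
)
# Make the label depend on the inputs so the tree has something to learn
p_high    <- plogis(-6 + 0.06 * df$AGE + 0.08 * df$HOURS)
df$INCOME <- factor(ifelse(runif(n) < p_high, ">50K", "<=50K"),
                    levels = c("<=50K", ">50K"))

# Train/test split (50/50 here, purely for illustration)
train         <- sample(seq_len(n), size = floor(0.5 * n))
training_data <- df[train, ]
testing_data  <- df[-train, ]

# Refer to the target by column name so "." does not pick it up as a predictor
fit <- rpart(INCOME ~ ., data = training_data, method = "class")

# Numeric scores: predicted probability of the positive class (">50K")
scores <- predict(fit, newdata = testing_data, type = "prob")[, ">50K"]

# prediction() takes numeric scores and two-class labels
pred <- prediction(scores, testing_data$INCOME,
                   label.ordering = c("<=50K", ">50K"))

# Cumulative gains: true positive rate vs. rate of positive predictions
gain <- performance(pred, measure = "tpr", x.measure = "rpp")
plot(gain, col = "orange", lwd = 2, main = "Gain chart")
abline(0, 1, lty = 2)  # baseline for a random model
```

Replacing `measure = "tpr"` with `measure = "lift"` (keeping `x.measure = "rpp"`) should give a lift chart instead.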