Classification in R

I am trying to do Naive Bayes classification in R. I saw this example at the following link.

http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Classification/Na%C3%AFve_Bayes

It takes only two lines: fit the classifier first, then predict.

 > classifier<-naiveBayes(iris[,1:4], iris[,5])
 > table(predict(classifier, iris[,-5]), iris[,5])

This code works fine on the iris dataset, but when I apply the same approach to my own dataset, I get an error.

My dataset contains 4 attributes, and the 4th attribute is the class attribute.

 > str(data1)
 'data.frame':	1370 obs. of  4 variables:
  $ TenScore     : num  85 84.2 67.2 91.5 79.3 ...
  $ TwelthScore  : num  69 87.9 67.5 82.7 72.4 ...
  $ GDegreeScore : num  63.3 70.7 61.3 78.2 62.1 ...
  $ Got_Admission: chr  "No" "No" "No" "No" ...

So, I tried this.

 > classifier<-naiveBayes(data1[,1:3], data1[,4])
 > table(predict(classifier, data1[,-4]), data1[,4])
 Error in table(predict(classifier, data1[, -4]), data1[, 4]) :
   all arguments must have the same length

I get the error above when I run the command. When I call predict on its own, it gives me the following result.

 > predict(classifier, data1[,-4])
 factor(0)
 Levels:

Please explain what this error means and how to fix it.

1 answer

I can reproduce the same error by converting the 5th column of iris to character:

 > iris[ , 5] <- as.character(iris[ , 5] )
 > classifier<-naiveBayes(iris[,1:4], iris[,5])
 > table(predict(classifier, iris[,-5]), iris[,5])
 Error in table(predict(classifier, iris[, -5]), iris[, 5]) :
   all arguments must have the same length

 # The fix --------
 > iris[ , 5] <- factor(as.character(iris[ , 5] ))
 > classifier<-naiveBayes(iris[,1:4], iris[,5])
 > table(predict(classifier, iris[,-5]), iris[,5])
 # ---- output --------
              setosa versicolor virginica
   setosa         50          0         0
   versicolor      0         47         3
   virginica       0          3        47
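
One plausible explanation for the empty factor(0) result (an assumption, not checked against the package source) is that the fitted model takes its class labels from the levels of the response. A character vector has no levels, so there is nothing to predict into. The base R sketch below uses a made-up label vector, labels_chr, only to show that difference:

```r
# Hypothetical labels standing in for a class column that was read in as text.
labels_chr <- c("No", "No", "Yes", "No")
labels_fac <- factor(labels_chr)

# A character vector carries no level information at all.
print(levels(labels_chr))   # NULL

# A factor records the distinct class labels as its levels.
print(levels(labels_fac))   # "No" "Yes"

print(is.factor(labels_chr))
print(is.factor(labels_fac))
```

Whether a particular version of the package still behaves this way with character input is worth checking locally. Converting the class column to a factor is safe either way.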

So you have to do this:

 data1$Got_Admission <- factor(data1$Got_Admission)
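
As a rough end-to-end sketch of that fix, the example below builds a small invented stand-in for data1 and runs the same two lines from the question after the conversion. It assumes naiveBayes comes from the e1071 package. The column names match the question, but the values and the rule that generates Got_Admission are made up for illustration:

```r
library(e1071)  # assumed source of naiveBayes()

set.seed(42)
n <- 200

# Invented stand-in for data1: same column names as the question, fake values.
data1 <- data.frame(
  TenScore     = runif(n, 60, 95),
  TwelthScore  = runif(n, 60, 95),
  GDegreeScore = runif(n, 60, 95)
)
# Hypothetical labelling rule, stored as character like in the question.
data1$Got_Admission <- ifelse(rowMeans(data1[, 1:3]) > 77, "Yes", "No")

# The fix: make the class column a factor before fitting.
data1$Got_Admission <- factor(data1$Got_Admission)

classifier <- naiveBayes(data1[, 1:3], data1[, 4])
pred <- predict(classifier, data1[, -4])

# One prediction per row, so table() now receives vectors of equal length.
print(table(pred, data1[, 4]))
```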

If your "Got_Admission" column is messy (for example, inconsistent spellings of the same label), you will get meaningless results (garbage in, garbage out). Check its contents first with:

 table(data1$Got_Admission)
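
If that table shows several variants of the same label, one possible cleanup is to normalise the text before converting it to a factor. This is a sketch only: the vector raw and its particular kinds of mess (case differences, stray whitespace, a missing value) are invented, and real data may need different handling.

```r
# Hypothetical messy class labels; the real column may look different.
raw <- c("No", "no ", "Yes", "YES", " yes", NA, "No")

# Inspect first: every distinct spelling shows up as its own category.
print(table(raw, useNA = "ifany"))

# Normalise whitespace and case, then fix the set of levels explicitly.
cleaned <- tolower(trimws(raw))
admission <- factor(cleaned, levels = c("no", "yes"), labels = c("No", "Yes"))

# After cleaning there are only two classes, plus any missing values.
print(table(admission, useNA = "ifany"))
```

Passing levels explicitly means any spelling not listed becomes NA instead of silently turning into an extra class, which makes leftover problems easy to spot.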

Source: https://habr.com/ru/post/1380818/

