Selecting features for a naive Bayes classifier in R

I want to use a naive Bayes classifier to make some predictions. So far I can make predictions with the following (simplified) code in R:

library(klaR)
library(caret)


Faktor<-sample(LETTERS[1:4], 10000, replace=TRUE, prob=c(0.1, 0.2, 0.65, 0.05))
alter<-abs(rnorm(10000,30,5))
HF<-abs(rnorm(10000,1000,200))
Diffalq<-rnorm(10000)
Geschlecht<-sample(c("Mann","Frau", "Firma"),10000,replace=TRUE)
# keep character columns as factors (no longer the default since R 4.0)
data<-data.frame(Faktor,alter,HF,Diffalq,Geschlecht, stringsAsFactors=TRUE)

set.seed(5678)
flds<-createFolds(data$Faktor, 10)

train<-data[-flds$Fold01 ,]
test<-data[flds$Fold01 ,]

features <- c("HF","alter","Diffalq", "Geschlecht")

formel<-as.formula(paste("Faktor ~ ", paste(features, collapse= "+")))

nb<-NaiveBayes(formel, train, usekernel=TRUE)

pred<-predict(nb,test)

test$Prognose<-as.factor(pred$class)

Now I want to improve this model with feature selection. My real data has about 100 features. So my question is: what would be the best way to select the most important features for a naive Bayes classifier? Is there a reference or paper on this?

I tried the following line of code, but it didn't work:

rfe(train[, 2:5],train[, 1], sizes=1:4,rfeControl = rfeControl(functions = ldaFuncs, method = "cv"))

EDIT: it gives me the following error message:

Fehler in { :   task 1 failed - "nicht-numerisches Argument für binären Operator"
Calls: rfe ... rfe.default -> nominalRfeWorkflow -> %op% -> <Anonymous>

Since the message is in German, you may want to reproduce it on your own machine.

How should I set up the rfe() call to get recursive feature elimination?


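It looks like ldaFuncs has trouble with factor variables when the predictors are passed as a data frame. For example, this fails: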

mm <- ldaFuncs$fit(train[2:5], train[,1])
ldaFuncs$pred(mm,train[2:5])
# Error in FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...) : 
#   non-numeric argument to binary operator

The error does not occur if the factor column is left out:

mm <- ldaFuncs$fit(train[2:4], train[,1])
ldaFuncs$pred(mm,train[2:4])

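(no error). You can also fit the model with the formula interface, in which case the factor is expanded into dummy variables and handled correctly: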

mm <- ldaFuncs$fit(Faktor ~ alter + HF + Diffalq + Geschlecht, train)
ldaFuncs$pred(mm,train[2:5])
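A possible explanation for the difference, as a minimal base-R sketch: when a data frame that contains a factor is converted with as.matrix(), the whole matrix becomes character, and column-wise arithmetic on it fails with the same message as above. That the lda prediction step centers the data this way is an assumption based on the traceback (the FUN(x, aperm(array(STATS, ...))) call comes from sweep()); the data frame toy is hypothetical.

```r
# Hypothetical toy data: one numeric column and one factor column
toy <- data.frame(HF = c(900, 1100, 1000),
                  Geschlecht = factor(c("Mann", "Frau", "Firma")))

# A single non-numeric column turns the whole matrix into character
m <- as.matrix(toy)
is_char <- is.character(m)

# Column-wise subtraction (what centering does) then fails
msg <- tryCatch({
  sweep(m, 2, c(0, 0))
  "no error"
}, error = function(e) conditionMessage(e))

# Without the factor column the same operation works
ok <- sweep(as.matrix(toy["HF"]), 2, 1000)

print(is_char)
print(msg)
print(ok)
```

The formula interface avoids this because the factor is turned into numeric dummy columns before any arithmetic happens.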

rfe() accepts the same formula syntax, so you can call it like this:

rfe(Faktor ~ alter + HF + Diffalq + Geschlecht, train, sizes=1:4,
    rfeControl =  rfeControl(functions = ldaFuncs, method = "cv"))

Alternatively, you can expand the dummy variables yourself with something like:

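# expand factors into dummy columns and drop the intercept column;
# the response stays a separate factor (cbind() would turn it into integer codes)
train.ex <- model.matrix(~.-Faktor, train)[,-1]
rfe(train.ex, train$Faktor, ...)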

But this does not remember which dummy variables belong to the same factor, so it is not ideal.
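If the grouping is needed later, one option is to record it before the intercept column is dropped: model.matrix() stores the term index of every column in its "assign" attribute, and that attribute is lost on subsetting. A minimal sketch on a hypothetical toy data frame (toy, assign_idx and origin are illustrative names, not part of caret):

```r
# Hypothetical toy data with one numeric and one factor predictor
toy <- data.frame(alter = c(25, 31, 40, 28),
                  Geschlecht = factor(c("Mann", "Frau", "Firma", "Frau")))

mm <- model.matrix(~ ., toy)

# "assign" maps each column to its term: 0 = intercept, k = k-th term
assign_idx <- attr(mm, "assign")

# Drop the intercept; subsetting discards the "assign" attribute,
# which is why it was saved first
x <- mm[, -1]

# With the formula ~ . the terms are the data frame columns in order
origin <- names(toy)[assign_idx[-1]]
names(origin) <- colnames(x)
print(origin)
```

Here origin maps GeschlechtFrau and GeschlechtMann back to Geschlecht, so the dummy columns selected by rfe() can be traced to their source variable.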


Source: https://habr.com/ru/post/1545873/

