Error with train from caret package using gam method:

I have a gam model that I know works fine in R, but when I try to fit the same model with train from the caret package, it returns an error saying that the input columns are lists. Does anyone understand this?

The code that I run looks like this:

    library("caret")
    library("mgcv")

    a <- gam(RW ~ s(Temp0.grd) + s(mld.grd) + s(mean_depth.grd) +
               s(land_dist.grd) + s(slope.grd) + s(npp.grd),
             data=mydata, family=binomial)

    all.data.gam.train <- train(form=RW ~ s(Temp0.grd) + s(mld.grd) + s(mean_depth.grd) +
                                  s(land_dist.grd) + s(slope.grd) + s(npp.grd),
                                data=mydata, method='gam', family=binomial)

The first gam call works fine, but train returns the following error:

    Error in model.frame.default(form = RW ~ s(Temp0.grd) + s(mld.grd) + s(mean_depth.grd) + :
      invalid type (list) for variable 's(Temp0.grd)'

Running model.frame.default directly with the same formula also produces this error, so the problem is not strictly related to train.
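The behaviour described above can be reproduced on made-up data. The sketch below assumes only that mgcv is installed; the data frame `d` and its columns `y` and `x` are hypothetical stand-ins for mydata. It illustrates that `s()` returns a smooth specification object (a list) rather than a numeric column, which `model.frame` refuses, while `gam` interprets the same formula itself.

```r
library(mgcv)

# Hypothetical toy data, standing in for mydata
set.seed(1)
d <- data.frame(y = rbinom(100, 1, 0.5), x = runif(100))

# s() builds a smooth specification, which is a list, not a numeric vector
spec <- s(x)
is.list(spec)

# model.frame evaluates s(x) as if it were an ordinary variable and rejects it
msg <- tryCatch(
  {
    model.frame(y ~ s(x), data = d)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# gam parses the smooth terms itself, so the same formula fits without complaint
fit <- gam(y ~ s(x), data = d, family = binomial)
```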

mydata is as follows:

    > class(mydata)
    [1] "data.frame"
    > class(mydata$Temp0.grd)
    [1] "numeric"
    > class(s(mydata$Temp0.grd))
    [1] "tp.smooth.spec"
    > head(mydata)
        RW land_dist.grd mean_depth.grd  mld.grd   npp.grd primprod.grd Sal0.grd salbottom.grd
    372  1           172      -79.83889 14.70062 1124.6136          920 31.27995         32.70
    373  0           157      -84.53555 14.70062  973.1954          889 31.27995         32.70
    374  1           146      -91.53111 14.70062  896.5736          803 31.38220         32.59
    375  1           137      -89.44222 14.70062  783.4132          719 31.38220         32.59
    405  1           173     -100.87666 14.70062 1010.4898          755 31.27995         32.70
    406  1           197     -104.24111 14.70062  816.1457          767 31.27995         32.70
        salsurf.grd seamounts_dist.grd slope.grd sst.grd Temp0.grd Temp100.grd Temp50.grd
    372       30.36           1529.184 16.068041    1.77  6.532125  0.31340000    0.36470
    373       30.36           1513.419 16.317524    1.77  6.532125  0.31340000    0.36470
    374       30.68           1496.227  8.578011    1.68  6.466700  0.01937502   -0.04645
    375       30.68           1479.382  8.134535    1.68  6.466700  0.01937502   -0.04645
    405       30.36           1483.972 18.345858    1.77  6.532125  0.31340000    0.36470
    406       30.36           1474.469 13.433269    1.77  6.532125  0.31340000    0.36470
        tempbottom.grd
    372           1.58
    373           1.58
    374           1.23
    375           1.23
    405           1.58
    406           1.58

For information, my installation of R is as follows:

    > sessionInfo()
    R version 3.0.2 (2013-09-25)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
     [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
    [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] mgcv_1.7-27     nlme_3.1-111    caret_5.16-04   reshape2_1.2.2  plyr_1.8
    [6] lattice_0.20-24 foreach_1.4.0   cluster_1.14.4

    loaded via a namespace (and not attached):
    [1] codetools_0.2-8 grid_3.0.2      iterators_1.0.6 Matrix_1.1-0    stringr_0.6.2
    [6] tools_3.0.2

Thanks for the help!

1 answer

When you use train with this model, you cannot (at this time) specify the gam formula yourself. caret has an internal function that builds the formula based on how many unique levels each predictor has, etc. In other words, train currently decides which terms are smoothed and which are plain old linear main effects.
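To make the idea concrete, here is a rough sketch of how a formula could be derived from the number of unique values per predictor. This is not caret's internal code: the helper `build_gam_formula`, its `cutoff` argument and the value 10 are all assumptions made for illustration, as is the toy data frame.

```r
# Hypothetical helper, NOT caret's internal function: wrap a predictor in s()
# only when it is numeric and has more than `cutoff` distinct values.
build_gam_formula <- function(data, outcome, cutoff = 10) {
  predictors <- setdiff(names(data), outcome)
  terms <- vapply(predictors, function(p) {
    column <- data[[p]]
    if (is.numeric(column) && length(unique(column)) > cutoff) {
      paste0("s(", p, ")")   # enough distinct values: smooth term
    } else {
      p                      # too few distinct values: plain linear term
    }
  }, character(1))
  as.formula(paste(outcome, "~", paste(terms, collapse = " + ")))
}

# Made-up data: `depth` has 20 distinct values, `flag` has only 2
toy <- data.frame(RW = rep(0:1, 10), depth = seq(1, 20), flag = rep(0:1, each = 10))
f <- build_gam_formula(toy, "RW")
print(f)
```

With a rule like this, a continuous column such as `depth` ends up inside `s()`, while a two-level indicator such as `flag` stays linear.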

Try the same code without the smooth terms in the train formula and see whether it still results in an error.
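A minimal sketch of that suggestion on made-up data follows. The data frame `toy`, its columns and the resampling settings are assumptions, not taken from the question. It also assumes a factor outcome, for which caret should (as far as I know) pick a binomial family on its own, so `family` is not passed here.

```r
library(caret)
library(mgcv)

# Hypothetical toy data with a two-class factor outcome
set.seed(42)
n <- 200
toy <- data.frame(x1 = runif(n), x2 = rnorm(n))
p <- plogis(2 * sin(2 * pi * toy$x1) + toy$x2)
toy$RW <- factor(ifelse(rbinom(n, 1, p) == 1, "present", "absent"))

# Plain predictor names only: no s() terms in the formula given to train
fit <- train(RW ~ x1 + x2,
             data = toy,
             method = "gam",
             trControl = trainControl(method = "cv", number = 3))

# Inspect the formula that was actually used for the final mgcv fit
print(formula(fit$finalModel))
```

Printing the formula of `fit$finalModel` shows which predictors caret decided to smooth.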

The next version of caret (possibly at the beginning of the year) will give you much more flexibility to create your own formula using GAM and other models.

Max


Source: https://habr.com/ru/post/958295/

