glmulti Oversized Candidate Set

Question


Error message:

System: Windows 7 Ultimate, 64-bit, 16 GB physical RAM plus virtual memory, memory.limit(32000)

What does this error message mean?
In glmulti(y = "y", data = mydf, xr = c("x1", : !Oversized candidate set.
mydf has 3.6 million rows and 150 columns of floats.
What steps should be taken to bypass it in glmulti?
Any glmulti alternatives in the R world?

R / 64bit "Good Sport"

+6

memory-management feature-selection

Yu Le Jul 18 '13 at 18:24


1 answer

Phill · Answer 1 · 2014-05-26T21:58:47+0000

I ran into the same problem, here is what I have discovered so far:

The number of rows does not seem to be the problem. The problem is that with 150 predictors the package cannot handle an exhaustive search (that is, fitting and comparing every possible model). In my experience, your specific "Oversized candidate set" error is caused by also allowing pairwise interactions (level=2; set level=1 to exclude interactions). After that you are likely to run into the "Too many predictors" warning instead. In my (very limited) experiments, the largest candidate set I got to work had about a billion models: 30 covariates give 2^30 = 1,073,741,824 possible combinations. Here is the code I used to test this:
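    library(glmulti)
    out <- integer(50)
    # method = "d" only reports the size of the candidate set; no models are fitted
    for (i in 2:40) out[i] <- glmulti(names(data)[1], names(data)[2:i], method = "d", level = 1, crit = aic, data = data)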
When the loop reaches 31 covariates, the candidate set comes back with 0 models. From 33 covariates on, it starts returning a warning message. My "data" had about 100 variables and only about 1000 rows but, as I said, the problem is the width of the data set, not its depth.
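The counts above can be reproduced with plain arithmetic, without loading glmulti. The sketch below assumes every term is simply in or out of a model, so it ignores any marginality rule that might shrink the set. The helper names n_models_main and n_terms_pairwise are made up for illustration.

```r
# Each of n main-effect terms is either in or out: 2^n candidate models
n_models_main <- function(n) 2^n

# Allowing pairwise interactions enlarges the pool of terms to n + choose(n, 2)
n_terms_pairwise <- function(n) n + choose(n, 2)

# 30 covariates: 2^30 = 1,073,741,824, the figure quoted above
print(format(n_models_main(30), big.mark = ","))

# 31 covariates: the count no longer fits in a 32-bit signed integer
print(n_models_main(31) > .Machine$integer.max)

# 150 covariates with pairwise interactions: number of terms, and the
# approximate number of decimal digits in the model count
print(n_terms_pairwise(150))
print(n_terms_pairwise(150) * log10(2))
```

That 2^31 exceeds the largest 32-bit integer lines up with the loop failing at 31 covariates, but whether this is the actual cause inside the package is only a guess.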
So start by dropping the interactions, then consider reducing the number of variables with another method first (factor analysis / principal components, or clustering). The downside of these methods is that you lose some explainability, although you retain predictive power.
The glmulti documentation compares the package with alternatives, highlighting their use cases, advantages and disadvantages.
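As a rough sketch of the variable-reduction idea, the example below uses base R's prcomp on simulated data. The data, the 90% variance cutoff and all variable names are assumptions for illustration, not something from the original experiment.

```r
set.seed(1)
n <- 200
p <- 40

# Hypothetical data: 5 latent factors drive 40 correlated predictors
latent <- matrix(rnorm(n * 5), n, 5)
X <- latent %*% matrix(rnorm(5 * p), 5, p) + matrix(rnorm(n * p, sd = 0.3), n, p)
colnames(X) <- paste0("x", seq_len(p))
y <- latent[, 1] - 2 * latent[, 2] + rnorm(n)

# Principal components on centred and scaled predictors
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Keep the smallest number of components explaining at least 90% of the variance
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
k <- which(cum_var >= 0.90)[1]

# Reduced data set: the response plus the first k component scores
reduced <- data.frame(y = y, pca$x[, seq_len(k), drop = FALSE])

# The reduced set is now narrow enough for an ordinary fit
fit <- lm(y ~ ., data = reduced)
print(k)
print(summary(fit)$r.squared)
```

A data frame like reduced could then be handed to glmulti with level = 1. The fitted coefficients refer to components rather than the original columns, which is the loss of explainability mentioned above.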

PS: I ran this on Win7, 64-bit, 16 GB RAM, R version 3.1.0, glmulti 1.0.7. PPS: Last year, the author of the package released version 2.0, which fixed some of these problems. Read more in the source.
