Regression kriging of binomial data

I use gstat to predict binomial data, but some of the predicted values are greater than 1 or below 0. Does anyone know how I can deal with this problem? Thanks.

    library(sp)
    library(gstat)

    data(meuse)
    data(meuse.grid)
    coordinates(meuse) <- ~x+y
    coordinates(meuse.grid) <- ~x+y
    gridded(meuse.grid) <- TRUE

    # glm model
    glm.lime <- glm(lime~dist+ffreq, meuse, family=binomial(link="logit"))
    summary(glm.lime)

    # variogram of residuals
    var <- variogram(lime~dist+ffreq, data=meuse)
    fit.var <- fit.variogram(var, vgm(nugget=0.9, "Sph",
        range=sqrt(diff(meuse@bbox[1,])^2 + diff(meuse@bbox[2,])^2)/4,
        psill=var(glm.lime$residuals)))
    plot(var, fit.var, plot.nu=T)

    # universal kriging
    kri <- krige(lime~dist+ffreq, meuse, meuse.grid, fit.var)
    spplot(kri[1])

1 answer

In general, with this kind of regression kriging approach there is no guarantee that the model will be valid, because the trend and the residuals are estimated separately. A few notes on your code. You use variogram to calculate the residual variogram, but variogram fits an ordinary linear model to estimate the trend, and therefore also the residuals. You need to take the residuals from your glm and then calculate the residual variogram from those.
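As a rough illustration of that last point, here is a minimal sketch that takes the residuals from the glm fit and computes their variogram with a constant mean. The choice of Pearson residuals, the column name res, and the starting values passed to vgm are my assumptions, not something stated in the question, and other residual types may give a different picture.

```r
library(sp)
library(gstat)

data(meuse)

# Fit the binomial GLM on the plain data frame (lime is a 0/1 factor in meuse)
glm.lime <- glm(lime ~ dist + ffreq, data = meuse,
                family = binomial(link = "logit"))

# Keep the GLM residuals as a new column; the Pearson type is an assumption
meuse$res <- residuals(glm.lime, type = "pearson")

# Promote to a spatial object after the residuals have been attached
coordinates(meuse) <- ~x+y

# res ~ 1 removes only a constant mean, so the trend comes from the GLM,
# not from the linear model that variogram() would otherwise fit
var.res <- variogram(res ~ 1, data = meuse)

# Starting values are illustrative guesses, not tuned
fit.res <- fit.variogram(var.res,
                         vgm(psill = 0.5, model = "Sph",
                             range = 1000, nugget = 0.5))
```

This only replaces the variogram step. It does not by itself keep the kriged predictions inside [0, 1].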

You can do this manually, or have a look at the fit.gstatModel function from the GSIF package. You could also look at binom.krige from the geoRglm package. This thread on R-sig-geo might also be interesting:

Taking residuals from a GLM is quite different from using indicator variables. There may also be some differences depending on which type of GLM residuals you take. Running a GLM and exploring the residuals, for example through semivariograms, is I believe routine practice, but it does not tell you the whole story. Fitting a GLGM (generalised linear geostatistical model) can be more conclusive, because you can make inference on the model parameters and assess the relevance of the spatial term more objectively. This was the original motivation for geoRglm: to do all the modelling at once, rather than in two steps such as fitting a model without correlation and then modelling the residuals. This came with the additional burden of calibrating the MCMC algorithms. Later, spBayes came on the scene and looks promising, offering a more general framework, whereas geoRglm is rather specific to univariate binomial and Poisson models.

As Roger said, there is scope to play around with other alternatives such as a GLMM or perhaps MCMCpack, but this is certainly not ready out of the box, and the code will need to be adapted for spatial purposes.
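For reference, the generalised linear geostatistical model mentioned in the quoted post is commonly written along the following lines for binomial data. The notation is my own and does not come from the thread: $n_i$ is the number of trials at location $x_i$, $d_i$ the covariate vector, and $S$ a zero-mean Gaussian process.

$$
Y_i \mid S \sim \mathrm{Binomial}(n_i, p_i), \qquad
\operatorname{logit}(p_i) = d_i^\top \beta + S(x_i)
$$

Because the spatial term enters on the logit scale, any prediction of $p$ obtained by back-transforming through the inverse logit lies between 0 and 1, unlike kriging applied directly to the 0/1 response.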


Source: https://habr.com/ru/post/952993/

