R: GAM with data subset fitting

Question

I am fitting a generalized additive model using gam from the mgcv package. I have a data table containing my dependent variable Y, an independent variable X, other independent variables Oth, and a two-level factor Fac. I would like to fit the following model:

Y ~ s(X) + Oth

BUT with the additional restriction that the term s(X) is fitted on only one of the two levels of the factor, say Fac==1. The other terms Oth should be fitted on all the data.

I tried fitting s(X, by=Fac), but this affects the fit for Oth. In other words, I would like to express my belief that X relates to Y only if Fac==1; otherwise it makes no sense to model X.

+5

r gam mgcv

yannick Dec 7 '15 at 14:07


2 answers

Cheap trick: use a helper variable that is X if Fac == 1 and 0 elsewhere.

    library("mgcv")
    library("ggplot2")

    # simulate data
    N <- 1e3
    dat <- data.frame(covariate = runif(N),
                      predictor = runif(N),
                      group = factor(sample(0:1, N, TRUE)))
    dat$outcome <- rnorm(N,
                         1 * dat$covariate +
                           ifelse(dat$group == 1,
                                  .5 * dat$predictor + 1.5 * sin(dat$predictor * pi),
                                  0),
                         .1)

    # some plots
    ggplot(dat, aes(x = predictor, y = outcome, col = group, group = group)) +
      geom_point()
    ggplot(dat, aes(x = covariate, y = outcome, col = group, group = group)) +
      geom_point()

    # create auxiliary variable
    dat$aux <- ifelse(dat$group == 1, dat$predictor, 0)

    # fit the data
    fit1 <- gam(outcome ~ covariate + s(predictor, by = group), data = dat)
    fit2 <- gam(outcome ~ covariate + s(aux, by = group), data = dat)

    # compare fits
    summary(fit1)
    summary(fit2)

+5

qenvio Dec 9 '15 at 17:20


Maju116 · Accepted Answer · 2015-12-15T16:06:05+0000

If I understand correctly, you are thinking of some kind of model with this interaction:

    Y ~ Oth + (Fac==1)*s(X)

If you want to "express the belief that X relates to Y only if Fac==1", do not treat Fac as a factor, but as a numeric variable. In that case you get a numeric interaction and only one set of smooth coefficients (with a factor there would be two, one per level). This type of model is a varying coefficient model.

    library("mgcv")

    # some data
    data <- data.frame(th = runif(100),
                       X = runif(100),
                       Y = runif(100),
                       Fac = sample(0:1, 100, TRUE))

    # change Fac to numeric (0/1), so that it is not treated as a factor
    data$Fac <- as.numeric(as.character(data$Fac))

    # then run the model
    gam(Y ~ s(X, by = Fac) + th, data = data)
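One detail worth spelling out is why the conversion goes through as.character. If Fac is stored as an R factor, as.numeric applied to it directly returns the internal level codes (1 and 2) rather than the labels 0 and 1, so the smooth would be scaled by 1 and 2 instead of being switched off. A minimal sketch, using a made-up four-element factor purely for illustration:

```r
# Hypothetical two-level factor, as it might arrive from a data table
Fac <- factor(c("0", "1", "1", "0"))

codes     <- as.numeric(Fac)                # internal level codes, not 0/1
labels    <- as.numeric(as.character(Fac))  # the labels converted to numbers
indicator <- as.numeric(Fac == "1")         # explicit 0/1 indicator for Fac == 1

print(codes)
print(labels)
print(indicator)
```

The explicit indicator form does not depend on how the levels happen to be labelled, which may be the safer choice if the levels are not literally "0" and "1".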

See the description of the by argument in ?s.
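To check that the numeric by variable really switches the smooth off where Fac is 0, one option is to look at the per-term contributions returned by predict with type = "terms". The sketch below is only an illustration: the simulated data, the seed, the true coefficient of 2 on Oth and the sine shape are all assumptions, not part of the question.

```r
library(mgcv)

set.seed(1)  # arbitrary seed, only for reproducibility
n <- 500
d <- data.frame(X   = runif(n),
                Oth = runif(n),
                Fac = sample(0:1, n, replace = TRUE))  # numeric 0/1, not a factor

# Assumed truth: Oth acts on every row, X acts only where Fac == 1
d$Y <- 2 * d$Oth + d$Fac * sin(2 * pi * d$X) + rnorm(n, sd = 0.2)

fit <- gam(Y ~ Oth + s(X, by = Fac), data = d)

# Per-term contributions to the linear predictor, one column per model term
contrib <- predict(fit, type = "terms")
smooth_col <- grep("s(X)", colnames(contrib), fixed = TRUE)

# Rows with Fac == 0 have their smooth basis multiplied by zero,
# so the smooth should contribute nothing there
max_abs_off <- max(abs(contrib[d$Fac == 0, smooth_col]))
print(max_abs_off)

# Oth is estimated from all rows
print(coef(fit)["Oth"])
```

If the model behaves as described, max_abs_off should be zero up to floating point error, while the Oth coefficient is estimated from the full data set.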
