I could not find a simple function for this. The predict function has some code that depends on the presence of a fitted model (for example, it looks up the rank of the model). However, we can write a function that builds a fake glm object that you can use with predict. Here is my first attempt at such a function
makeglm <- function(formula, family, data=NULL, ...) {
    dots <- list(...)
    out <- list()
    tt <- terms(formula, data=data)

    if (!is.null(data)) {
        mf <- model.frame(tt, data)
        # variable names from the formula, minus the response
        vn <- sapply(attr(tt, "variables")[-1], deparse)
        if ((yvar <- attr(tt, "response")) > 0)
            vn <- vn[-yvar]
        # record factor levels so predict() can build the model matrix
        xlvl <- lapply(data[vn], function(x)
            if (is.factor(x)) levels(x)
            else if (is.character(x)) levels(as.factor(x))
            else NULL)
        attr(out, "xlevels") <- xlvl[!vapply(xlvl, is.null, NA)]
        attr(tt, "dataClasses") <- sapply(data[vn], stats:::.MFclass)
    }
    out$terms <- tt

    # build the coefficient vector from the ... arguments
    coef <- numeric(0)
    stopifnot(length(dots) > 1 & !is.null(names(dots)))
    for (i in seq_along(dots)) {
        if ((n <- names(dots)[i]) != "") {
            v <- dots[[i]]
            if (!is.null(names(v))) {
                # a named vector gives one coefficient per factor level
                coef[paste0(n, names(v))] <- v
            } else {
                stopifnot(length(v) == 1)
                coef[n] <- v
            }
        } else {
            # an unnamed argument is taken as the intercept
            coef["(Intercept)"] <- dots[[i]]
        }
    }
    out$coefficients <- coef
    out$rank <- length(coef)
    out$qr <- list(pivot = seq_len(out$rank))

    # accept either a family object or a family function, as glm() does
    out$family <- if (class(family) == "family") {
        family
    } else if (class(family) == "function") {
        family()
    } else {
        stop(paste("invalid family class:", class(family)))
    }

    # dummy values so print.glm() does not complain
    out$deviance <- 1
    out$null.deviance <- 1
    out$aic <- 1
    class(out) <- c("glm", "lm")
    out
}
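If you are curious which pieces of a fitted model predict actually needs, one quick (and admittedly hacky) way to check is to search the source of predict.lm and predict.glm for references to the model object. It will not catch everything (some access goes through helpers such as qr.lm), but it shows most of the components the fake object has to provide:

# list the lines of the predict methods that read from the model object
grep("object\\$", deparse(stats::predict.lm), value = TRUE)
grep("object\\$", deparse(stats::predict.glm), value = TRUE)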
So this function creates an object and populates all the values that predict and print expect to find on such an object. Now we can try it out. First, here is some test data
set.seed(15)
dd <- data.frame(
    X1 = runif(50),
    X2 = factor(sample(letters[1:4], 50, replace = TRUE)),
    X3 = rpois(50, 5),
    Outcome = sample(0:1, 50, replace = TRUE)
)
And we can fit a standard binomial model with
mymodel<-glm(Outcome~X1+X2+X3, data=dd, family=binomial)
Which gives
Call:  glm(formula = Outcome ~ X1 + X2 + X3, family = binomial, data = dd)

Coefficients:
(Intercept)           X1          X2b          X2c          X2d           X3
    -0.4411       0.8853       1.8384       0.9455       1.5059      -0.1818

Degrees of Freedom: 49 Total (i.e. Null);  44 Residual
Null Deviance:      68.03
Residual Deviance: 62.67     AIC: 74.67
Now let's say we want to try out, on the same data, a model whose coefficients we read in a publication. Here we use the makeglm function
newmodel <- makeglm(Outcome~X1+X2+X3, binomial, data=dd, -.5, X1=1, X2=c(b=1.5, c=1, d=1.5), X3=-.15)
The first parameter is the model formula. This defines the response and covariates in the same way as with glm. Then you specify the family, just as you would with glm(). You also need to pass a data frame so that R can sniff out the correct data type for each of the variables involved. It also identifies all the factor variables and their levels from that data.frame, so it can be new data coded the same way as the original data.frame, or it can be the original data itself.
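As with glm(), the family can be given either as the family function itself or as a family object you have already constructed; makeglm handles both. For example, this hypothetical alternative call (newmodel2 is just a throwaway name) is equivalent to the one above:

# passing a constructed family object instead of the bare function
newmodel2 <- makeglm(Outcome~X1+X2+X3, binomial(link = "logit"), data=dd,
                     -.5, X1=1, X2=c(b=1.5, c=1, d=1.5), X3=-.15)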
Now we specify the coefficients to use in our model. The coefficients are filled in using the parameter names. An unnamed parameter is used as the intercept. If you have a factor, you need to supply a coefficient for each of its non-reference levels via a named vector. Here I simply decided to round the fitted estimates to "nice" numbers.
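To check that the coefficients ended up where we want them, we can look at the coefficient vector of the fake model; the factor coefficients get the same concatenated names (X2b, X2c, X2d) that model.matrix() produces:

# should be a named vector: (Intercept), X1, X2b, X2c, X2d, X3
coef(newmodel)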
Now I can use the new model with predict.
predict(mymodel, type="response")
predict(newmodel, newdata=dd, type="response")
Here I call predict on both the original model and the new model with my specified coefficients, using the original data. We can see that the predicted probabilities have changed a bit.
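Of course, the real point of the fake model is to score data the original authors never saw. Here is a minimal sketch with a made-up newpatients data frame; note that factor columns should carry the full set of levels used in the formula:

# hypothetical new observations, with X2 defined on all four levels
newpatients <- data.frame(
    X1 = c(0.1, 0.9),
    X2 = factor(c("a", "d"), levels = letters[1:4]),
    X3 = c(2, 7)
)
predict(newmodel, newdata = newpatients, type = "response")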
Now, I have not thoroughly tested this function, so use it at your own risk. It does not do as much error checking as it probably should. Maybe someone else knows a better way.
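Given that caveat, one sanity check I would suggest is to recompute the predictions by hand from the model matrix and compare them to what predict returns; this just applies the inverse-logit link to X %*% beta and has nothing specific to makeglm in it:

# build the design matrix ourselves and apply the inverse-logit link
X <- model.matrix(~ X1 + X2 + X3, dd)
byhand <- plogis(drop(X[, names(coef(newmodel))] %*% coef(newmodel)))
all.equal(byhand, predict(newmodel, newdata = dd, type = "response"))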