LC50 / LD50 confidence intervals from multiple regression glm with interaction

Question

LC50 / LD50 confidence intervals from multiple regression glm with interaction

I have a quasi-binomial glm with two continuous explanatory variables (say, "LogPesticide" and "LogFood") and interaction. I would like to calculate the LC50 of a pesticide at confidence intervals in different amounts of food (for example, minimum and maximum nutritional values). How can this be achieved?

Example: first I create a dataset.

mydata <- data.frame( LogPesticide = rep(log(c(0, 0.1, 0.2, 0.4, 0.8, 1.6) + 0.05), 4), LogFood = rep(log(c(1, 2, 4, 8)), each = 6) ) set.seed(seed=16) growth <- function(x, a = 1, K = 1, r = 1) { # Logistic growth function. a = position of turning point Fx <- (K * exp(r * (x - a))) / (1 + exp(r * (x - a))) # K = carrying capacity return(Fx) # r = growth rate (larger r -> narrower curve) } y <- rep(NA, length = length(mydata$LogPesticide)) y[mydata$LogFood == log(1)] <- growth(x = mydata$LogPesticide[mydata$LogFood == log(1)], a = log(0.1), K = 1, r = 6) y[mydata$LogFood == log(2)] <- growth(x = mydata$LogPesticide[mydata$LogFood == log(2)], a = log(0.2), K = 1, r = 4) y[mydata$LogFood == log(4)] <- growth(x = mydata$LogPesticide[mydata$LogFood == log(4)], a = log(0.4), K = 1, r = 3) y[mydata$LogFood == log(8)] <- growth(x = mydata$LogPesticide[mydata$LogFood == log(8)], a = log(0.8), K = 1, r = 1) mydata$Dead <- rbinom(n = length(y), size = 20, prob = y) mydata$Alive <- 20 - mydata$Dead mydata$Mortality <- cbind(mydata$Dead, mydata$Alive)

Then I put the full glm. Diagnostics of the model is in order, and all conditions of interaction are significant.

 model <- glm(Mortality ~ LogPesticide * LogFood, family = quasibinomial, data = mydata) plot(model) Anova(model) summary(model)

I tried to evaluate LC50s with a dose of .p () from the MASS package. If LogFood was a factor, it would work when I fit the model again, as discussed in this post . But with two continuous explanatory variables, you get only 1 interception, 2 slopes and the difference of the slopes (for interaction).

I can evaluate LC50s with effect (), but don't know how to get the associated CI for LogPesticide.

 # Create a set of fitted values. library(effects) food.min <- round(min(model$model$LogFood), 3) food.max <- round(max(model$model$LogFood), 3) eff <- effect("LogPesticide:LogFood", model, xlevels = list(LogPesticide = seq(min(model$model$LogPesticide), max(model$model$LogPesticide), length = 100), LogFood = c(food.min, food.max))) eff2 <- as.data.frame(eff) # Find fitted values closest to 0.5 when LogFood is minimal and maximal. fit.min <- which.min(abs(eff2$fit[eff2$LogFood == food.min] - 0.5)) fit.min <- eff2$fit[eff2$LogFood == food.min][fit.min] fit.max <- which.min(abs(eff2$fit[eff2$LogFood == food.max] - 0.5)) fit.max <- eff2$fit[eff2$LogFood == food.max][fit.max] # Use those fitted values to predict the LC50s. lc50.min <- eff2$LogPesticide[eff2$fit == fit.min & eff2$LogFood == food.min] lc50.max <- eff2$LogPesticide[eff2$fit == fit.max & eff2$LogFood == food.max] # Plot the results. plot(fit ~ LogPesticide, data = eff2[eff2$LogFood == round(min(model$model$LogFood), 3),], type = "l") lines(fit ~ LogPesticide, data = eff2[eff2$LogFood == round(max(model$model$LogFood), 3),], col = "red") points(y = 0.5, x = lc50.min, pch = 19) points(y = 0.5, x = lc50.max, pch = 19, col = "red")

From the dose code .p () I see that you need to use the vcov matrix. effect () also provides the vcov matrix, but I could not change the dose of .p () to work correctly with this information. I would be grateful for any ideas!

+4

r logistic-regression glm

Jeremias Feb 17 '16 at 16:18

source share

1 answer

Ben bolker · Answer 1 · 2016-02-23T22:19:38+0000

Data replication ( update : new version of ggplot2 may not like strange data frames with matrices in them?)

 mydata <- data.frame( LogPesticide = rep(log(c(0, 0.1, 0.2, 0.4, 0.8, 1.6) + 0.05), 4), LogFood = rep(log(c(1, 2, 4, 8)), each = 6) ) set.seed(seed=16) growth <- function(x, a = 1, K = 1, r = 1) { ## Logistic growth function. a = position of turning point ## K = carrying capacity ## r = growth rate (larger r -> narrower curve) return((K * exp(r * (x - a))) / (1 + exp(r * (x - a)))) } rlf <- data.frame(LogFood=log(c(1,2,4,8)), a=log(c(0.1,0.2,0.4,0.8)), r=6,4,3,1) mydata <- merge(mydata,rlf) mydata <- plyr::mutate(mydata, y=growth(LogPesticide,a,K=1,r), Dead=rbinom(n=nrow(mydata),size=20,prob=y), N=20, Alive=N-Dead, pmort=Dead/N) model <- glm(pmort ~ LogPesticide * LogFood, family = quasibinomial, data = mydata, weights=N)

For comfort:

 cc <- setNames(coef(model),c("b_int","b_P","b_F","b_PF")) vv <- vcov(model) dimnames(vv) <- list(names(cc),names(cc))

Baseline frame of forecast data:

 pframe <- with(mydata, expand.grid(LogPesticide=seq(min(LogPesticide),max(LogPesticide), length=51), LogFood=unique(LogFood))) pframe$pmort <- predict(model,newdata=pframe,type="response")

Now let me break it. The predicted value at this level of (log) food F and (log) pesticide P is

 logit(surv) = b_int + b_P*P + b_F*F + b_PF*F*P

Thus, the logistic curve for pesticide at level F is

 logit(surv) = (b_int+b_F*F) + (b_P+b_PF*F)*P

We want to know the value of P for which logit (surv) is 0 (LC50), so we need

 0 = (b_int+b_F*F) + (b_P+b_PF*F)*P50 P50 = -(b_int+b_F*F)/(b_P+b_PF*F)

Code translation:

 P50mean <- function(logF) { with(as.list(cc), -(b_int+b_F*logF)/(b_P+b_PF*logF)) } with(mydata,P50mean(c(min=min(LogFood),max=max(LogFood)))) pLC50 <- data.frame(LogFood=unique(mydata$LogFood)) pLC50 <- transform(pLC50, pmort=0.5, LogPesticide=P50mean(LogFood))

To obtain confidence intervals, the two easiest methods are (1) the delta method and (2) retrospective prediction intervals (also referred to as “parametric bytes” in some contexts). (You can also perform nonparametric loading.)

Delta h2 method>

I tried to do it manually, but realized that it was getting too hairy (all four coefficients are strongly correlated, and all these correlations need to be tracked in the calculation - it's not as simple as the usual formulas, where the numerator and denominator are independent values ...)

 library("emdbook") deltavar(-(b_int+b_F*2)/(b_P+b_PF*2),meanval=cc,Sigma=vv) ## have to be a bit fancy here with eval/substitute ... pLC50$var1 <- sapply(pLC50$LogFood, function(logF) eval(substitute( deltavar(-(b_int+b_F*logF)/(b_P+b_PF*logF), meanval=cc,Sigma=vv), list(logF=logF))))

Population prediction intervals

This suggests (slightly weaker) that the distribution of sample parameters is multidimensional.

 PP <- function(logF,n=1000) { b <- MASS::mvrnorm(n,mu=cc,Sigma=vv) pred <- with(as.data.frame(b), -(b_int+b_F*logF)/(b_P+b_PF*logF)) return(var(pred)) } set.seed(101) pLC50$var2 <- sapply(pLC50$LogFood,PP)

PPI would actually allow us to weaken our assumptions a bit by obtaining the quantiles of the distribution of the projected LC50 ... as it turns out (see below), PPI-based confidence intervals are slightly wider than Delta but they are not terribly far apart.

Now sketch the whole mess:

 library(ggplot2); theme_set(theme_bw()) gg0 <- ggplot(mydata,aes(LogPesticide,pmort, colour=factor(LogFood), fill = factor(LogFood))) + geom_point() + ## individual fits -- a bit ugly ## geom_smooth(method="glm",aes(weight=N), ## method.args=list(family=binomial),alpha=0.1)+ geom_line(data=pframe,linetype=2)+ geom_point(data=pLC50,pch=5,size=4)+ geom_hline(yintercept=0.5,col="gray") gg0 + geom_errorbarh(data=pLC50,lwd=2,alpha=0.5, aes(xmin=LogPesticide-1.96*sqrt(var1), xmax=LogPesticide+1.96*sqrt(var1)), height=0)+ geom_errorbarh(data=pLC50, aes(xmin=LogPesticide-1.96*sqrt(var2), xmax=LogPesticide+1.96*sqrt(var2)), height=0.02)

LC50 / LD50 confidence intervals from multiple regression glm with interaction

Delta h2 method>

Population prediction intervals

More articles: