Multiple logistic regression with the interaction of quantitative and qualitative explanatory variables

As a continuation of this question, I applied multiple logistic regression with the interaction between quantitative and qualitative explanatory variables. MWE is shown below:

Type <- rep(x=LETTERS[1:3], each=5) Conc <- rep(x=seq(from=0, to=40, by=10), times=3) Total <- 50 Kill <- c(10, 30, 40, 45, 38, 5, 25, 35, 40, 32, 0, 32, 38, 47, 40) df <- data.frame(Type, Conc, Total, Kill) fm1 <- glm( formula = Kill/Total~Type*Conc , family = binomial(link="logit") , data = df , weights = Total ) summary(fm1) Call: glm(formula = Kill/Total ~ Type * Conc, family = binomial(link = "logit"), data = df, weights = Total) Deviance Residuals: Min 1Q Median 3Q Max -4.871 -2.864 1.204 1.706 2.934 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.65518 0.23557 -2.781 0.00541 ** TypeB -0.34686 0.33677 -1.030 0.30302 TypeC -0.66230 0.35419 -1.870 0.06149 . Conc 0.07163 0.01152 6.218 5.04e-10 *** TypeB:Conc -0.01013 0.01554 -0.652 0.51457 TypeC:Conc 0.03337 0.01788 1.866 0.06201 . --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 277.092 on 14 degrees of freedom Residual deviance: 96.201 on 9 degrees of freedom AIC: 163.24 Number of Fisher Scoring iterations: 5 anova(object=fm1, test="LRT") Analysis of Deviance Table Model: binomial, link: logit Response: Kill/Total Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 14 277.092 Type 2 6.196 12 270.895 0.04513 * Conc 1 167.684 11 103.211 < 2e-16 *** Type:Conc 2 7.010 9 96.201 0.03005 * --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 df$Pred <- predict(object=fm1, data=df, type="response") df1 <- with(data=df, expand.grid(Type=levels(Type) , Conc=seq(from=min(Conc), to=max(Conc), length=51) ) ) df1$Pred <- predict(object=fm1, newdata=df1, type="response") library(ggplot2) ggplot(data=df, mapping=aes(x=Conc, y=Kill/Total, color=Type)) + geom_point() + geom_line(data=df1, mapping=aes(x=Conc, y=Pred), linetype=2) + geom_hline(yintercept=0.5,col="gray") 

enter image description here

I want to calculate LD50 , LD90 and LD95 at their confidence intervals. Since the interaction is significant, so I want to calculate the LD50 , LD90 and LD95 with their confidence intervals for each Type (A, B, and C) separately.



LD stands for lethal dose. This is the amount of substance needed to kill X% (LD50 = 50%) of the test population.

Edited Type is a qualitative variable representing various types of drugs, and Conc is a quantitative variable representing various concentrations of drugs.

+5
source share
2 answers

You use the drc package to match dose-response logic models.

First choose a model

 require(drc) mod <- drm(Kill/Total ~ Conc, curveid = Type, weights = Total, data = df, fct = L.4(fixed = c(NA, 0, 1, NA)), type = 'binomial') 

Here curveid= indicates the grouping of data, and fct= indicates a 4-digit logistic function with parameters for the lower and upper links fixed at 0 and 1.

Please note that the differences from glm minor:

 df2 <- with(data=df, expand.grid(Conc=seq(from=min(Conc), to=max(Conc), length=51), Type=levels(Type))) df2$Pred <- predict(object=mod, newdata = df2) 

Here's a histogram of differences with glm prediction

 hist(df2$Pred - df1$Pred) 

enter image description here

Estimate effective doses (and CI) from the model

This is easy with the ED() function:

 ED(mod, c(50, 90, 95), interval = 'delta') Estimated effective doses (Delta method-based confidence interval(s)) Estimate Std. Error Lower Upper A:50 9.1468 2.3257 4.5885 13.705 A:90 39.8216 4.3444 31.3068 48.336 A:95 50.2532 5.8773 38.7338 61.773 B:50 16.2936 2.2893 11.8067 20.780 B:90 52.0214 6.0556 40.1527 63.890 B:95 64.1714 8.0068 48.4784 79.864 C:50 12.5477 1.5568 9.4963 15.599 C:90 33.4740 2.7863 28.0129 38.935 C:95 40.5904 3.6006 33.5334 47.648 

For each group we get ED50, ED90 and ED95 with CI.

+4
source

Your link selection function (\ eta = X \ hat \ beta) has variance for the new observation (x_0): V_ {x_0} = x_0 ^ T (X ^ TWX) ^ {- 1} x_0

Thus, for a set of candidate doses, the expected percentage of deaths can be predicted using the inverse function:

 newdata= data.frame(Type=rep(x=LETTERS[1:3], each=5), Conc=rep(x=seq(from=0, to=40, by=10), times=3)) mm <- model.matrix(fm1, newdata) # get link on link terms (could also use predict) eta0 <- apply(mm, 1, function(i) sum(i * coef(fm1))) # inverse logit function ilogit <- function(x) return(exp(x) / (1+ exp(x))) # predicted probs ilogit(eta0) # for comfidence intervals we can use a normal approximation lethal_dose <- function(mod, newdata, alpha) { qn <- qnorm(1 - alpha /2) mm <- model.matrix(mod, newdata) eta0 <- apply(mm, 1, function(i) sum(i * coef(fm1))) var_mod <- vcov(mod) se <- apply(mm, 1, function(x0, var_mod) { sqrt(t(x0) %*% var_mod %*% x0)}, var_mod= var_mod) out <- cbind(ilogit(eta0 - qn * se), ilogit(eta0), ilogit(eta0 + qn * se)) colnames(out) <- c("LB_CI", "point_est", "UB_CI") return(list(newdata=newdata, eff_dosage= out)) } lethal_dose(fm1, newdata, alpha= 0.05)$eff_dosage $eff_dosage LB_CI point_est UB_CI 1 0.2465905 0.3418240 0.4517820 2 0.4361703 0.5152749 0.5936215 3 0.6168088 0.6851225 0.7462674 4 0.7439073 0.8166343 0.8722545 5 0.8315325 0.9011443 0.9439316 6 0.1863738 0.2685402 0.3704385 7 0.3289003 0.4044270 0.4847691 8 0.4890420 0.5567386 0.6223914 9 0.6199426 0.6990808 0.7679095 10 0.7207340 0.8112133 0.8773662 11 0.1375402 0.2112382 0.3102215 12 0.3518053 0.4335213 0.5190198 13 0.6104540 0.6862145 0.7531978 14 0.7916268 0.8620545 0.9113443 15 0.8962097 0.9469715 0.9736370 

Instead of doing it manually, you can also manipulate:

predict.glm(fm1, newdata, se=TRUE)$se.fit

+3
source

Source: https://habr.com/ru/post/1245250/


All Articles