ggplot: Extend the regression line to the predicted value using a different line type

Is there an easy way to extend a dashed line from the end of the solid regression line to the predicted value?

The following is my main attempt:

    library(ggplot2)

    x = rnorm(10)
    y = 5 + x + rnorm(10, 0, 0.4)
    my_lm <- lm(y ~ x)
    summary(my_lm)
    my_intercept <- my_lm$coef[1]
    my_slope <- my_lm$coef[2]
    my_pred = predict(my_lm, data.frame(x = (max(x) + 1)))

    ggdf <- data.frame(
      x = c(x, max(x) + 1),
      y = c(y, my_pred),
      obs_Or_Pred = c(rep("Obs", 10), "Pred")
    )

    ggplot(ggdf, aes(x = x, y = y, group = obs_Or_Pred)) +
      geom_point(size = 3, aes(colour = obs_Or_Pred)) +
      geom_abline(
        intercept = my_intercept,
        slope = my_slope,
        aes(linetype = obs_Or_Pred)
      )

This does not give the result that I was hoping to see. I looked at other answers on SO and did not see anything simple. The best I came up with is:

    # Two extra rows: the end of the observed line (not shown as a point)
    # and the predicted value at max(x) + 1.
    ggdf2 <- data.frame(
      x = c(x, max(x), max(x) + 1),
      y = c(y, my_intercept + max(x) * my_slope, my_pred),
      obs_Or_Pred = c(rep("Obs", 10), "Pred", "Pred"),
      show_Data_Point = c(rep(TRUE, 10), FALSE, TRUE)
    )

    ggplot(ggdf2, aes(x = x, y = y, group = obs_Or_Pred)) +
      geom_point(
        data = ggdf2[ggdf2[, "show_Data_Point"], ],
        size = 3,
        aes(colour = obs_Or_Pred)
      ) +
      geom_smooth(
        method = "lm",
        se = FALSE,
        aes(colour = obs_Or_Pred, linetype = obs_Or_Pred)
      )

This gives a result that is correct, but I had to include an extra column that determines if I want to show data points. If I do not, I get the second of these two graphs, which has an additional point at the end of the regression line:

[Image: two plots; the second has an additional point at the end of the regression line]

Is there an easier way to tell ggplot to predict one point from a linear model and draw a dashed line?

2 answers

You can plot the points using only your actual data and build a separate prediction data frame for the lines. Note that max(x) appears twice, so that it can serve as the endpoint of both the Obs and the Pred lines. We also map the shape aesthetic so that we can remove the point marker that would otherwise appear in the legend key for Pred.

    # Build prediction data frame
    pred_x = c(min(x), rep(max(x), 2), max(x) + 1)
    pred_lines = data.frame(x = pred_x,
                            y = predict(my_lm, data.frame(x = pred_x)),
                            obs_Or_Pred = rep(c("Obs", "Pred"), each = 2))

    ggplot(pred_lines, aes(x, y, colour = obs_Or_Pred, shape = obs_Or_Pred,
                           linetype = obs_Or_Pred)) +
      geom_point(data = data.frame(x, y, obs_Or_Pred = "Obs"), size = 3) +
      geom_line(size = 1) +  # in ggplot2 >= 3.4.0, prefer linewidth = 1
      scale_shape_manual(values = c(16, NA)) +
      theme_bw()
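The endpoint construction above can be wrapped in a small helper, which makes the shared max(x) endpoint easy to check. This is only a sketch: `make_pred_lines` is a hypothetical name that is not part of the original answer, it assumes the model's single predictor is called `x`, and the seed is arbitrary.

```r
# Hypothetical helper (not from the original answer): build the four
# endpoints for the observed segment and its predicted extension.
# Assumes the model has a single predictor named `x`.
make_pred_lines <- function(model, x_obs, x_new) {
  # max(x_obs) appears twice so that the two segments share an endpoint
  pred_x <- c(min(x_obs), max(x_obs), max(x_obs), x_new)
  data.frame(
    x = pred_x,
    y = unname(predict(model, newdata = data.frame(x = pred_x))),
    obs_Or_Pred = rep(c("Obs", "Pred"), each = 2)
  )
}

set.seed(1)  # arbitrary seed, only for reproducibility
x <- rnorm(10)
y <- 5 + x + rnorm(10, 0, 0.4)
my_lm <- lm(y ~ x)

pred_lines <- make_pred_lines(my_lm, x, max(x) + 1)
pred_lines
```

The resulting data frame can be passed to the same ggplot() call as above in place of the hand-built pred_lines.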


A bit of a hack: you can use scale_x_continuous(limits = ...) to set the range of x values used for prediction. Draw the prediction line first with fullrange = TRUE, then add the "observed" line on top. Note that the overplotting is not perfect, so you may want to slightly increase the size of the observed line.

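    # d is not defined in the answer; assumed to be the observed data
    d <- data.frame(x, y)

    ggplot(d, aes(x, y)) +
      geom_point(aes(color = "obs")) +
      # Prediction line first, extended over the full x range
      geom_smooth(aes(color = "pred", linetype = "pred"),
                  se = FALSE, method = "lm", fullrange = TRUE) +
      # Observed line on top, slightly thicker to cover the one below
      geom_smooth(aes(color = "obs", linetype = "obs"),
                  size = 1.05, se = FALSE, method = "lm") +
      # Same name on both scales so the legends merge
      scale_linetype_discrete(name = "obs_or_pred") +
      scale_color_discrete(name = "obs_or_pred") +
      scale_x_continuous(limits = c(NA, max(x) + 1))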

However, I tend to agree with Gregor: "ggplot is a plotting package, not a modeling package."
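In the spirit of that remark, one option is to do the modeling entirely outside ggplot and only draw the results. The following is a minimal sketch under that assumption, not the method of either answer: the names `fit` and `ends`, the seed, and the red colour for the predicted point are illustrative choices. It computes three points on the fitted line with predict() and draws the two pieces with annotate("segment"), so no legend is produced.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only for reproducibility
d <- data.frame(x = rnorm(10))
d$y <- 5 + d$x + rnorm(10, 0, 0.4)

# Modeling step: fit once and predict at the three x values we need
fit <- lm(y ~ x, data = d)
ends <- data.frame(x = c(min(d$x), max(d$x), max(d$x) + 1))
ends$y <- unname(predict(fit, newdata = ends))

# Plotting step: solid segment over the observed range, dashed extension
# from max(x) to the predicted point
p <- ggplot(d, aes(x, y)) +
  geom_point(size = 3) +
  annotate("segment",
           x = ends$x[1], y = ends$y[1],
           xend = ends$x[2], yend = ends$y[2]) +
  annotate("segment",
           x = ends$x[2], y = ends$y[2],
           xend = ends$x[3], yend = ends$y[3],
           linetype = "dashed") +
  annotate("point", x = ends$x[3], y = ends$y[3], size = 3, colour = "red")
p
```

If a legend is needed, the data-frame approach with mapped colour and linetype aesthetics is the better fit, since annotate() layers are not mapped to scales.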


Source: https://habr.com/ru/post/1274229/
