Piecewise regression with a quadratic polynomial and a straight line joining smoothly at a break point

I want to fit a piecewise regression with one break point xt, so that for x < xt we have a quadratic polynomial and for x >= xt a straight line. The two pieces should join smoothly, with continuity up to the 1st derivative at xt. Here is a picture of how it might look:

[figure: sketch of the desired piecewise regression]

I parameterize my piecewise regression function as:

[figure: regression function]

where a, b, c and xt are the parameters that need to be estimated.

I want to compare this model with an ordinary quadratic polynomial regression over the entire range in terms of adjusted R-squared.

Here are my data:

    y <- c(1, 0.59, 0.15, 0.078, 0.02, 0.0047, 0.0019,
           1, 0.56, 0.13, 0.025, 0.0051, 0.0016, 0.00091,
           1, 0.61, 0.12, 0.026, 0.0067, 0.00085, 4e-04)
    x <- c(0, 5.53, 12.92, 16.61, 20.3, 23.07, 24.92,
           0, 5.53, 12.92, 16.61, 20.3, 23.07, 24.92,
           0, 5.53, 12.92, 16.61, 20.3, 23.07, 24.92)

[figure: scatter plot of the data]

My attempt is as follows, for a known xt:

    z <- pmax(0, x - xt)
    x1 <- pmin(x, xt)
    fit <- lm(y ~ x1 + I(x1 ^ 2) + z - 1)

But the straight line does not seem to be tangent to the quadratic polynomial at xt. Where am I going wrong?



In this answer I will demonstrate a reproducible example. Please make sure you have the functions defined in my other answer.

    ## we first generate a true model
    set.seed(0)
    x <- runif(100)                        ## sample points on [0, 1]
    beta <- c(0.1, -0.2, 2)                ## true coefficients
    X <- getX(x, 0.6)                      ## model matrix with true break point at 0.6
    y <- X %*% beta + rnorm(100, 0, 0.08)  ## observations with Gaussian noise
    plot(x, y)

[figure: scatter plot of the simulated data]

Now suppose we do not know c, and we search for it on an evenly spaced grid:

    c.grid <- seq(0.1, 0.9, 0.05)
    fit <- choose.c(x, y, c.grid)
    fit$c

[figure: RSS against c.grid, as plotted by choose.c]

The RSS criterion chose c = 0.55. This is slightly different from the true value of 0.6, but from the plot we can see that the RSS curve changes little over [0.5, 0.6], so I am happy with this.
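If you want, you can also zoom the grid in on that region (an optional refinement; the step size here is arbitrary). Below I keep working with the fit from the coarse grid.

    ## optional, illustrative only: refine the grid around the selected break point
    fit.fine <- choose.c(x, y, seq(0.5, 0.6, by = 0.005))
    fit.fine$c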

The returned model fit contains rich information:

    str(fit)
    #List of 12
    # $ coefficients : num [1:3] 0.114 -0.246 2.366
    # $ residuals    : num [1:100] 0.03279 -0.01515 0.21188 -0.06542 0.00763 ...
    # $ fitted.values: num [1:100] 0.0292 0.3757 0.2329 0.1087 0.0263 ...
    # $ R            : num [1:3, 1:3] -10 0.1 0.1 0.292 2.688 ...
    # $ sig2         : num 0.00507
    # $ coef.table   : num [1:3, 1:4] 0.1143 -0.2456 2.3661 0.0096 0.0454 ...
    #  ..- attr(*, "dimnames")=List of 2
    #  .. ..$ : chr [1:3] "beta0" "beta1" "beta2"
    #  .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
    # $ aic          : num -240
    # $ bic          : num -243
    # $ c            : num 0.55
    # $ RSS          : num 0.492
    # $ r.squared    : num 0.913
    # $ adj.r.squared: num 0.911

We can extract the summary table for the coefficients:

    fit$coef.table
    #         Estimate  Std. Error    t value     Pr(>|t|)
    #beta0  0.1143132 0.009602697  11.904286 1.120059e-20
    #beta1 -0.2455986 0.045409356  -5.408546 4.568506e-07
    #beta2  2.3661097 0.169308226  13.975161 5.730682e-25
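For the comparison asked about in the question, the piecewise fit's adjusted R-squared can be set against that of an ordinary quadratic fitted over the whole range. A minimal sketch on this toy data (fit.quad is just an illustrative name):

    ## illustrative comparison with a plain quadratic over the entire range
    fit$adj.r.squared                  ## piecewise model (0.911 above)
    yv <- drop(y)                      ## y from the simulation is a one-column matrix
    fit.quad <- lm(yv ~ x + I(x ^ 2))  ## ordinary quadratic, no break point
    summary(fit.quad)$adj.r.squared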

Finally, we want to see a plot of the predictions.

    x.new <- seq(0, 1, 0.05)
    p <- pred(fit, x.new)
    head(p)
    #           fit     se.fit       lwr       upr
    #[1,] 0.9651406 0.02903484 0.9075145 1.0227668
    #[2,] 0.8286400 0.02263111 0.7837235 0.8735564
    #[3,] 0.7039698 0.01739193 0.6694516 0.7384880
    #[4,] 0.5911302 0.01350837 0.5643199 0.6179406
    #[5,] 0.4901212 0.01117924 0.4679335 0.5123089
    #[6,] 0.4009427 0.01034868 0.3804034 0.4214819

We can make a plot:

    plot(x, y, cex = 0.5)
    matlines(x.new, p[, -2], col = c(1, 2, 2), lty = c(1, 2, 2), lwd = 2)

[figure: fitted curve with 95% confidence band]


This is an excellent (if perhaps difficult) exercise for digesting the theory and implementation of linear models. My answer comes in two parts:

  • Part 1 (this answer) introduces the parameterization used and shows how this piecewise regression reduces to an ordinary least-squares problem. R functions are provided for model estimation, break-point selection, and prediction.
  • Part 2 (the other answer) demonstrates a reproducible toy example of how to use the functions defined here.

I have to use a different parameterization, because the one you gave in your question is wrong! Your parameterization only enforces continuity of the function value, not of the first derivative. In your attempt, the slope just to the left of xt is the x1 coefficient plus twice the I(x1^2) coefficient times xt, while the slope just to the right of xt is the z coefficient, and nothing in the fit forces these two to be equal. That is why your fitted line is not tangent to the quadratic polynomial at xt.


Parameterization

    ## generate design matrix for break point `c`
    ## columns: intercept, (x - c), and pmin(x - c, 0)^2; the quadratic term is
    ## active only to the left of `c`, so the value and the slope are
    ## automatically continuous at `c`
    getX <- function (x, c) {
      x <- x - c
      cbind("beta0" = 1, "beta1" = x, "beta2" = pmin(x, 0) ^ 2)
    }
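To convince yourself that this basis really gives a smooth join, here is a quick numerical check (purely illustrative; the coefficients below are arbitrary):

    ## illustrative check: value and slope of the curve spanned by `getX` at the break
    b  <- c(0.1, -0.2, 2)                  ## arbitrary example coefficients
    c0 <- 0.6; eps <- 1e-6
    f  <- function(x) drop(getX(x, c0) %*% b)
    f(c0 - eps) - f(c0 + eps)              ## ~ 0: the value is continuous
    (f(c0) - f(c0 - eps)) / eps            ## left slope, ~ -0.2 (the linear coefficient)
    (f(c0 + eps) - f(c0)) / eps            ## right slope, also ~ -0.2: the line is tangent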

The break point c is then found by a grid search, with RSS (or, equally well, AIC / BIC) as the selection criterion.

The est function below wraps .lm.fit (for maximum efficiency) to estimate the model and return its summaries for a given c:

    ## `x`, `y` give data points; `c` is known break point
    est <- function (x, y, c) {
      ## model matrix
      X <- getX(x, c)
      p <- dim(X)[2L]
      ## solve least squares with QR factorization
      fit <- .lm.fit(X, y)
      ## compute Pearson estimate of `sigma ^ 2`
      r <- c(fit$residuals)
      n <- length(r)
      RSS <- c(crossprod(r))
      sig2 <- RSS / (n - p)
      ## coefficients summary table
      beta <- fit$coefficients
      R <- "dimnames<-"(fit$qr[1:p, ], NULL)
      Rinv <- backsolve(R, diag(p))
      se <- sqrt(rowSums(Rinv ^ 2) * sig2)
      tstat <- beta / se
      pval <- 2 * pt(abs(tstat), n - p, lower.tail = FALSE)
      tab <- matrix(c(beta, se, tstat, pval), nrow = p, ncol = 4L,
                    dimnames = list(dimnames(X)[[2L]],
                                    c("Estimate", "Std. Error", "t value", "Pr(>|t|)")))
      ## 2 * negative log-likelihood
      nega2logLik <- n * log(2 * pi * sig2) + (n - p)
      ## AIC / BIC
      aic <- nega2logLik + 2 * (p + 1)
      bic <- nega2logLik + log(n) * (p + 1)
      ## multiple R-squared and adjusted R-squared
      TSS <- c(crossprod(y - sum(y) / n))
      r.squared <- 1 - RSS / TSS
      adj.r.squared <- 1 - sig2 * (n - 1) / TSS
      ## return
      list(coefficients = beta, residuals = r, fitted.values = c(X %*% beta),
           R = R, sig2 = sig2, coef.table = tab, aic = aic, bic = bic,
           c = c, RSS = RSS, r.squared = r.squared, adj.r.squared = adj.r.squared)
    }

As you can see, it also returns various summaries, as if summary.lm had been called. Now we write another wrapper, choose.c. It plots RSS against c.grid and returns the best model with the selected c.

    choose.c <- function (x, y, c.grid) {
      if (is.unsorted(c.grid)) stop("'c.grid' is not increasing")
      ## model list
      lst <- lapply(c.grid, est, x = x, y = y)
      ## RSS trace
      RSS <- sapply(lst, "[[", "RSS")
      ## verbose
      plot(c.grid, RSS, type = "b", pch = 19)
      ## find `c` / the model minimizing `RSS`
      lst[[which.min(RSS)]]
    }
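Since est also returns AIC and BIC, the same grid search could minimize one of those criteria instead. Here is a sketch of an AIC-based variant (the name choose.c.aic is only illustrative; because the number of parameters is the same for every c, it selects the same break point as the RSS criterion, so this only matters once candidate models differ in complexity):

    ## sketch only: select the break point by AIC instead of RSS
    choose.c.aic <- function (x, y, c.grid) {
      lst <- lapply(c.grid, est, x = x, y = y)
      aic <- sapply(lst, "[[", "aic")
      plot(c.grid, aic, type = "b", pch = 19)
      lst[[which.min(aic)]]
    }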

So far so good. To complete the story, we also need a prediction routine.

    pred <- function (model, x.new) {
      ## prediction matrix
      X <- getX(x.new, model$c)
      p <- dim(X)[2L]
      ## predicted mean
      fit <- X %*% model$coefficients
      ## prediction standard error
      Qt <- forwardsolve(t(model$R), t(X))
      se <- sqrt(colSums(Qt ^ 2) * model$sig2)
      ## 95%-confidence interval
      alpha <- qt(0.025, length(model$residuals) - p)
      lwr <- fit + alpha * se
      upr <- fit - alpha * se
      ## return
      matrix(c(fit, se, lwr, upr), ncol = 4L,
             dimnames = list(NULL, c("fit", "se", "lwr", "upr")))
    }

Brilliant answers, but I would like to propose another solution using the Heaviside (unit step) function: H(x) = 1 if x > 0, and H(x) = 0 if x ≤ 0.

    H <- function(x) as.numeric(x > 0)

Then the fitting function is f(x, c) = b0 + b1*(x - c) + b2*(x - c)^2 * H(c - x), and it can be fitted with nls:

    fit <- nls(y ~ b0 + b1 * (x - c) + b2 * (x - c) ^ 2 * H(c - x),
               start = list(b0 = 0, b1 = 0, b2 = 1, c = 0.5))

Testing it on 李哲源's toy example gives

    summary(fit)$parameters
    #      Estimate Std. Error   t value     Pr(>|t|)
    # b0  0.1199124 0.03177064  3.774315 2.777969e-04
    # b1 -0.2578121 0.07856856 -3.281365 1.440945e-03
    # b2  2.4316379 0.40105205  6.063148 2.624975e-08
    # c   0.5400831 0.05287111 10.215089 5.136550e-17
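As a quick visual check (a sketch assuming the toy data x and y are still in the workspace), the fitted curve can be overlaid on the points:

    ## illustrative only: overlay the nls fit on the toy data
    plot(x, y, cex = 0.5)
    ce <- coef(fit)
    x.new <- seq(0, 1, 0.01)
    lines(x.new, ce["b0"] + ce["b1"] * (x.new - ce["c"]) +
                 ce["b2"] * (x.new - ce["c"]) ^ 2 * H(ce["c"] - x.new),
          col = 2, lwd = 2)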

Source: https://habr.com/ru/post/1259859/

