Linear interpolation (lm) in R, strange behavior

Using R 3.2.2, I ran into strange behavior while fitting a simple linear regression with lm. The first data frame gives the correct result:

test<-data.frame(dt=c(36996616, 36996620, 36996623, 36996626), value=c(1,2,3,4))
lm(value~dt, test)$coefficients

  (Intercept)            dt 
-1.114966e+07  3.013699e-01 

After adding 1 to every value of dt, the coefficient for dt becomes NA:

test$dt<-test$dt+1
lm(value~dt, test)$coefficients

(Intercept)          dt 
        2.5          NA 

Any idea why? It looks like some kind of overflow.

Thanks!

2 answers

Edit

I found more details about this issue.

You can get an NA coefficient if the predictors are perfectly correlated. This seems like an unusual case, since we have only one predictor. So in this case dt appears to be linearly related to the intercept.

We can find linearly dependent terms using alias. See https://stats.stackexchange.com/questions/112442/what-are-aliased-coefficients

test<-data.frame(dt=c(36996616, 36996620, 36996623, 36996626), value=c(1,2,3,4))
fit1 <- lm(value ~ dt, test)
alias(fit1)
Model :
value ~ dt

In the first example, no linearly dependent terms are reported.

test$dt <- test$dt + 1
fit2 <- lm(value ~ dt, test)
alias(fit2)
Model :
value ~ dt

Complete :
   [,1]       
dt 147986489/4

In the second example, dt is reported as linearly dependent on the intercept.

For more details on how lm handles this, see https://stat.ethz.ch/pipermail/r-help/2002-February/018512.html.

lm does not invert X'X (see https://stat.ethz.ch/pipermail/r-help/2008-January/152456.html), but the example below is still useful for showing that X'X is singular.

x <- matrix(c(rep(1, 4), test$dt), ncol=2)
y <- test$value

b <- solve(t(x) %*% x) %*% t(x) %*% y
Error in solve.default(t(x) %*% x) : 
system is computationally singular: reciprocal condition number = 7.35654e-30

The default tol in lm.fit is 1e-7. It is the tolerance used to detect linear dependencies in the qr decomposition.

qr(t(x) %*% x)$rank
[1] 1
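A rough sketch of why adding 1 to dt is enough to change the outcome. It assumes, as I understand the pivoting rule in the qr routine behind lm, that a column is treated as negligible once its norm, after the earlier columns are removed, falls below tol times its original norm. The helper name collinearity_ratio is hypothetical, and this reproduces only the idea of the check, not the routine itself.

```r
# Hypothetical helper: norm of dt after removing the intercept column,
# divided by the original norm of dt.
collinearity_ratio <- function(dt) {
  residual_norm <- sqrt(sum((dt - mean(dt))^2))  # part of dt not explained by the intercept
  original_norm <- sqrt(sum(dt^2))
  residual_norm / original_norm
}

dt1 <- c(36996616, 36996620, 36996623, 36996626)  # original data
dt2 <- dt1 + 1                                     # data after test$dt + 1

ratio1 <- collinearity_ratio(dt1)
ratio2 <- collinearity_ratio(dt2)

# Both ratios sit almost exactly at the default tolerance of 1e-7
print(c(ratio1, ratio2), digits = 12)

ratio1 > 1e-7  # just above the tolerance: dt is kept
ratio2 > 1e-7  # just below the tolerance: dt is treated as collinear
```

The spread of dt is identical in both data sets. Only its mean grows, which pushes the ratio from just above 1e-7 to just below it.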

If you decrease the tolerance, you get a coefficient estimate for dt.

# decrease tol in qr
qr(t(x) %*% x, tol = 1e-31)$rank
[1] 2

# and in lm
lm(value~dt, test, tol=1e-31)$coefficients
  (Intercept)            dt 
-1.114966e+07  3.013699e-01 

See https://stats.stackexchange.com/questions/86001/simple-linear-regression-fit-manually-via-matrix-equations-does-not-match-lm-o for more details.
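As an alternative to lowering tol, centering the predictor is a common way to avoid this kind of near-collinearity with the intercept. This is a minimal sketch and not part of the original answer. The column name dt_centered is illustrative.

```r
test <- data.frame(dt = c(36996616, 36996620, 36996623, 36996626) + 1,
                   value = c(1, 2, 3, 4))

# Subtract the mean so dt is no longer dominated by its large offset
test$dt_centered <- test$dt - mean(test$dt)

fit_centered <- lm(value ~ dt_centered, test)
slope <- unname(coef(fit_centered)["dt_centered"])
slope  # no longer NA

# Recover the intercept on the original dt scale
intercept <- mean(test$value) - slope * mean(test$dt)
intercept
```

Centering leaves the slope unchanged. Only the intercept moves, and it can be mapped back to the original scale as shown.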


biglm from the biglm package handles this case:

library(biglm)
test <- data.frame(dt=c(36996616, 36996620, 36996623, 36996626), 
                   value=c(1,2,3,4))
test$dt <- test$dt+1

coefficients(biglm(value ~ dt, test))
#   (Intercept)            dt 
# -1.114966e+07  3.013699e-01 

Source: https://habr.com/ru/post/1608752/