Prevent NA use in lm regression

I have a vector Y containing future returns, and a vector X containing current values. The last element of Y is NA, because the future return for the last available row is not yet known.

 X <- c(0.1, 0.3, 0.2, 0.5)
 Y <- c(0.3, 0.2, 0.5, NA)
 Other <- c(5500, 222, 523, 3677)
 lm(Y ~ X + Other)

I want to make sure that the last element of each series is not included in the regression. I read the na.action documentation, but I can't tell whether this is the default behavior.

In the case of cor(), is this the right solution to exclude X[4] and Y[4] from the calculation?

 cor(X, Y, use = "pairwise.complete.obs") 
1 answer

The factory-fresh default for lm is to drop cases containing NA values. Since this default can be overridden through the global setting options(na.action = ...), you can explicitly set na.action to na.omit:

 > summary(lm(Y ~ X + Other, na.action=na.omit))

 Call:
 lm(formula = Y ~ X + Other, na.action = na.omit)

 [snip]

   (1 observation deleted due to missingness)
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
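If you want to confirm programmatically which rows were dropped, something like the following sketch may help. It reuses the vectors from the question; the variable names n_used, dropped, fit_excl and res_excl are only illustrative. Note that with three complete rows and three coefficients the fit is saturated, so this only checks NA handling, not the quality of the model.

```r
X <- c(0.1, 0.3, 0.2, 0.5)
Y <- c(0.3, 0.2, 0.5, NA)
Other <- c(5500, 222, 523, 3677)

# Passing na.action explicitly makes the fit independent of options("na.action").
fit <- lm(Y ~ X + Other, na.action = na.omit)

n_used  <- nobs(fit)                   # number of rows actually used in the fit
dropped <- as.integer(na.action(fit))  # indices of rows removed because of NA

# na.exclude fits the same rows, but pads residuals back to the original
# length, which is convenient when aligning results with the input series.
fit_excl <- lm(Y ~ X + Other, na.action = na.exclude)
res_excl <- residuals(fit_excl)

print(n_used)
print(dropped)
print(length(res_excl))
```

If you would rather get an error than a silent drop, na.action = na.fail makes lm stop when any NA is present.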

As for your second question, cor(X,Y,use='pairwise.complete.obs') is correct. Since there are only two variables, cor(X,Y,use='complete.obs') will also produce the expected result.
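A quick way to convince yourself is to compare both settings against the correlation computed on the first three elements only. This is a minimal sketch using the vectors from the question; the r_* names are just for illustration.

```r
X <- c(0.1, 0.3, 0.2, 0.5)
Y <- c(0.3, 0.2, 0.5, NA)

# With the default use = "everything", the NA in Y propagates to the result.
r_default <- cor(X, Y)

# Both settings drop the incomplete pair (X[4], Y[4]) before computing.
r_pairwise <- cor(X, Y, use = "pairwise.complete.obs")
r_complete <- cor(X, Y, use = "complete.obs")

# Reference value: correlation over the three complete pairs only.
r_manual <- cor(X[1:3], Y[1:3])

print(is.na(r_default))
print(all.equal(r_pairwise, r_manual))
print(all.equal(r_complete, r_manual))
```

The two settings can differ only when cor is given more than two variables (for example a matrix), because "pairwise.complete.obs" then drops NAs separately for each pair of columns, while "complete.obs" drops any row with an NA in any column.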


Source: https://habr.com/ru/post/903456/

