Linear regression in R with a variable number of explanatory variables

Possible duplicate:
Defining a formula in R with glm without explicitly declaring each covariate
how to write a formula with many variables from a data frame?

I have a vector of Y values and a matrix of X values that I want to run a multiple regression on (i.e. Y = X[column 1] + X[column 2] + ... + X[column N]).

The problem is that the number of columns in my matrix (N) is not fixed in advance. I know that in R, to fit a linear regression, you have to write out the formula:

fit = lm(Y~X[,1]+X[,2]+X[,3]) 

But how do I do this if I don't know how many columns my matrix X has?

Thanks!

1 answer

Three ways, in order of increasing flexibility.

Method 1

Run the regression using matrix notation, passing the whole matrix X as the right-hand side:

 fit <- lm( Y ~ X ) 

Method 2

Put all your data in one data.frame instead of two separate objects:

 dat <- cbind(data.frame(Y=Y),as.data.frame(X)) 

Then run the regression using the formula notation:

 fit <- lm( Y~. , data=dat ) 
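As a minimal sketch of the data.frame approach, the snippet below simulates a matrix whose column count is picked at random and fits the model without ever naming a column. The data, the seed, and the names n and N are illustrative assumptions, not part of the original question.

```r
# Simulated data for illustration only (not from the original question)
set.seed(1)
n <- 50
N <- sample(2:6, 1)                  # column count is not known in advance
X <- matrix(rnorm(n * N), nrow = n)  # matrix without column names
Y <- rnorm(n)

# as.data.frame() names the columns of an unnamed matrix V1, V2, ..., VN
dat <- cbind(data.frame(Y = Y), as.data.frame(X))

# "." expands to every column of dat other than the response Y
fit <- lm(Y ~ ., data = dat)

# One intercept plus one coefficient per column of X
names(coef(fit))
```

Because the formula never mentions individual columns, the same two lines work unchanged whatever N turns out to be.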

Method 3

Another way is to build the formula yourself:

 model1.form.text <- paste("Y ~",paste(xvars,collapse=" + "),collapse=" ") 
 model1.form <- as.formula( model1.form.text ) 
 model1 <- lm( model1.form, data=dat ) 

In this example, xvars is a character vector containing the names of the variables you want to use.
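For comparison, here is a hedged sketch of the same idea using reformulate() from the stats package, which builds a formula from a character vector without pasting strings by hand. The data frame, its column names (x1, x2, x3), and the object names fit_a and fit_b are made up for illustration.

```r
# Simulated data for illustration only; the column names are hypothetical
set.seed(2)
dat <- data.frame(Y = rnorm(40), x1 = rnorm(40), x2 = rnorm(40), x3 = rnorm(40))

# Pick the predictors programmatically: here, every column except the response
xvars <- setdiff(names(dat), "Y")

# reformulate() turns the character vector into the formula Y ~ x1 + x2 + x3
form_a <- reformulate(xvars, response = "Y")
fit_a <- lm(form_a, data = dat)

# The paste()/as.formula() construction yields the same model
form_b <- as.formula(paste("Y ~", paste(xvars, collapse = " + ")))
fit_b <- lm(form_b, data = dat)

all.equal(coef(fit_a), coef(fit_b))
```

Either construction lets you fit on a subset of columns simply by changing what goes into xvars, which is what makes this the most flexible of the three methods.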


Source: https://habr.com/ru/post/1381554/

