R and the resulting coefficient names

Question


Suppose I have a model in which supp is a factor variable:

 lm(len ~ dose + supp, data = ToothGrowth)

but I want to use a different base level for the factor. I could indicate this directly in the formula:

 lm(len ~ dose + relevel(supp, "VC"), data = ToothGrowth)

and the output will be:

 Call:
 lm(formula = len ~ dose + relevel(supp, "VC"), data = ToothGrowth)

 Coefficients:
           (Intercept)                   dose  relevel(supp, "VC")OJ
                 5.573                  9.764                  3.700
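The long name follows from how R labels factor terms: the term label, exactly as written in the formula, is pasted together with the non-reference level. A minimal check of this, using only base R and the built-in ToothGrowth data (the object name fit is just illustrative):

```r
# Fit the model with the releveled factor written inline in the formula.
fit <- lm(len ~ dose + relevel(supp, "VC"), data = ToothGrowth)

# Coefficient names: the term label followed by the factor level.
names(coef(fit))
```

The last name returned is the full expression relevel(supp, "VC") with the level OJ appended, which is what makes the printed output hard to read.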

It is very convenient to make transformations directly in the formula rather than creating intermediate data sets or modifying existing ones. This matters, for example, when you use scale() to standardize variables, because missing values in the other variables included in the final model have to be taken into account. Often, however, the resulting coefficient names in the output become quite ugly.

My question is: is it possible to specify the name of the variable that results from an expression when working with the formula? Something like

 lm(len ~ dose + (OJ = relevel(supp, "VC")), data = ToothGrowth)

(which does not work).

EDIT: Although the solution proposed by G. Grothendieck is nice, it can produce the wrong result when variables have missing values. The following example shows this:

 # Create some data:
 df <- data.frame(x1 = runif(10), x2 = runif(10))
 df <- transform(df, y = x1 + x2 + rnorm(10))
 # Introduce some missing values.
 df$x1[2:3] <- NA
 # The wrong result:
 lm(formula = y ~ z1 + z2, data = transform(df, z1 = scale(x1), z2 = scale(x2)))
 # Extract a model frame.
 df2 <- model.frame(y ~ x1 + x2, df)
 # The right result:
 lm(formula = y ~ scale(x1) + scale(x2), data = df2)
 # or:
 lm(formula = y ~ z1 + z2, data = transform(model.frame(y ~ x1 + x2, df), z1 = scale(x1), z2 = scale(x2)))

The problem is that when x2 is standardized, scale() uses observations that are not included in the final model, because x1 has missing values in those rows.
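The difference can be made visible with a small sketch. It assumes the same made-up data as above; set.seed(1) and the helper name used are additions here, purely for reproducibility and readability.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible
df <- data.frame(x1 = runif(10), x2 = runif(10))
df <- transform(df, y = x1 + x2 + rnorm(10))
df$x1[2:3] <- NA

# Rows that lm() will actually use (complete in y, x1 and x2).
used <- complete.cases(df[c("y", "x1", "x2")])

# Center and spread that scale() applies when it sees all 10 rows ...
c(mean = mean(df$x2), sd = sd(df$x2))
# ... versus those computed over the 8 rows that enter the model.
c(mean = mean(df$x2[used]), sd = sd(df$x2[used]))
```

Because the two pairs of values differ, z2 built on the full data is not standardized with respect to the rows the model is fitted on.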

So the question remains: is there a way for the formula interface to handle this case without the annoying intermediate step of using an additional formula to extract a model frame, which can then be "transformed"?

I hope the question is clear.


r formula

Stefan Mar 6 '12 at 14:45


1 answer

G. Grothendieck · Accepted Answer · 2012-03-06T15:03:42+0000

Change it in the data= argument, not in the formula= argument:

 lm(len ~ dose + OJ, data = transform(ToothGrowth, OJ = relevel(supp, "VC")))
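For the missing-value case raised in the question's edit, one possible extension of this idea (an assumption, not part of the original answer) is to drop incomplete rows with na.omit() before calling transform(), so that scale() only sees the rows the model uses. The names cc, z1, z2 and the seed are illustrative, and as.numeric() is used only to turn the one-column matrix returned by scale() into a plain vector.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible
df <- data.frame(x1 = runif(10), x2 = runif(10))
df <- transform(df, y = x1 + x2 + rnorm(10))
df$x1[2:3] <- NA

# Keep only rows complete in the model variables, then standardize.
cc <- na.omit(df[c("y", "x1", "x2")])
fit_named <- lm(y ~ z1 + z2,
                data = transform(cc,
                                 z1 = as.numeric(scale(x1)),
                                 z2 = as.numeric(scale(x2))))

# Reference fit from the question: scale() applied to the model frame.
fit_ref <- lm(y ~ scale(x1) + scale(x2),
              data = model.frame(y ~ x1 + x2, df))

# Clean names, same estimates as the reference fit.
names(coef(fit_named))
all.equal(unname(coef(fit_named)), unname(coef(fit_ref)))
```

This still requires listing the model variables up front, but it avoids writing a second formula just to extract a model frame.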
