lm() called inside mutate()

I wonder whether lm() can be used inside mutate() from the dplyr package. I currently have a data frame with the columns "date", "company", "return" and "market.ret", reproducible as shown below:

library(dplyr)
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by=1, len=n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
x <- expand.grid(date, symbol)
x$return <- rnorm(n.dates*n.stocks, 0, sd = 0.05)
names(x) <- c("date", "company", "return")
x <- group_by(x, date)
x <- mutate(x, market.ret = mean(return, na.rm = TRUE))  # bare name: mean within each date

Now, for each company, I would like to regress "return" on "market.ret", compute the linear regression coefficients, and store the slope in a new column. I want to do this with mutate(), but the code below does not work:

x <- group_by(x, company)
x <- mutate(x, beta = coef(lm(x$return~x$market.ret))[[2]])

Error reported by R:

Error in terms.formula(formula, data = data) : 
invalid term in model formula

Thanks in advance for any suggestion!

2 answers

This seems to work for me:

group_by(x, company) %>%
    do(data.frame(beta = coef(lm(return ~ market.ret,data = .))[2])) %>%
    left_join(x,.)
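
For completeness, here is a sketch that stays within mutate(). It assumes a dplyr version in which bare column names inside mutate() refer to the rows of the current group. Writing x$return instead always pulls the full, ungrouped column, which is why the bare names are used here. The data is rebuilt so that the snippet runs on its own; the set.seed() call and the ungroup() steps are additions for illustration, not part of the question.

```r
library(dplyr)

set.seed(1L)  # added only to make the sketch reproducible
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by = 1, len = n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
x <- expand.grid(date = date, company = symbol)
x$return <- rnorm(n.dates * n.stocks, 0, sd = 0.05)

# Market return per date: bare `return` is the current group's slice.
x <- x %>%
  group_by(date) %>%
  mutate(market.ret = mean(return, na.rm = TRUE)) %>%
  ungroup()

# Slope per company, repeated on every row of that company.
x <- x %>%
  group_by(company) %>%
  mutate(beta = coef(lm(return ~ market.ret))[[2]]) %>%
  ungroup()
```

Each company's rows then carry that company's slope in the beta column, so no join is needed afterwards.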

An alternative is to use data.table; the market return is computed by date and the slope by company.

library(data.table) ## 1.9.2+
setDT(x)[ , market.ret := mean(return), by = date]
x[, beta := coef(lm(return ~ market.ret, data = .SD))[[2]], by = company]

Here x is built as in the question (with set.seed added):

set.seed(1L)     # for reproducible example
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by=1, len=n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
x <- expand.grid(date, symbol)
x$return <- rnorm(n.dates*n.stocks, 0, sd = 0.05)
names(x) <- c("date", "company", "return")
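
If a table with one row per company is preferred over a repeated column, the same grouping can be sketched as an aggregation. This is only an illustration under the same assumptions as above; the name betas is hypothetical and does not appear in the original answer.

```r
library(data.table)

set.seed(1L)  # for a reproducible example
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by = 1, len = n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
x <- expand.grid(date = date, company = symbol)
x$return <- rnorm(n.dates * n.stocks, 0, sd = 0.05)

setDT(x)                                     # convert to data.table by reference
x[, market.ret := mean(return), by = date]   # equal-weighted market return per date

# One row per company: the slope of return on market.ret within that group.
betas <- x[, .(beta = coef(lm(return ~ market.ret, data = .SD))[[2]]), by = company]
print(betas)
```

Joining betas back onto x by company would give the same column that the := version creates.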

Source: https://habr.com/ru/post/1548543/

