I was interested in this question for debugging: I wanted to save intermediate results so that I could inspect and manipulate them from the console without breaking the pipeline into two pieces, which is cumbersome. For my purposes, the only problem with the OP's original solution was that it was slightly verbose.
This can be fixed by defining a helper function:
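    library(dplyr)  # provides %>%, quos() and quo_name()

    to_var <- function(., ..., env = .GlobalEnv) {
      # Turn the first unquoted argument (e.g. tmp) into the string "tmp"
      var_name <- quo_name(quos(...)[[1]])
      # Store the piped-in value under that name in the chosen environment
      assign(var_name, ., envir = env)
      # Return the value unchanged so the pipeline can continue
      .
    }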
Which can then be used as follows:
    df <- data.frame(a = LETTERS[1:3], b = 1:3)
    df %>%
      filter(b < 3) %>%
      to_var(tmp) %>%
      mutate(b = b * 2) %>%
      bind_rows(tmp)
    # tmp still exists here
This still uses the global environment, but you can also explicitly pass in a more local environment, as in the following example:
    f <- function() {
      df <- data.frame(a = LETTERS[1:3], b = 1:3)
      env <- environment()
      df %>%
        filter(b < 3) %>%
        to_var(tmp, env = env) %>%
        mutate(b = b * 2) %>%
        bind_rows(tmp)
    }
    f()
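The same idea can also be sketched without quos() and quo_name(), using only base R. This is a hypothetical variant, not part of the helper above: the names to_var_base and g are made up for illustration, and substitute() plus deparse() stand in for the tidy-eval calls. Because the data is still the first argument, it should be usable as a pipeline step in the same way.

```r
# Hypothetical base-R variant of to_var(); an illustration, not the original helper.
to_var_base <- function(x, name, env = globalenv()) {
  # substitute() captures the unevaluated argument, e.g. the symbol tmp,
  # and deparse() turns it into the string "tmp".
  var_name <- deparse(substitute(name))
  assign(var_name, x, envir = env)
  x  # return the input unchanged so a pipeline could continue
}

g <- function() {
  env <- environment()                       # the frame of this call to g()
  out <- to_var_base(c(1, 2, 3), tmp, env = env)
  sum(out) + sum(tmp)                        # tmp is visible: it was stored in env
}

g()  # 12
```

As with the example above, passing the function's own environment keeps tmp out of the global environment.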
The problem with the accepted solution is that it does not seem to work out of the box with pipes. G. Grothendieck's solution does not work at all for the debugging use case. (Update: see G. Grothendieck's comment below and his updated answer!)
Finally, the reason assign("tmp", .) %>% does not work is that the default envir argument of assign() is the "current environment" (see the documentation for assign), which is different at each stage of the pipeline. To see this, try inserting

    { print(environment()); . } %>%

into the pipeline at several points: a different address is printed each time. (It is probably possible to change the definition of to_var so that it uses the grandparent environment by default instead.)
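The effect of assign()'s default envir can be reproduced without any pipe at all. The following is a minimal sketch in base R; save_default, save_explicit and check_bindings are hypothetical names used only for this illustration. It shows that a binding created without envir lands in the frame that called assign() and is lost when that frame goes away, while a binding created with an explicit envir survives.

```r
# Hypothetical helpers, written only to illustrate assign()'s default envir.
save_default <- function(value) {
  # No envir given: the binding is created in this function's own frame
  # and is gone once the function returns.
  assign("tmp_default", value)
  value
}

save_explicit <- function(value, env) {
  # envir given: the binding is created in the environment passed in.
  assign("tmp_explicit", value, envir = env)
  value
}

check_bindings <- function() {
  here <- environment()
  save_default(1:3)
  save_explicit(1:3, env = here)
  # inherits = FALSE restricts the lookup to `here` itself
  c(default  = exists("tmp_default",  envir = here, inherits = FALSE),
    explicit = exists("tmp_explicit", envir = here, inherits = FALSE))
}

print(check_bindings())  # default is FALSE, explicit is TRUE
```

A pipeline stage plays the role of save_default here: the value is assigned into a short-lived environment that the rest of the code never looks at.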