Does dplyr summarise evaluate a custom function twice?

I use the dplyr group_by and summarise functions with a custom aggregate function and observe strange behavior. It seems that the aggregate function is evaluated twice for each group.

Here is a minimal example:

    library(dplyr)

    # Prints a marker and a random number on every call, then returns the sum
    aggFun <- function(x) {
      print("Inside function")
      print(rnorm(1))
      sum(x)
    }

    df <- data.frame(key = rep("a", 3), val = 1:3)

    df %>%
      group_by(key) %>%
      summarise(sum = aggFun(val))

The following is displayed:

    [1] "Inside function"
    [1] 0.3230769
    [1] "Inside function"
    [1] -0.3347653
    # A tibble: 1 × 2
         key   sum
      <fctr> <int>
    1      a     6

Since there is only one group, shouldn't the function be evaluated only once? I see the same behavior in a larger application and am worried that it might hurt performance. Or am I missing something?

Solved by updating to the GitHub version.
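To check whether a given dplyr installation still evaluates the function more than once, counting calls is more reliable than reading printed output. The following is only a sketch: `make_counted` is a hypothetical helper written for this illustration, not part of dplyr, and the number of calls you see depends on the dplyr version installed.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): wraps a function and
# records how many times the wrapper has been called.
make_counted <- function(f) {
  calls <- 0L
  wrapped <- function(...) {
    calls <<- calls + 1L  # update the counter in the enclosing environment
    f(...)
  }
  list(fun = wrapped, count = function() calls)
}

counted_sum <- make_counted(sum)

df <- data.frame(key = rep("a", 3), val = 1:3)

result <- df %>%
  group_by(key) %>%
  summarise(sum = counted_sum$fun(val))

# Compare the number of groups with the number of evaluations.
n_groups <- nrow(result)
n_calls <- counted_sum$count()
message("groups: ", n_groups, ", calls: ", n_calls)
```

If the reported number of calls is larger than the number of groups, the installed version shows the behavior described in the question.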


Source: https://habr.com/ru/post/1264746/

