I encountered unusual behavior when using summarise.
df <- data.frame(id = c(1, 2, 3, 3, 4),
                 color = c(NA, "blue", "red", "blue", NA),
                 stringsAsFactors = FALSE)
df
#   id color
# 1  1  <NA>
# 2  2  blue
# 3  3   red
# 4  3  blue
# 5  4  <NA>
First part
Let’s choose the first value of color for each id:
library(dplyr)
df %>%
  group_by(id) %>%
  summarise(result = color[1])
# # A tibble: 4 × 2
#      id result
#   <dbl>  <chr>
# 1     1
# 2     2   blue
# 3     3    red
# 4     4   <NA>
I expected <NA> instead of an empty string in the first row. Did I do something wrong? first(color) produces the correct result, but I thought it was equivalent to color[1].
In addition, color %>% first produces the same result as color[1], which confuses me even more.
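As a sanity check on my expectation, here is a minimal base R sketch of the same per-group selection with no dplyr involved. The name expected is just illustrative, and this only shows what I assumed color[1] should give within each group; it is not meant to explain what summarise does internally.

```r
# Same data as above; base R only, no dplyr.
df <- data.frame(id = c(1, 2, 3, 3, 4),
                 color = c(NA, "blue", "red", "blue", NA),
                 stringsAsFactors = FALSE)

# Split color by id, then take the first element of each group with [1].
# `expected` is an illustrative name, not part of the original code.
expected <- vapply(split(df$color, df$id), function(x) x[1], character(1))

print(expected)         # first color per id
print(is.na(expected))  # which groups come back as NA
```

Here the groups for id 1 and id 4 come back as NA rather than as an empty string, which is what I expected from summarise as well.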
Second part
Now consider the following contrived code:
df %>%
  group_by(id) %>%
  summarise(color = color[1],
            color2 = first(color))
I get a segfault. Is this a known bug, or should I report it? I found some old SO questions and GitHub threads that look very similar, but they appear to be resolved.
Note: I am using dplyr 0.5.0 in R 3.3.3.