I would like to use summarise() from dplyr after grouping the data to compute a new variable. But I would like it to use one equation for some of the data and a second equation for the rest.
I tried using group_by() and summarise() with if_else(), but it does not work.
Here is an example. Say, for some reason, I wanted to compute a special value for the sepal length. For the "setosa" species, this value is twice the mean sepal length. For all other species, it is simply the mean sepal length. This is the code I tried, but it does not work with summarise():
library(dplyr)
iris %>%
group_by(Species) %>%
summarise(sepal_special = if_else(Species == "setosa", mean(Sepal.Length)*2, mean(Sepal.Length)))
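A possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: the condition Species == "setosa" has one element per row of the group, so if_else() returns one value per row, while summarise() expects a single value per group. The check below reproduces that by hand for one group; the names setosa and res are only illustrative.

```r
library(dplyr)

# Rows of a single group, roughly what summarise() evaluates against
setosa <- iris[iris$Species == "setosa", ]

# The condition has one element per row, so the two scalar means are
# recycled and the result has one element per row as well
res <- if_else(setosa$Species == "setosa",
               mean(setosa$Sepal.Length) * 2,
               mean(setosa$Sepal.Length))

length(res)  # 50, not the single value summarise() wants
```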
This idea works with mutate(), but then I would still need to reshape the result into the dataset I'm looking for:
library(dplyr)
iris %>%
group_by(Species) %>%
mutate(sepal_special = if_else(Species == "setosa", mean(Sepal.Length)*2, mean(Sepal.Length)))
In the end, I would like a dataset in this format:
library(dplyr)
iris %>%
group_by(Species)%>%
summarise(sepal_special = mean(Sepal.Length))
# A tibble: 3 x 2
# Species sepal_special
# <fctr> <dbl>
#1 setosa 5.01
#2 versicolor 5.94
#3 virginica 6.59
#>
But with the value for setosa multiplied by 2:
# A tibble: 3 x 2
# Species sepal_special
# <fctr> <dbl>
#1 setosa **10.02**
#2 versicolor 5.94
#3 virginica 6.59
#>
Any ideas on how to combine if_else() with summarise() to get this result?
Thanks!
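One way this could be approached, as a minimal sketch and not necessarily the only or best solution: summarise to one mean per species first, then apply the conditional with mutate() on the summarised table, where if_else() sees exactly one row per species. The name result is only illustrative.

```r
library(dplyr)

result <- iris %>%
  group_by(Species) %>%
  # Step 1: collapse each group to a single mean
  summarise(sepal_special = mean(Sepal.Length)) %>%
  # Step 2: the summarised table has one row per species, so a
  # row-wise conditional doubles only the setosa value
  mutate(sepal_special = if_else(Species == "setosa",
                                 sepal_special * 2,
                                 sepal_special))

result
```

This keeps the per-group calculation and the per-species rule in separate steps, which avoids asking summarise() to evaluate a row-level condition.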