How to count rows in nested data_frames with dplyr

The following dummy data frame is used as an example:

df <- data_frame(A = c(rep(1, 5), rep(2, 4)), B = 1:9) %>% 
  group_by(A) %>% 
  nest()

which looks like this:

> df
# A tibble: 2 × 2
      A             data
  <dbl>           <list>
1     1 <tibble [5 × 1]>
2     2 <tibble [4 × 1]>

I would like to add a third column named N whose elements equal the number of rows in each nested data_frame in data. I thought this would work:

> df %>% 
+   mutate(N = nrow(data))
Error: Unsupported type NILSXP for column "N"

What is going wrong?

3 answers

Combining dplyr and purrr, you can do:

library(tidyverse)

df %>% 
  mutate(n = map_dbl(data, nrow))
#> # A tibble: 2 × 3
#>       A             data     n
#>   <dbl>           <list> <dbl>
#> 1     1 <tibble [5 × 1]>     5
#> 2     2 <tibble [4 × 1]>     4

I like this approach because you stay within your normal workflow, creating a new column inside mutate(), while using the map_* family because you need to work on a list column.
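
To see why the map_* family is needed here: inside mutate(), data refers to the whole list column, and nrow() of a plain list is NULL, which cannot be stored as a column. Below is a minimal base R sketch of that behaviour. The names nested_list, whole_column and per_element are hypothetical and do not come from the question, and vapply() stands in for map_dbl() so that no packages are required.

```r
# Stand-in for the list column `data`: a plain list holding two data frames
# (nested_list is a hypothetical name used only for illustration).
nested_list <- list(
  data.frame(B = 1:5),
  data.frame(B = 6:9)
)

# nrow() returns dim(x)[1]; a plain list has no dim attribute, so this is NULL.
whole_column <- nrow(nested_list)
is.null(whole_column)

# Applying nrow() to each element gives one count per nested data frame.
per_element <- vapply(nested_list, nrow, integer(1))
per_element
```

Applying the function element by element is what map_dbl(data, nrow) does in the answer above. Since nrow() returns an integer, map_int() should also work if an integer column is preferred.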


You can also use rowwise():

df %>%
  rowwise() %>%
  mutate(N = nrow(data))

This gives:

#Source: local data frame [2 x 3]
#Groups: <by row>
#
## A tibble: 2 × 3
#      A             data     N
#  <dbl>           <list> <int>
#1     1 <tibble [5 × 1]>     5
#2     2 <tibble [4 × 1]>     4

Using dplyr alone:

df %>% 
  group_by(A) %>%
  mutate(N = nrow(data.frame(data)))
      A             data     N
  <dbl>           <list> <int>
1     1 <tibble [5 × 1]>     5
2     2 <tibble [4 × 1]>     4

Source: https://habr.com/ru/post/1676393/

