Is there a way to use summarise_each() to count the number of records in each column of a data frame, but ignore NAs?
Example Data
    df_samp <- structure(list(var_1 = c(NA, NA, NA, NA, 1, NA),
                              var_2 = c(NA, NA, NA, NA, 2, 1),
                              var_3 = c(NA, NA, NA, NA, 3, 2),
                              var_4 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
                              var_5 = c(NA, NA, NA, NA, 4, 3)),
                         .Names = c("var_1", "var_2", "var_3", "var_4", "var_5"),
                         row.names = 5:10, class = "data.frame")

    > df_samp
       var_1 var_2 var_3 var_4 var_5
    5     NA    NA    NA    NA    NA
    6     NA    NA    NA    NA    NA
    7     NA    NA    NA    NA    NA
    8     NA    NA    NA    NA    NA
    9      1     2     3    NA     4
    10    NA     1     2    NA     3
Using summarise_each() and n() counts all entries:
    library(dplyr)
    df_samp %>% summarise_each(funs(n()))
I know that n() does not accept arguments, so is there another function I can use in summarise_each() that ignores NAs when counting the number of records and returns zero if a variable is all NA?
Desired Result
      var_1 var_2 var_3 var_4 var_5
    1     1     2     2     0     2
The following method gets me part of the way there, but I would also like it to return 0 for var_4:
    library(reshape2)  # provides melt()

    df_samp %>%
      melt %>%
      filter(!is.na(value)) %>%
      group_by(variable) %>%
      summarise(records = n())

    ## result:
      variable records
    1    var_1       1
    2    var_2       2
    3    var_3       2
    4    var_5       2
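For comparison, here is a minimal base R sketch that counts the non-missing values per column and keeps the zero for var_4. It does not use summarise_each() at all, so treat it as a reference point for the desired result rather than as the dplyr solution. The data frame is rebuilt inline so the snippet runs on its own (row names are left at their defaults, which does not affect the counts), and the name non_na_counts is only illustrative.

```r
# Rebuild the sample data from the question; every column is numeric.
df_samp <- data.frame(
  var_1 = c(NA, NA, NA, NA, 1, NA),
  var_2 = c(NA, NA, NA, NA, 2, 1),
  var_3 = c(NA, NA, NA, NA, 3, 2),
  var_4 = rep(NA_real_, 6),
  var_5 = c(NA, NA, NA, NA, 4, 3)
)

# !is.na() is TRUE for each non-missing cell; summing the logicals per column
# counts them. An all-NA column sums to 0 instead of being dropped.
non_na_counts <- colSums(!is.na(df_samp))
print(non_na_counts)
```

The same counting expression, sum(!is.na(x)), works on a single column, so it may also be usable inside summarise_each() as funs(sum(!is.na(.))). That part is an assumption I have not verified against any particular dplyr version.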