I am trying to write a simple summary function to speed up reporting on multiple data columns in an R Markdown file.
var1 is the name of the categorical data column, t_var is the name of an integer column giving the quarter, and dt is the full data frame.
summarise_data_categorical <- function(var1, t_var, dt){
  print(var1)
  print(t_var)
  group_func <- dt %>%
    select(one_of(t_var, var1)) %>%
    group_by(t_var,var1)
  count_table <- group_func %>%
    summarise(count = n()) %>%
    spread(t_var, count)
  freq <- dt %>%
    select(t_var, var1) %>%
    group_by(t_var,var1) %>%
    summarise(count = n()) %>%
    mutate(freq = round(count / sum(count),3)*100) %>%
    select(-count)
  freq_table <- freq %>%
    spread(t_var, freq)
  freq_chart <- freq %>%
    ggplot()+
    geom_line(mapping=aes(x=t_var, y = freq, colour=var1))
  results <- list(count_table, freq_table, freq_chart)
  results
}
Say I have a data frame:
fr <- data.frame(lets = sample(LETTERS, 100, replace=TRUE),
                 `quarter type` = sample(1:4, 100, replace=TRUE))
If I then run the function:
summarise_data_categorical("lets", "quarter type", fr)
The initial output looks promising:
[1] "lets"
[1] "quarter type"
(NOTE: When I recreate the data this way, for some reason I also get a warning:
Unknown variables: quarter type
although this does not appear with my source data.)
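A possible explanation for that warning, as a minimal sketch rather than a confirmed diagnosis: by default data.frame() passes column names through make.names(), so a name containing a space is rewritten even when it is given in backticks. The object names mangled and kept below are hypothetical and only used for the comparison; check.names is a standard argument of data.frame().

```r
# Build the sample frame twice and compare the resulting column names.
# `mangled` and `kept` are illustrative names, not part of the original code.
mangled <- data.frame(lets = sample(LETTERS, 100, replace = TRUE),
                      `quarter type` = sample(1:4, 100, replace = TRUE))

# check.names = FALSE asks data.frame() to leave the names untouched.
kept <- data.frame(lets = sample(LETTERS, 100, replace = TRUE),
                   `quarter type` = sample(1:4, 100, replace = TRUE),
                   check.names = FALSE)

names(mangled)  # "lets" "quarter.type"
names(kept)     # "lets" "quarter type"
```

If that is what is happening here, the string "quarter type" would not match any column of fr, which would account for the warning appearing only with the recreated data.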
The main problem is that I then get this error message:
Error in resolve_vars(new_groups, tbl_vars(.data)) : unknown variable to group by : t_var
Coming from Python, I'm still a little confused about how to refer to columns when their names are stored in variables. Can someone explain how to fix this?