I'm new to R, and this is the first time I dare to ask a question here.
I work with a dataset of questionnaire scales, and I want to compute row sums over different groups of columns, where each group shares the same prefix in its column names.
Below, I built a data frame with only two rows to illustrate the approach I followed. I would like feedback on how to do this more efficiently.
    df <- as.data.frame(rbind(rep(sample(1:5), 4), rep(sample(1:5), 4)))
    var.names <- c("emp_1", "emp_2", "emp_3", "emp_4",
                   "sat_1", "sat_2", "sat_3",
                   "res_1", "res_2", "res_3", "res_4",
                   "com_1", "com_2", "com_3", "com_4", "com_5",
                   "cap_1", "cap_2", "cap_3", "cap_4")
    names(df) <- var.names
I used the grep function to select the columns whose names start with a given prefix, summed them row by row with rowSums, and stored the result in a new variable. But I have to write a new line of code for each group.
    df$emp_t <- rowSums(df[, grep("\\bemp.", names(df))])
    df$sat_t <- rowSums(df[, grep("\\bsat.", names(df))])
    df$res_t <- rowSums(df[, grep("\\bres.", names(df))])
    df$com_t <- rowSums(df[, grep("\\bcom.", names(df))])
    df$cap_t <- rowSums(df[, grep("\\bcap.", names(df))])
The real dataset has many more variables, so I would like to know whether there is a way to do this in a single step. For example, somehow group the variables that start with the same prefix together, and then apply rowSums to each group.
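To make the idea concrete, here is a minimal base-R sketch of "group by prefix, then row-sum each group". It assumes every column name has the form prefix_number (as in the example above); the names prefixes and totals are hypothetical and only used for illustration, and this is one possible approach rather than the definitive one.

```r
# Example data with the same layout as above (two rows, 20 columns)
df <- as.data.frame(rbind(rep(sample(1:5), 4), rep(sample(1:5), 4)))
names(df) <- c(paste0("emp_", 1:4), paste0("sat_", 1:3), paste0("res_", 1:4),
               paste0("com_", 1:5), paste0("cap_", 1:4))

# Prefix of each column: drop the trailing "_<number>" (assumed naming scheme)
prefixes <- sub("_[0-9]+$", "", names(df))

# Split the column names by prefix, then compute row sums within each group
totals <- lapply(split(names(df), prefixes),
                 function(cols) rowSums(df[cols]))

# Name the results emp_t, sat_t, ... and attach them to the data frame
names(totals) <- paste0(names(totals), "_t")
df <- cbind(df, as.data.frame(totals))
```

Because split orders the groups alphabetically by prefix, the new columns are appended as cap_t, com_t, emp_t, res_t, sat_t rather than in the original column order.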
Thanks in advance!