Attempting to select a column of an object of class grouped_df by index gives "Error: the index is out of bounds." For instance:
x <- mtcars %>%
  group_by(am, gear) %>%
  summarise_each(funs(sum), disp, hp, drat)
class(x)
# "grouped_df" "tbl_df" "tbl" "data.frame"

# For some reason the first column can be selected...
x[1]
# Source: local data frame [4 x 1]
I am using R version 3.1.1 and dplyr 0.3.0.2. I am not sure whether this is a bug or intentional. Is there a good reason why it works this way? I would rather not have to remember to ungroup my data frames every time after using dplyr ...
Update. Looking a little further into this, I assume the motivation for defining [.grouped_df this way is to preserve the groups when it is called as, for example, x[1:3] (which works). However, when the selection does not include the grouping variables, the above error is thrown. Perhaps it could be changed so that in this case it falls back to [.tbl_df and at the same time issues a warning ...
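The fallback suggested above can be sketched roughly as follows. This is only an illustration of the idea, not how dplyr implements `[.grouped_df`: `safe_cols()` is a hypothetical helper that does not exist in dplyr, it handles numeric column indices only, and the example uses plain `summarise()` instead of `summarise_each()` so that it also runs on newer dplyr versions.

```r
library(dplyr)

# Hypothetical helper (NOT part of dplyr): select columns by numeric index,
# dropping the grouping with a warning when a grouping variable is left out.
safe_cols <- function(df, i) {
  grp <- vapply(groups(df), as.character, character(1))  # grouping variable names
  picked <- names(df)[i]                                  # names of the requested columns
  if (all(grp %in% picked)) {
    return(df[i])  # every grouping variable is kept, so the groups can be preserved
  }
  warning("Dropping groups; selection omits: ",
          paste(setdiff(grp, picked), collapse = ", "))
  ungroup(df)[i]
}

# After summarise(), x is still grouped by am
x <- mtcars %>%
  group_by(am, gear) %>%
  summarise(disp = sum(disp), hp = sum(hp), drat = sum(drat))

res <- suppressWarnings(safe_cols(x, 2))  # column 2 (gear) excludes am
```

With a selection that includes am, such as safe_cols(x, 1:3), the helper simply returns x[1:3] and the grouping is kept.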
Update 2. [.grouped_df was changed in the dplyr development version (0.3.0.9000). It still raises an error, but the message is now clearer because it indicates which grouping variables were not included.
x[2]
The best way I have found to keep my code from failing in this situation is to add %>% ungroup at the end of the dplyr pipeline.
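As a minimal sketch of that workaround, the pipeline below ends with ungroup(), after which selecting by index behaves like it does for an ordinary tbl_df. It assumes the same mtcars aggregation as in the question, rewritten with plain `summarise()` (my substitution, so that the example also runs on newer dplyr versions where `summarise_each()` and `funs()` are deprecated).

```r
library(dplyr)

x <- mtcars %>%
  group_by(am, gear) %>%
  summarise(disp = sum(disp), hp = sum(hp), drat = sum(drat)) %>%
  ungroup()  # drop the grouping by am that summarise() leaves behind

class(x)        # no longer includes "grouped_df"
gear_col <- x[2]  # selecting a non-grouping column by index now works
```

The cost is that the grouping information is lost, so any later grouped operation needs its own group_by() call.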