I found some odd subsetting behavior with dplyr tbl_df data frames. When I subset a plain data frame 'matrix' style with df[,'a'], it returns a vector, as expected. However, when I do the same on a data frame that has been converted to tbl_df, it returns a data frame instead.
I reproduced it below using the Iris dataset.
Can someone explain why this is happening, or how I can de-tbl_df these data frames? I need dplyr and readr to build the data, so I can't simply avoid them.
library(dplyr)
data(iris)
str(iris['Sepal.Length'])
#> 'data.frame': 150 obs. of 1 variable:
#>  $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
str(iris[,'Sepal.Length'])
#>  num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
iris <- tbl_df(iris)
str(iris[,'Sepal.Length'])
#> Classes ‘tbl_df’ and 'data.frame': 150 obs. of 1 variable:
#>  $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...