Why does subsetting behave differently for a dplyr tbl_df?

I found some weird subsetting behavior with dplyr tbl_df data frames. When I subset a data frame 'matrix' style with df[,'a'], it returns a vector, as expected. However, when I do the same thing on a tbl_df data frame, it returns a data frame instead.

I reproduced it below using the iris dataset.

Can someone explain why this is happening, or how I can de-tbl_df these data frames? I have to use dplyr and readr when building the data, so avoiding them is not an option.

library(dplyr)
data(iris)

str(iris['Sepal.Length'])
'data.frame':   150 obs. of  1 variable:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...

str(iris[,'Sepal.Length'])
 num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...

iris <- tbl_df(iris)

str(iris[,'Sepal.Length'])
Classes ‘tbl_df’ and 'data.frame':  150 obs. of  1 variable:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
1 answer

This is on purpose.

See ?tbl_df:

Methods

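‘tbl_df’ implements two important base methods:

print: Only prints the first 10 rows, and the columns that fit on screen.

‘[’: Never simplifies (drops), so always returns data.frame.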

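If you run class(tbl_df(iris)), you will see that its class is "tbl_df", then "tbl", and finally "data.frame", so it can have its own [ method, and methods(class='tbl_df') does indeed list [.tbl_df.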

(This is a bit like data tables in the data.table package, which also have their own [ method.)
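If a plain vector is all you need, the tbl_df does not have to be converted at all. The following is a minimal sketch, assuming a reasonably recent dplyr in which as_tibble() stands in for tbl_df() (an assumption about the installed version, not something stated in the question). The name iris_tbl is only illustrative.

```r
library(dplyr)

# Assumption: as_tibble() is used here in place of tbl_df()
iris_tbl <- as_tibble(iris)

# `[` with a single column keeps the data frame shape on a tbl_df
one_col <- iris_tbl[, "Sepal.Length"]
is.data.frame(one_col)

# `[[`, `$` and dplyr::pull() each extract the column as a plain vector
v1 <- iris_tbl[["Sepal.Length"]]
v2 <- iris_tbl$Sepal.Length
v3 <- pull(iris_tbl, Sepal.Length)

is.numeric(v1)
identical(v1, v2)
identical(v1, v3)
```

Because `[[` and `$` behave the same way on a plain data.frame, code written with them does not depend on which of the two classes it receives.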


edit: to un-tbl_df, just use data.frame. For example, data.frame(tbl_df(iris)) converts the tbl_df(..) back to a plain data.frame.
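The conversion back can be checked with a short sketch. It uses as.data.frame() rather than data.frame(); that is a substitution on my part, chosen because data.frame() applies check.names = TRUE by default and can rewrite non-syntactic column names. As before, as_tibble() stands in for tbl_df(), and iris_tbl and iris_df are illustrative names.

```r
library(dplyr)

# Assumption: as_tibble() is used here in place of tbl_df()
iris_tbl <- as_tibble(iris)

# Drop the tbl_df and tbl classes, leaving a plain data.frame
iris_df <- as.data.frame(iris_tbl)
class(iris_df)

# Matrix-style subsetting of one column simplifies to a vector again
col <- iris_df[, "Sepal.Length"]
is.numeric(col)
is.data.frame(col)
```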


Source: https://habr.com/ru/post/1598135/

