From other answers on this site to similar questions, and from pages such as http://www.r-tutor.com/r-introduction/data-frame/data-frame-column-vector , it seems that when I extract a variable from a data.frame, data[ , "col"] and data$col should give the same result. But now I have some data in Excel:
    LU                                       Urban_LU     LU_Index  Urban_LU_index
    Residential                              Residential  2         0
    Rural residential                        Residential  3         0
    Commercial                               Commercial   4         1
    Public institutions including education  Industrial   5         1
    Industry                                 Industrial   7         2
and I read it with read_excel from the readxl package:
    library(readxl)
    data <- read_excel("data.xlsx", "Sheet 1")
Now I am extracting one variable from the data frame using [ or $ :
    data[ , "LU"]
    # Source: local data frame [5 x 1]
    #
    #                                        LU
    #                                     (chr)
    # 1                             Residential
    # 2                       Rural residential
    # 3                              Commercial
    # 4 Public institutions including education
    # 5                                Industry

    data$LU
    # [1] "Residential"                             "Rural residential"
    # [3] "Commercial"                              "Public institutions including education"
    # [5] "Industry"

    length(data[ , "LU"])
    # [1] 1
    length(data$LU)
    # [1] 5
In addition, the class of the object returned by read_excel and the classes of the results of the two extraction methods look suspicious:
    class(data)
    # [1] "tbl_df"     "tbl"        "data.frame"
    class(data[ , "LU"])
    # [1] "tbl_df"     "data.frame"
    class(data$LU)
    # [1] "character"
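For comparison, here is a minimal sketch of the behaviour I expected, using a plain base-R data.frame instead of the object returned by read_excel. The name df is hypothetical and the values are simply retyped by hand from the Excel sheet; stringsAsFactors = FALSE is set so that the text columns stay character vectors regardless of the R version.

```r
# Hypothetical stand-in for the Excel sheet, typed by hand as a base data.frame
df <- data.frame(
  LU = c("Residential", "Rural residential", "Commercial",
         "Public institutions including education", "Industry"),
  Urban_LU = c("Residential", "Residential", "Commercial",
               "Industrial", "Industrial"),
  LU_Index = c(2, 3, 4, 5, 7),
  Urban_LU_index = c(0, 0, 1, 1, 2),
  stringsAsFactors = FALSE
)

# With a plain data.frame, selecting a single column with [ drops to a vector
class(df[ , "LU"])
# [1] "character"
length(df[ , "LU"])
# [1] 5
identical(df[ , "LU"], df$LU)
# [1] TRUE

# Passing drop = FALSE keeps the one-column data.frame instead
class(df[ , "LU", drop = FALSE])
# [1] "data.frame"
length(df[ , "LU", drop = FALSE])
# [1] 1
```

So with an ordinary data.frame the two forms agree, and the one-column result only appears when drop = FALSE is requested explicitly, which is not what I did above.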
So what is the difference between [ , "col"] and $col ? Am I missing something from the manual, or is this a special case? Also, what about the class identifiers tbl_df and tbl ? I suspect they are the cause of my confusion. What do they mean?