An elegant way to get data.frame column classes

I am currently using the following function to list the column classes of a data.frame:

sapply(names(iris),function(x) class(iris[,x])) 

There must be a more elegant way to do this ...

2 answers

Since data.frames are already lists, sapply(iris, class) will work. sapply cannot simplify the result to a character vector when a column's class extends other classes (that is, class() returns more than one value), so in that case you may need to take only the first class, paste the classes together, etc.
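As a minimal sketch of the point above, using the built-in iris dataset (the variable name classes is only illustrative):

```r
# A data.frame is a list whose elements are its columns
is.list(iris)
# [1] TRUE

# Apply class() to each column. Every column of iris has exactly one
# class, so sapply can simplify the result to a named character vector.
classes <- sapply(iris, class)
classes
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
#    "numeric"    "numeric"    "numeric"    "numeric"     "factor" 
```

This gives the same result as the indexing version in the question, without going through the column names.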


EDIT: If you just want to LOOK at the classes, consider using str:

 str(iris) # Show "summary" of data.frame or any other object
 #'data.frame':  150 obs. of  5 variables:
 # $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 # $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 # $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 # $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 # $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

But to extend @JoshuaUlrich's excellent answer, a data.frame with time or ordered factor columns will cause pain with the sapply solution:

 d <- data.frame(ID=1, time=Sys.time(), factor=ordered(42))

 # This doesn't return a character vector anymore
 sapply(d, class)
 #$ID
 #[1] "numeric"
 #
 #$time
 #[1] "POSIXct" "POSIXt"
 #
 #$factor
 #[1] "ordered" "factor"

 # Alternative 1: Get the first class
 sapply(d, function(x) class(x)[[1]])
 #       ID      time    factor
 #"numeric" "POSIXct" "ordered"

 # Alternative 2: Paste classes together
 sapply(d, function(x) paste(class(x), collapse='/'))
 #              ID             time           factor
 #       "numeric" "POSIXct/POSIXt" "ordered/factor"

Please note that none of these solutions is perfect. Getting only the first (or last) class can return something quite meaningless. Pasting the classes together makes the compound class harder to use afterwards. Sometimes you may just want to detect when this happens, so an error would be preferable (and I like vapply ;-):

 # Alternative 3: Fail if there are multiple-class columns
 vapply(d, class, character(1))
 #Error in vapply(d, class, character(1)) : values must be length 1,
 # but FUN(X[[2]]) result is length 2
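
If an error is too abrupt, one possible middle ground is to keep every class in a list and flag the multi-class columns. This is only a sketch, not part of the original answer: d is rebuilt so the block runs on its own, and all_classes and multi are illustrative names.

```r
d <- data.frame(ID = 1, time = Sys.time(), factor = ordered(42))

# lapply never simplifies, so each column keeps its full class vector
all_classes <- lapply(d, class)

# lengths() counts the classes per column; more than one means the
# column has a compound class
multi <- lengths(all_classes) > 1
names(d)[multi]
# [1] "time"   "factor"

# To test for one particular class, inherits() checks the whole class
# vector rather than only its first element
is_factor <- sapply(d, inherits, what = "factor")
is_factor
```

Here inherits() reports the ordered column as a factor even though "factor" is not its first class, which is the case where taking only class(x)[[1]] would mislead.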

Source: https://habr.com/ru/post/1381213/

