Unexpected behavior when extracting factor levels

Can someone explain why levels() reports three factor levels when the vector itself contains only two distinct values?

> str(walk.df)
'data.frame':   10 obs. of  4 variables:
 $ walker : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2

> walk.df$walker
 [1] 1 1 1 1 1 2 2 2 2 2
Levels: 1 2 3

I would like to extract a vector of the levels that actually occur. I thought this was the right way to do it, but as you can see, a third level sneaks in and breaks my function.

> as.numeric(levels(walk.df$walker))
[1] 1 2 3
2 answers

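Probably walk.df is a subset of a data frame whose walker factor had 3 levels. Subsetting a factor keeps all of its original levels, even those that no longer occur. For example: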

a<-factor(1:3)
b<-a[1:2]

Then b still has 3 levels, although it contains only two values.

An easy way to drop the unused level:

b<-a[1:2, drop=TRUE]

or, if you cannot access the source variable, rebuild the factor from its current values:

b<-factor(b)
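
For the vector in the question, the same idea can be sketched with droplevels(), which is part of base R and also accepts a whole data frame. The walker vector below is a hypothetical stand-in for walk.df$walker, built only from the str() output shown in the question; the names used, present and present_alt are illustrative.

```r
# Hypothetical stand-in for walk.df$walker: three levels declared, two used.
walker <- factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), levels = 1:3)

# droplevels() returns the factor with its unused levels removed.
used <- droplevels(walker)

# Numeric values of the levels that actually occur.
present <- as.numeric(levels(used))

# Alternative without rebuilding the factor: the distinct values,
# in order of first appearance rather than in level order.
present_alt <- as.numeric(as.character(unique(walker)))
```

Going through as.character() matters in the alternative: calling as.numeric() directly on a factor returns its internal integer codes, which match the labels here only by coincidence.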

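The extra level may also have been assigned explicitly. You can give a factor more levels than it has distinct values, here three levels for data that contain only two: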

 > set.seed(1234)
 > x <- round(runif(10, 1, 2))
 > x
  [1] 1 2 2 2 2 2 1 1 2 2
 > y <- factor(x)
 > levels(y)
 [1] "1" "2"
 > levels(y) <- c("1", "2", "3")
 > y
  [1] 1 2 2 2 2 2 1 1 2 2
 Levels: 1 2 3

or even assign levels to a factor that has none to begin with:

 > p <- NA
 > q <- factor(p)
 > levels(q)
 character(0)
 > levels(q) <- c("1", "2", "3")
 > q
 [1] <NA>
 Levels: 1 2 3
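
To check whether a factor carries levels that never occur, one possible sketch is to count the observations per level with table(), which reports zero counts for unused levels. The factor y below is a small illustrative example, not the data from the question, and counts, unused and unused_alt are hypothetical names.

```r
# Illustrative factor: level "3" is declared but never used.
y <- factor(c(1, 2, 2, 1), levels = c("1", "2", "3"))

# table() counts observations per level, including levels with zero count.
counts <- table(y)

# Names of the levels that never occur.
unused <- names(counts)[as.vector(counts) == 0]

# Equivalent check by set difference between declared and observed labels.
unused_alt <- setdiff(levels(y), as.character(y))
```

If unused is non-empty, the factor has more levels than values, which is the situation described in the question.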

Source: https://habr.com/ru/post/1743783/

