I am working with a dataset in R that comes with a codebook, which tells me what the labels should be for the different levels of my factor variables. For example, the codebook says that in my variable "Sex", 0 is "Women" and 1 is "Male". I use this information to label the values in my variables appropriately.
Unfortunately, I recently discovered that the codebook is incomplete. For example, for one variable it tells me that 1 is "Yes" and 2 is "No", but it does not say what the 7s, 8s and 9s that I see in the data mean. What I would like to do is label this variable as follows (or something like it):
data$variable <- factor(data$variable, levels=c(1, 2, 7, 8, 9), labels=c("Yes", "No", "7", "8", "9"))
Basically, I would like every level that is not specified in the codebook to be labelled as itself. The problem is that many codes are missing from the codebook, and I would really rather not have to look through all the undefined values in my data by hand in order to write the code above for each variable. Also, if I simply leave these levels out of the levels argument, R automatically turns them into NA, which I don't want.
Summary: I am trying to figure out how to use factor() so that, instead of turning all unspecified levels into NA, it labels each of them as itself.
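To make the behaviour I am after concrete, here is a minimal sketch of what I imagine, using only base R. The helper name label_known(), its arguments codes and labels, and the example vector are all made up for illustration; none of them come from my data or from any package, and there may well be a cleaner way to do this.

```r
# Hypothetical helper (not part of base R or any package):
# apply codebook labels to the known codes and keep every other
# observed value as a level labelled with itself.
label_known <- function(x, codes, labels) {
  # Distinct non-missing values actually present in the data, sorted
  observed <- sort(unique(x[!is.na(x)]))
  # Codebook codes first, then any observed values the codebook omits
  all_levels <- union(codes, observed)
  # Start with each level labelled as itself, then overwrite the known ones
  all_labels <- as.character(all_levels)
  all_labels[match(codes, all_levels)] <- labels
  factor(x, levels = all_levels, labels = all_labels)
}

# Made-up example data standing in for data$variable
variable <- c(1, 2, 7, 2, 9, 8, 1, NA)
result <- label_known(variable, codes = c(1, 2), labels = c("Yes", "No"))
levels(result)
```

With something like this I would only need to pass in the codes and labels that the codebook actually defines, and values that are genuinely missing in the data would still stay NA.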