Encoding the Values of Variables into Classes Using R

I have a dataset in which I need to encode the values of certain (numeric) variables into 3 classes.

My dataset is similar to this, but has 60 more variables:

    anim <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
    wt <- c(181,179,180.5,201,201.5,245,246.4,189.3,301,354,369,205,199,394,231.3)
    data <- data.frame(anim,wt)

    > data
       anim    wt
    1     1 181.0
    2     2 179.0
    3     3 180.5
    4     4 201.0
    5     5 201.5
    6     6 245.0
    7     7 246.4
    8     8 189.3
    9     9 301.0
    10   10 354.0
    11   11 369.0
    12   12 205.0
    13   13 199.0
    14   14 394.0
    15   15 231.3

I need to encode the values of the variable "wt" into 3 classes: (wt >= 179 and wt < 200) = 1; (wt >= 200 and wt < 300) = 2; (wt > 300) = 3

which should give me this:

    > data2
       anim    wt SWT
    1     1 181.0   1
    2     2 179.0   1
    3     3 180.5   1
    4     4 201.0   2
    5     5 201.5   2
    6     6 245.0   2
    7     7 246.4   2
    8     8 189.3   1
    9     9 301.0   3
    10   10 354.0   3
    11   11 369.0   3
    12   12 205.0   2
    13   13 199.0   1
    14   14 394.0   3
    15   15 231.3   2
+6
5 answers

The cut approach described by @Greg is probably what you want here. Note that cut returns a factor by default, which you can suppress by passing labels = FALSE to get integer codes instead:

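    # right = FALSE makes the intervals left-closed, [a, b), so that wt = 200 falls in class 2
    cut(data$wt, c(178, 200, 300, Inf), right = FALSE, labels = FALSE)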
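To see what labels = FALSE changes, and how the right argument decides which class a boundary value lands in, here is a small sketch. The vector x is made up for illustration and is not part of the question's data.

```r
# Illustrative values (hypothetical), chosen to sit on and around the breaks
x <- c(179, 199.9, 200, 250, 300, 350)
breaks <- c(179, 200, 300, Inf)

# Default behaviour: the result is a factor with interval labels as levels
f <- cut(x, breaks, right = FALSE)

# labels = FALSE: integer codes instead of a factor.
# right = FALSE gives left-closed intervals [a, b), so 200 -> 2 and 300 -> 3
codes_left <- cut(x, breaks, right = FALSE, labels = FALSE)

# With the default right = TRUE the intervals are (a, b]: 200 moves to
# class 1, 300 to class 2, and 179 falls outside every interval (NA)
codes_right <- cut(x, breaks, labels = FALSE)

print(data.frame(x, codes_left, codes_right))
```

If the lowest break should be included while keeping right-closed intervals, cut also has an include.lowest argument.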

Alternatively, if your cut points do not lend themselves to natural breaks, you can use ifelse(). You can nest ifelse statements much as you would in Excel. I use with to shorten the required typing:

    data$group2 <- with(data,
      ifelse(wt >= 179 & wt < 200, 1,
      ifelse(wt >= 200 & wt < 300, 2, 3))
    )
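One thing to watch with nested ifelse is that the final "else" value catches everything that failed the earlier tests, including values below the first class. A minimal sketch of the difference, using made-up weights and assuming that 300 itself belongs to class 3 (the question leaves that boundary open):

```r
# Hypothetical weights, including one below 179 and one missing value
wt <- c(150, 181, 200, 299.9, 300, 394, NA)

# Nested ifelse with a catch-all final value: 150 is coded 3 even though
# it lies below the first class
loose <- ifelse(wt >= 179 & wt < 200, 1,
         ifelse(wt >= 200 & wt < 300, 2, 3))

# Stricter variant: spell out the last condition and return NA otherwise
strict <- ifelse(wt >= 179 & wt < 200, 1,
          ifelse(wt >= 200 & wt < 300, 2,
          ifelse(wt >= 300, 3, NA)))

print(data.frame(wt, loose, strict))
```

Missing values stay NA in both versions, because a comparison with NA yields NA.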
+10

You can try cut

    anim <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
    wt <- c(181,179,180.5,201,201.5,245,246.4,189.3,301,354,369,205,199,394,231.3)
    data <- data.frame(anim,wt)

EDIT: fixed the grouping by adding right = FALSE and removed the split example.

    group = cut(data$wt, c(178, 200, 300, Inf), right=FALSE)
    data$swt = as.numeric(group)
    data

       anim    wt swt
    1     1 181.0   1
    2     2 179.0   1
    3     3 180.5   1
    4     4 201.0   2
    5     5 201.5   2
    6     6 245.0   2
    7     7 246.4   2
    8     8 189.3   1
    9     9 301.0   3
    10   10 354.0   3
    11   11 369.0   3
    12   12 205.0   2
    13   13 199.0   1
    14   14 394.0   3
    15   15 231.3   2
+5

I think Greg's answer is the "standard operating procedure", but I find many uses for the findInterval function. It naturally returns a number identifying which interval, as defined by the second argument, each value falls into.

    data$int <- findInterval(data$wt, c(179, 200, 300, Inf))
    data
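Since the question mentions many more variables, the same call can be applied to several columns at once. The sketch below is only an illustration: the data frame, the column name ht, and the "s" prefix for the new columns are hypothetical, and it assumes every variable shares the same break points.

```r
# Toy data frame; wt and ht are hypothetical stand-ins for the real columns
df <- data.frame(
  anim = 1:5,
  wt   = c(181, 200, 299.9, 300, 394),
  ht   = c(179, 250, 199, 310, 205)
)

breaks <- c(179, 200, 300)   # lower bounds of classes 1, 2 and 3
vars   <- c("wt", "ht")      # columns to encode

# findInterval uses left-closed intervals [a, b) by default, so 200 -> 2
# and 300 -> 3; a value below 179 would be coded 0
coded <- lapply(df[vars], findInterval, vec = breaks)

# Store the codes in new columns named swt and sht
df[paste0("s", vars)] <- coded
print(df)
```

If the variables need different break points, one option would be to keep a named list of breaks and loop over it with Map instead of lapply.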
+2

Just to show an alternative method (similar to recoding in SPSS), using recode from the car package:

    > library(car)
    > data$SWT <- with(data, recode(wt, "lo:200=1; 300:hi=3; else=2"))
    > data
       anim    wt SWT
    1     1 181.0   1
    2     2 179.0   1
    3     3 180.5   1
    4     4 201.0   2
    5     5 201.5   2
    6     6 245.0   2
    7     7 246.4   2
    8     8 189.3   1
    9     9 301.0   3
    10   10 354.0   3
    11   11 369.0   3
    12   12 205.0   2
    13   13 199.0   1
    14   14 394.0   3
    15   15 231.3   2
+1

For completeness, and for information only: the classInt package (on CRAN) offers another convenient way to group numbers into classes.

0

Source: https://habr.com/ru/post/888272/