Find the position of the first 1 after which all remaining values are 1 in R

The data frame contains an identifier, a grade, and several binary variables (0/1):

ID <- c(1,2,3,4,5,6,7,8,9,10)
grade <- c("a", "b", "e", "a", "d", "d", "a", "c", "c", "b")
b1 <- c(1,0,0,0,0,0,0,0,0,0)
b2 <- c(1,1,0,0,0,1,0,1,0,0)
b3 <- c(1,0,0,1,1,0,0,1,0,0)
b4 <- c(1,1,0,0,0,1,0,1,0,0)
b5 <- c(1,1,1,1,1,1,0,1,1,0)
b6 <- c(1,1,1,1,1,1,1,1,1,0)
df <- data.frame(ID, grade, b1, b2, b3, b4, b5, b6)

I need to create a new integer column (call it y) with values from 1 to 6.

y is the position, counting from b1 to b6, of the first 1 after which every remaining value in the row is also 1.

For instance:

for ID=1, y=1
  ID=2, y=4
  ID=3, y=5

However, if all values in b1-b6 are zeros, then return 0.

Also, the faster the code, the better.

+4
2 answers

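One option is to paste the binary columns into a single string per row and match the trailing run of 1s with a regular expression. The pattern uses a negative lookaround.

The paste0 call was suggested by Rich Scriven.

Using stringr: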

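# Collapse b1..b6 (columns 3 to 8) into one string per row, e.g. "010111"
flag1 <- do.call("paste0", df[, 3:8])
df$flag1 <- flag1

library(stringr)
# 1{1,}$ matches the run of 1s that ends the string; "start" is where that run begins.
# Note: (?!=0) parses as a negative lookahead for the literal text "=0", so it never blocks a match here.
df$flag2 <- str_locate(flag1, "(?!=0)1{1,}$")[, "start"]
# Rows that do not end in 1 give NA; convert those to 0
df[is.na(df$flag2), "flag2"] <- 0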

The flag2 column holds the result:

   ID grade b1 b2 b3 b4 b5 b6  flag1 flag2
1   1     a  1  1  1  1  1  1 111111     1
2   2     b  0  1  0  1  1  1 010111     4
3   3     e  0  0  0  0  1  1 000011     5
4   4     a  0  0  1  0  1  1 001011     5
5   5     d  0  0  1  0  1  1 001011     5
6   6     d  0  1  0  1  1  1 010111     4
7   7     a  0  0  0  0  0  1 000001     6
8   8     c  0  1  1  1  1  1 011111     2
9   9     c  0  0  0  0  1  1 000011     5
10 10     b  0  0  0  0  0  0 000000     0
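
The same trailing-run match can probably be done without stringr. The following is a minimal sketch, not the answer's own code: it assumes base R's regexpr and the simplified pattern "1+$", and the names b and y_regex are made up for illustration.

```r
# b1..b6 from the question, rebuilt so the snippet runs on its own
b <- data.frame(
  b1 = c(1,0,0,0,0,0,0,0,0,0),
  b2 = c(1,1,0,0,0,1,0,1,0,0),
  b3 = c(1,0,0,1,1,0,0,1,0,0),
  b4 = c(1,1,0,0,0,1,0,1,0,0),
  b5 = c(1,1,1,1,1,1,0,1,1,0),
  b6 = c(1,1,1,1,1,1,1,1,1,0)
)

# One string per row, e.g. "010111"
flag1 <- do.call(paste0, b)

# regexpr() returns the start of the first match, or -1 when there is none;
# "1+$" can only match the run of 1s that ends the string
y_regex <- as.integer(regexpr("1+$", flag1))
y_regex[y_regex == -1L] <- 0L   # rows that do not end in 1 get 0

y_regex
```

For the sample data this should reproduce the flag2 column shown above.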
+2

Another approach: loop over the "b*" columns of "df" and record, for each row, the position of the last 0:

cols = paste("b", 1:6, sep = "")

y = integer(nrow(df))
for(j in seq_along(cols)) y[!df[[cols[j]]]] = j

y
#[1] 0 3 4 4 4 3 5 1 4 6

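Add 1 to get the position where the trailing run of 1s starts, and set rows that end in 0 (including all-zero rows) to 0: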

y = y + 1L
y[y > length(cols)] = 0L

y
#[1] 1 4 5 5 5 4 6 2 5 0
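
As a cross-check of the rule itself, here is a small alternative sketch that counts the trailing 1s directly with a running product. It is an illustration under stated assumptions rather than part of the answer: the names m, run, k and y_cumprod are hypothetical, and it makes no claim about speed relative to the loop.

```r
# b1..b6 from the question as a 10 x 6 numeric matrix (one row per ID)
m <- cbind(
  b1 = c(1,0,0,0,0,0,0,0,0,0),
  b2 = c(1,1,0,0,0,1,0,1,0,0),
  b3 = c(1,0,0,1,1,0,0,1,0,0),
  b4 = c(1,1,0,0,0,1,0,1,0,0),
  b5 = c(1,1,1,1,1,1,0,1,1,0),
  b6 = c(1,1,1,1,1,1,1,1,1,0)
)

# Read each row from right to left; the running product stays 1 only while
# every value seen so far is 1. apply() returns one column per row of m.
run <- apply(m[, ncol(m):1, drop = FALSE], 1, cumprod)

# Length of the trailing run of 1s in each row
k <- colSums(run)

# The run starts at ncol - k + 1; rows with no trailing 1 get 0
y_cumprod <- as.integer(ifelse(k > 0, ncol(m) - k + 1, 0))

y_cumprod
```

On the sample data this should agree with the y vector printed above.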
+1

Source: https://habr.com/ru/post/1674382/

