Equivalent to SAS Array in R

I have a dataset with the following columns:

    ID  Measure1    Measure2    XO  X1  x2  x3  x4  x5
    1   30  2   item1   item1   item23  NA      item6   item9
    2   23  2   item1   item323 item1   item4   item5   NA      
    3   2   2   item1   item78  item3   NA      item1   item5

and I want to reproduce in R the flag variable created by this SAS code snippet:

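    data dt2;
        set dt1;
        array x {5} x1 - x5;       /* treat x1..x5 as one array */
        do i = 1 to 5;
            if XO = x{i} then do;  /* first position where XO matches */
                flag = i;
                leave;             /* stop at the first match */
            end;
        end;
        drop i;
    run;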

The goal is to look at the values of x1-x5, find where XO is equal to one of them, and return that position. For example, if item1 is found in x1, return 1; if it is found in x3, return 3.

The final product will look something like this:

    ID  Measure1    Measure2    XO  X1  x2  x3  x4  x5  Flag
    1   30  2   item1   item1   item23  NA          item6   item9   1
    2   23  2   item1   item323 item1   item4       item5   NA      2
    3   2   2   item1   item78  item3   NA          item1   item5   4

Keep in mind that there may be rows in which all of x1-x5 contain NA; in that case I would like to return an empty value. Is this possible?

I could not find anything in R that is equally dynamic, that is, something that avoids writing out several if or case statements (for example with sqldf). There are 5 columns now, but in the future there may be 20.

Any ideas?


You can use max.col:

# Compare XO with every other column; take the first matching column in each row
df1$Flag <- max.col(df1$XO[row(df1[-1])] == df1[-1], 'first')
df1
#    XO      X1     x2     x3    x4    x5 Flag
#1 item1   item1 item23  item5 item6 item9    1
#2 item1 item323  item1  item4 item5 itm87    2
#3 item1  item78  item3 item98 item1 item5    4
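
To see why the one-liner works, it may help to split it into steps. The following is only a sketch: the toy df1 is retyped from the output above, and the intermediate names cand, key and hits are hypothetical, introduced here for illustration.

```r
# Toy data retyped from the printed df1 above
df1 <- data.frame(
  XO = c("item1", "item1", "item1"),
  X1 = c("item1", "item323", "item78"),
  x2 = c("item23", "item1", "item3"),
  x3 = c("item5", "item4", "item98"),
  x4 = c("item6", "item5", "item1"),
  x5 = c("item9", "itm87", "item5"),
  stringsAsFactors = FALSE
)

cand <- df1[-1]                 # candidate columns X1..x5
key  <- df1$XO[row(cand)]       # XO repeated so each cell is paired with its own row's XO
hits <- key == cand             # logical matrix, TRUE where a cell equals XO
flag <- max.col(hits, "first")  # column index of the first TRUE in each row
flag
```

Because the candidate columns are selected by position, the same code should carry over unchanged when there are 20 columns instead of 5.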

Update

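If there are NA elements, replace them with FALSE before calling max.col. If a row has no TRUE values at all, max.col still returns a position, so use rowSums to find the rows with 0 matches and multiply the result by NA^(..), which is NA for those rows (NA^1) and 1 for the others (NA^0), i.e. max.col(..) * NA^(..):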

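df3 <- df2[5:ncol(df2)]                      # candidate columns X1..x5
i1 <- df2$XO[row(df3)] == df3                # TRUE/FALSE/NA per cell
i2 <- replace(i1, is.na(i1), FALSE)          # treat NA as "no match"
# NA^TRUE is NA, NA^FALSE is 1: rows without any match become NA
df2$Flag <- max.col(i2, 'first') * NA^(rowSums(i2) == 0)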
df2
#  ID Measure1 Measure2    XO      X1     x2    x3    x4    x5 Flag
#1  1       30        2 item1   item1 item23  <NA> item6 item9    1
#2  2       23        2 item1 item323  item1 item4 item5  <NA>    2
#3  3        2        2 item1  item78  item3  <NA> item1 item5    4
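
The printed example has a match in every row, so the NA branch never shows. Below is a small sketch of the same steps on hypothetical data (only three candidate columns, with one row that has no match and one row that is entirely NA), to illustrate what the NA^(..) factor does.

```r
# Hypothetical data: row 3 has no match, row 4 is all NA
df2 <- data.frame(
  ID = 1:4,
  XO = c("item1", "item1", "item1", "item1"),
  x1 = c("item1", "item323", "item78", NA),
  x2 = c("item23", "item1", "item3", NA),
  x3 = c(NA, "item4", NA, NA),
  stringsAsFactors = FALSE
)

df3 <- df2[3:ncol(df2)]                  # candidate columns x1..x3
i1  <- df2$XO[row(df3)] == df3           # logical matrix containing NA
i2  <- replace(i1, is.na(i1), FALSE)     # NA counts as "no match"
none <- rowSums(i2) == 0                 # rows without a single TRUE
flag <- max.col(i2, "first") * NA^none   # NA^TRUE is NA, NA^FALSE is 1
flag
```

Without the NA^none factor, max.col would report position 1 for rows 3 and 4, since every entry in an all-FALSE row ties for the maximum.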

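1) Base R. as.matrix(DF[-(1:4)]) == XO gives a logical matrix with the same dimensions as DF[5:9]. wm is then applied to each row. wm is like which.max, except that it returns NA when the row contains no TRUE, i.e. when every element is NA or FALSE. If there is a TRUE, which.max, and therefore wm, returns the position of the first one.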

wm <- function(x) if (isTRUE(any(x))) which.max(x) else NA
transform(DF, Flag = apply(as.matrix(DF[-(1:4)]) == XO, 1, wm))

giving:

  ID Measure1 Measure2    XO      x1     x2    x3    x4    x5 Flag
1  1       30        2 item1   item1 item23  <NA> item6 item9    1
2  2       23        2 item1 item323  item1 item4 item5  <NA>    2
3  3        2        2 item1  item78  item3  <NA> item1 item5    4
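
The behaviour of wm is easiest to check on single vectors. A minimal sketch, with input vectors invented for illustration:

```r
wm <- function(x) if (isTRUE(any(x))) which.max(x) else NA

a <- wm(c(FALSE, TRUE, NA, TRUE))  # first TRUE is at position 2
b <- wm(c(FALSE, NA, FALSE))       # no TRUE: any() is NA, so wm gives NA
d <- wm(c(NA, NA, NA))             # all NA: wm gives NA

# Plain which.max would report position 1 for an all-FALSE row,
# which is why the wrapper is needed.
e <- which.max(c(FALSE, FALSE))
```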

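2) dplyr/tidyr. This alternative works for any number of columns x1, ..., xn. It converts them to long form with tidyr's gather, keeps the rows whose item equals XO, joins the result back to DF, and converts the matching column name to its position: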

library(dplyr)
library(tidyr)
DF %>% 
   left_join(DF %>% gather(Flag, item, -(1:4)) %>% filter(item == XO)) %>%
   select(-item) %>%
   mutate(Flag = match(Flag, names(DF)[-(1:4)]))

giving:

  ID Measure1 Measure2    XO      x1     x2    x3    x4    x5 Flag
1  1       30        2 item1   item1 item23  <NA> item6 item9    1
2  2       23        2 item1 item323  item1 item4 item5  <NA>    2
3  3        2        2 item1  item78  item3  <NA> item1 item5    4

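3) Base R reshape. This uses no packages and follows the same logic as the dplyr/tidyr solution. reshape converts DF to long form with the position stored in Flag, subset keeps the rows where X equals XO, and merge joins Flag back to DF: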

r <- reshape(DF, list(names(DF)[-(1:4)]), "X", "Flag", direction = "long")
s <- subset(r, X == XO)[c("ID", "Flag")]
merge(DF, s, all.x = TRUE)

giving:

  ID Measure1 Measure2    XO      x1     x2    x3    x4    x5 Flag
1  1       30        2 item1   item1 item23  <NA> item6 item9    1
2  2       23        2 item1 item323  item1 item4 item5  <NA>    2
3  3        2        2 item1  item78  item3  <NA> item1 item5    4

Note: the input DF used above, in reproducible form, is:

Lines <- "  ID  Measure1    Measure2    XO  x1  x2  x3  x4  x5
    1   30  2   item1   item1   item23  NA      item6   item9
    2   23  2   item1   item323 item1   item4   item5   NA      
    3   2   2   item1   item78  item3   NA      item1   item5"

DF <- read.table(text = Lines, header = TRUE, as.is = TRUE)
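
For readers who prefer something that reads like the SAS DO loop, here is a row-by-row sketch in base R. It is an illustration under assumptions rather than a replacement for the vectorised answers: the regular expression used to pick the candidate columns, the name x_cols, and the extra all-NA fourth row are all hypothetical.

```r
# Toy data based on the question, plus a hypothetical all-NA row
DF <- data.frame(
  ID = 1:4,
  XO = rep("item1", 4),
  x1 = c("item1", "item323", "item78", NA),
  x2 = c("item23", "item1", "item3", NA),
  x3 = c(NA, "item4", NA, NA),
  x4 = c("item6", "item5", "item1", NA),
  x5 = c("item9", NA, "item5", NA),
  stringsAsFactors = FALSE
)

# Pick x1, x2, ... by name; "XO" (letter O) does not match the pattern
x_cols <- grep("^x[0-9]+$", names(DF), ignore.case = TRUE, value = TRUE)

DF$Flag <- vapply(seq_len(nrow(DF)), function(i) {
  key <- DF$XO[i]
  if (is.na(key)) return(NA_integer_)  # avoid match() pairing NA with NA
  # match() returns the position of the first equal element, or NA if none
  match(key, unlist(DF[i, x_cols], use.names = FALSE))
}, integer(1))

DF$Flag
```

Selecting the columns by pattern means the code does not need to change when the number of x columns grows, which is the role the array declaration plays in the SAS version.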

Source: https://habr.com/ru/post/1611753/

