Using group_by to select the first two rows, then spread()

I am trying to reshape this data so that I get a data frame containing every instance of "On Hold Begins" and the event immediately after it. "On Hold Begins" marks the beginning of an event, and I would like to capture its Timestamp and Deviation, as well as the Timestamp and Deviation of the event immediately after it (e.g. "Below Threshold", "Stage Enabled").

If possible, I want to include only sequences whose first event is "On Hold Begins" (so the ideal solution would not include rows 1 and 2 described above). I do not want additional x columns to be added, and I would like the result to be formatted as I described. It looks similar to "How can I spread repeated measures of multiple variables into wide format?", but I ran into errors requesting a dictionary when I tried it.

Thank you all very much for your help.

2 answers

A simple solution using base R:

first_idx <- which(df$Flag == "On Hold Begins")
second_idx <- first_idx + 1
df_1 <- df[first_idx,]; colnames(df_1) <- paste("Flag 1 ", colnames(df_1))
df_2 <- df[second_idx,]; colnames(df_2) <- paste("Flag 2 ", colnames(df_2))
cbind(df_1, df_2)

   Flag 1  Stage   Flag 1  Flag Flag 1  Timestamp Flag 1  x Flag 1  Deviation Flag 2  Stage    Flag 2  Flag Flag 2  Timestamp Flag 2  x Flag 2  Deviation
3              a On Hold Begins     4/29/17 15:34         1             1.200             a Below Threshold     4/29/17 15:35         1            0.0000
6              a On Hold Begins     4/29/17 21:49         5             1.200             a Below Threshold     4/29/17 21:50         5            0.0000
10             a On Hold Begins     4/29/17 23:29         6             1.200             a Below Threshold     4/29/17 23:30         6            0.0000
12             a On Hold Begins     5/16/17 17:22         8             1.774             a   Stage Enabled     5/16/17 17:39         8            1.8973
15             a On Hold Begins     5/16/17 19:14         9             1.095             a Below Threshold     5/16/17 19:15         9           -0.2252
21             b On Hold Begins     4/28/17 22:05       125             1.200             b    On Hold Ends     4/28/17 22:07       125            1.2000
24             b On Hold Begins     4/28/17 23:29       128             1.200             b Below Threshold     4/28/17 23:30       128            0.0000
26             b On Hold Begins      4/29/17 1:53       133             1.200             b Below Threshold      4/29/17 1:55       133            0.0000
29             b On Hold Begins      4/29/17 2:40       135             1.200          <NA>            <NA>              <NA>        NA                NA
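
The last row of the output above is all NA because the final "On Hold Begins" has no row after it. If such unpaired events should be dropped, the same indexing idea can be guarded as in the sketch below. The sample data frame is hypothetical (only the column names follow the output above), and the Flag1_/Flag2_ prefixes and the choice of columns to keep are assumptions, not part of the original answer.

```r
# Hypothetical sample data; column names follow the output above.
df <- data.frame(
  Stage     = c("a", "a", "a", "b", "b", "b"),
  Flag      = c("Below Threshold", "On Hold Begins", "Below Threshold",
                "On Hold Begins", "On Hold Ends", "On Hold Begins"),
  Timestamp = c("4/29/17 15:30", "4/29/17 15:34", "4/29/17 15:35",
                "4/28/17 22:05", "4/28/17 22:07", "4/29/17 2:40"),
  x         = c(1, 1, 1, 125, 125, 135),
  Deviation = c(0, 1.2, 0, 1.2, 1.2, 1.2),
  stringsAsFactors = FALSE
)

first_idx  <- which(df$Flag == "On Hold Begins")
second_idx <- first_idx + 1

# Drop any "On Hold Begins" that is the last row and so has no next event.
has_next   <- second_idx <= nrow(df)
first_idx  <- first_idx[has_next]
second_idx <- second_idx[has_next]

# Leave out the x column and use paste0() so names contain no stray spaces.
keep_cols <- c("Stage", "Flag", "Timestamp", "Deviation")
df_1 <- df[first_idx, keep_cols]
df_2 <- df[second_idx, keep_cols]
colnames(df_1) <- paste0("Flag1_", keep_cols)
colnames(df_2) <- paste0("Flag2_", keep_cols)

pairs <- cbind(df_1, df_2)
print(pairs)
```

Note that this still pairs rows purely by position, so an "On Hold Begins" at the end of one Stage would be paired with the first row of the next Stage.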

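A tidyverse approach in two steps: 1) tag each "On Hold Begins" record together with the record that follows it; 2) split the tagged records into two data frames, prefix their columns with "Flag 1" and "Flag 2", and bind them side by side.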

library(tidyverse)  # provides dplyr, tidyr (fill) and stringr (str_detect)

df_tidy <- df %>% 
  slice(-1) %>%             
  mutate(my_serial = case_when(
    str_detect(Flag, "On Hold Begins")~row_number() )) %>% 
    fill(my_serial) %>%     #< Assign serials to related records
  group_by(my_serial) %>% 
  slice(1:2) %>%            #< Take first records in each set
  mutate(flag_number = if_else(
    str_detect(Flag, "On Hold Begins"), "Flag 1", "Flag 2")) #< Tag Records

df_1 <- df_tidy %>% 
  filter(flag_number %in% "Flag 1") %>% 
  select(1:3) %>% 
  setNames(paste0("Flag 1_", names(.)) ) 

df_2 <- df_tidy %>% 
  filter(flag_number %in% "Flag 2") %>% 
  select(1:3) %>% 
  setNames(paste0("Flag 2_", names(.)) ) 

bind_cols(df_1, df_2)
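
One caveat with splitting and then calling bind_cols() is that the two halves are matched purely by row position, so an "On Hold Begins" without a following record can shift the pairing. A possible variation, sketched below, keeps each pair on one row by pulling the next record's values up with dplyr's lead(). The sample data is hypothetical, and grouping by Stage and the next_ column names are assumptions rather than part of the answer above.

```r
library(dplyr)

# Hypothetical sample data; column names follow the question's data.
df <- data.frame(
  Stage     = c("a", "a", "a", "b", "b", "b"),
  Flag      = c("Below Threshold", "On Hold Begins", "Below Threshold",
                "On Hold Begins", "On Hold Ends", "On Hold Begins"),
  Timestamp = c("4/29/17 15:30", "4/29/17 15:34", "4/29/17 15:35",
                "4/28/17 22:05", "4/28/17 22:07", "4/29/17 2:40"),
  x         = c(1, 1, 1, 125, 125, 135),
  Deviation = c(0, 1.2, 0, 1.2, 1.2, 1.2),
  stringsAsFactors = FALSE
)

pairs <- df %>%
  group_by(Stage) %>%                      # assumption: pairs never cross stages
  mutate(
    next_Flag      = lead(Flag),           # values from the following row
    next_Timestamp = lead(Timestamp),
    next_Deviation = lead(Deviation)
  ) %>%
  ungroup() %>%
  # Keep only hold starts that actually have a following event.
  filter(Flag == "On Hold Begins", !is.na(next_Flag)) %>%
  select(Stage, Timestamp, Deviation,
         next_Flag, next_Timestamp, next_Deviation)

print(pairs)
```

Because the next event is attached to the same row before filtering, no serial number or second data frame is needed, and the x column is simply never selected.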

Source: https://habr.com/ru/post/1684940/

