Group_by, select the first two rows, then spread()

Question


I am trying to reformat this so that I can generate a data frame of every instance of On Hold Begins and the event immediately after it. On Hold Begins marks the start of an event, and I would like to capture its Timestamp and Deviation, as well as the Timestamp and Deviation of the next event immediately following it (e.g. Below Threshold, Stage Enabled).

If possible, I want to include only fragments whose first event is On Hold Begins (so the ideal solution would not include rows 1 and 2 of the data), I do not want the additional x columns added, and I would like the result formatted as described above. This looks similar to "How can I spread repeated measures of multiple variables into wide format?", but I ran into errors asking for a dictionary when I tried that approach.

Thank you all very much for your help.


r slice dplyr tidyr

longlivebrew Feb 13 '18 at 19:56


2 answers

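Two steps: 1) tag each On Hold Begins row together with the row that follows it; 2) split the tagged rows into "Flag 1" and "Flag 2" tables and bind them side by side.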

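library(dplyr)
library(tidyr)    # fill()
library(stringr)  # str_detect()

df_tidy <- df %>% 
  slice(-1) %>%             #< Drop the first row
  mutate(my_serial = case_when(
    str_detect(Flag, "On Hold Begins") ~ row_number())) %>% 
  fill(my_serial) %>%       #< Assign serials to related records
  filter(!is.na(my_serial)) %>%  #< Drop rows before the first "On Hold Begins"
  group_by(my_serial) %>% 
  slice(1:2) %>%            #< Take the first two records in each set
  filter(n() == 2) %>%      #< Keep complete pairs so bind_cols() gets equal row counts
  mutate(flag_number = if_else(
    str_detect(Flag, "On Hold Begins"), "Flag 1", "Flag 2")) %>% #< Tag records
  ungroup()                 #< Otherwise select() re-adds my_serial

df_1 <- df_tidy %>% 
  filter(flag_number %in% "Flag 1") %>% 
  select(Stage, Flag, Timestamp, Deviation) %>%  #< Leaves out the x column
  setNames(paste0("Flag 1_", names(.)) ) 

df_2 <- df_tidy %>% 
  filter(flag_number %in% "Flag 2") %>% 
  select(Stage, Flag, Timestamp, Deviation) %>% 
  setNames(paste0("Flag 2_", names(.)) ) 

bind_cols(df_1, df_2)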


Nettle 14 . '18 5:09

thc · Accepted Answer · 2018-02-13T20:21:33+0000

A simple solution using base R:

first_idx <- which(df$Flag == "On Hold Begins")
second_idx <- first_idx + 1
df_1 <- df[first_idx,]; colnames(df_1) <- paste("Flag 1 ", colnames(df_1))
df_2 <- df[second_idx,]; colnames(df_2) <- paste("Flag 2 ", colnames(df_2))
cbind(df_1, df_2)

   Flag 1  Stage   Flag 1  Flag Flag 1  Timestamp Flag 1  x Flag 1  Deviation Flag 2  Stage    Flag 2  Flag Flag 2  Timestamp Flag 2  x Flag 2  Deviation
3              a On Hold Begins     4/29/17 15:34         1             1.200             a Below Threshold     4/29/17 15:35         1            0.0000
6              a On Hold Begins     4/29/17 21:49         5             1.200             a Below Threshold     4/29/17 21:50         5            0.0000
10             a On Hold Begins     4/29/17 23:29         6             1.200             a Below Threshold     4/29/17 23:30         6            0.0000
12             a On Hold Begins     5/16/17 17:22         8             1.774             a   Stage Enabled     5/16/17 17:39         8            1.8973
15             a On Hold Begins     5/16/17 19:14         9             1.095             a Below Threshold     5/16/17 19:15         9           -0.2252
21             b On Hold Begins     4/28/17 22:05       125             1.200             b    On Hold Ends     4/28/17 22:07       125            1.2000
24             b On Hold Begins     4/28/17 23:29       128             1.200             b Below Threshold     4/28/17 23:30       128            0.0000
26             b On Hold Begins      4/29/17 1:53       133             1.200             b Below Threshold      4/29/17 1:55       133            0.0000
29             b On Hold Begins      4/29/17 2:40       135             1.200          <NA>            <NA>              <NA>        NA                NA
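The last row of the output above is all NA on the "Flag 2" side, presumably because the final On Hold Begins is the last row of df, so first_idx + 1 points past the end. Below is a minimal sketch of one way to guard against that, still in base R. It also drops pairs whose next row belongs to a different Stage and keeps only Flag, Timestamp and Deviation (no x column). The same-Stage rule is an assumption, not something the question states, and the toy data frame and the names keep, wanted and result are made up for illustration.

```r
# Toy data: hypothetical values, same column names as in the question
df <- data.frame(
  Stage     = c("a", "a", "a", "a", "b", "b", "b"),
  Flag      = c("Stage Enabled", "On Hold Begins", "Below Threshold",
                "On Hold Begins", "On Hold Begins", "On Hold Ends",
                "On Hold Begins"),
  Timestamp = c("t1", "t2", "t3", "t4", "t5", "t6", "t7"),
  Deviation = c(0, 1.2, 0, 1.2, 1.2, 1.2, 1.2),
  stringsAsFactors = FALSE
)

first_idx  <- which(df$Flag == "On Hold Begins")
second_idx <- first_idx + 1

# Keep a pair only if the next row exists ...
in_range <- second_idx <= nrow(df)
# ... and (assumption) belongs to the same Stage; pmin() clamps the
# out-of-range index so the comparison never yields NA
same_stage <- df$Stage[first_idx] == df$Stage[pmin(second_idx, nrow(df))]
keep <- in_range & same_stage

wanted <- c("Flag", "Timestamp", "Deviation")
df_1 <- df[first_idx[keep],  wanted]
df_2 <- df[second_idx[keep], wanted]
names(df_1) <- paste0("Flag1_", names(df_1))
names(df_2) <- paste0("Flag2_", names(df_2))

result <- cbind(Stage = df$Stage[first_idx[keep]], df_1, df_2)
rownames(result) <- NULL
print(result)
```

With the toy data, the On Hold Begins at row 4 is dropped because the next row is in Stage b, and the one at row 7 is dropped because nothing follows it. If cross-Stage pairs should be kept, use keep <- in_range instead.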
