R: Find the first value for each day.

I have a data.frame with datetimes and values (0 or 1), and I would like to find the first occurrence of Value == 1 for each day.

df <- read.table(header = TRUE, text = '
Datetime                   Value
"2016-12-01 23:45:00"      0
"2016-12-01 23:50:00"      1
"2016-12-02 00:05:00"      1
"2016-12-02 00:10:00"      0
"2016-12-03 04:10:00"      0
"2016-12-03 04:15:00"      0
"2016-12-04 12:10:00"      1
"2016-12-04 12:15:00"      1
')
df$Datetime <- as.POSIXct(df$Datetime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
View(df)

I would like:

2016-12-01 23:50:00      1
2016-12-02 00:05:00      1
2016-12-04 12:10:00      1

I tried to solve the problem with match() and aggregate(), but so far no luck. I was also able to solve it with a for loop, but that was a) very slow and b) probably not the way it should be done.

4 answers
df[!duplicated(paste0(as.Date(df$Datetime), df$Value)) & df$Value == 1, ]
#              Datetime Value
# 2 2016-12-01 23:50:00     1
# 3 2016-12-02 00:05:00     1
# 7 2016-12-04 12:10:00     1

Explanation:

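Build a key for each row by pasting the date (as.Date) and the value together with paste0. Then keep the rows whose key is not (!) a repeat (duplicated), that is, the first row for each date/value combination, and of those keep only the rows where the value is 1 (& df$Value == 1).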
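
Splitting the one-liner above into named steps may make it easier to follow. This is only a sketch: the names key, first_of_key, is_one and result are illustrative and do not appear in the original answer, and the sample data is rebuilt so the snippet runs on its own.

```r
# Rebuild the sample data from the question so the snippet is self-contained
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = '
Datetime                   Value
"2016-12-01 23:45:00"      0
"2016-12-01 23:50:00"      1
"2016-12-02 00:05:00"      1
"2016-12-02 00:10:00"      0
"2016-12-03 04:10:00"      0
"2016-12-03 04:15:00"      0
"2016-12-04 12:10:00"      1
"2016-12-04 12:15:00"      1
')
df$Datetime <- as.POSIXct(df$Datetime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# One key per (date, value) combination, e.g. "2016-12-011"
key <- paste0(as.Date(df$Datetime), df$Value)

# TRUE for the first row that carries each key
first_of_key <- !duplicated(key)

# TRUE where the value is 1
is_one <- df$Value == 1

# Keep rows that are both the first of their key and have Value == 1
result <- df[first_of_key & is_one, ]
print(result)
```

Because the key includes the value, a day that starts with a 0 still keeps its first 1: the 0 rows and the 1 rows of the same day get different keys.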


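First keep only the rows where Value == 1. Then, among those rows, find where the date changes: the first row, and every row that follows a date change, is the first Value == 1 of its day. This relies on the rows being sorted by Datetime.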

Ones = df[df$Value == 1,]
DayChange = c(1, which(diff(as.Date(Ones$Datetime)) > 0)+1)
Ones[DayChange,]
             Datetime Value
2 2016-12-01 23:50:00     1
3 2016-12-02 00:05:00     1
7 2016-12-04 12:10:00     1
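
The intermediate values of the diff-based approach can be inspected with a short sketch like the one below. It assumes the rows are already sorted by Datetime, as in the sample data; day_gap is an illustrative name that is not used in the original answer.

```r
# Rebuild the sample data from the question so the snippet is self-contained
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = '
Datetime                   Value
"2016-12-01 23:45:00"      0
"2016-12-01 23:50:00"      1
"2016-12-02 00:05:00"      1
"2016-12-02 00:10:00"      0
"2016-12-03 04:10:00"      0
"2016-12-03 04:15:00"      0
"2016-12-04 12:10:00"      1
"2016-12-04 12:15:00"      1
')
df$Datetime <- as.POSIXct(df$Datetime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Keep only the rows where Value is 1 (original rows 2, 3, 7 and 8)
Ones <- df[df$Value == 1, ]

# Gap in days between each row of Ones and the row before it
day_gap <- as.numeric(diff(as.Date(Ones$Datetime)))
print(day_gap)

# Row 1 always starts a day; a positive gap means the following row starts a new day
DayChange <- c(1, which(day_gap > 0) + 1)
print(Ones[DayChange, ])
```

A gap of 0 marks a row that falls on the same day as the previous one, which is why the second row of 2016-12-04 is dropped.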

With dplyr:

library(dplyr)
df %>%
 #group
 group_by(as.Date(Datetime)) %>%
 #select only those where value equals 1
 filter(Value == 1) %>%
 #get only the first row
 slice(1) %>%
 #ungroup
 ungroup %>%
 #select columns
 select(Datetime, Value)

Output:

# A tibble: 3 x 2
             Datetime Value
               <time> <int>
1 2016-12-01 23:50:00     1
2 2016-12-02 00:05:00     1
3 2016-12-04 12:10:00     1

A variant from @Akrun:

df %>% 
  group_by(Date = as.Date(Datetime)) %>% 
  slice(which(Value==1)[1])

Here is an option using data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)), group by 'Datetime' converted to Date, specify 'i' as Value == 1, get the row index of the first occurrence in each group (.I[1]), and use those indices to subset the rows.

library(data.table)
setDT(df)[df[Value==1, .I[1], .(as.Date(Datetime))]$V1]
#              Datetime Value
#1: 2016-12-01 23:50:00     1
#2: 2016-12-02 00:05:00     1
#3: 2016-12-04 12:10:00     1

Source: https://habr.com/ru/post/1670251/

