data.table not recognizing a logical filter

In the following snippet, data.table does not seem to recognize a logical column when it is used in i.

All my attempts to reproduce the problem in a minimal example failed, so I am posting the full section here. I suspect it is related to the as.logical(cumsum(CURRENT_TRIP)) part, but that is just a gut feeling.

    # Testdata
    timetable <- data.table(rbind(
      c("r1", "t1_1", "p1", 10, 10),
      c("r1", "t1_1", "p2", 11, 11),
      c("r1", "t1_1", "p3", 12, 12),
      c("r1", "t1_1", "p4", 13, 13),
      c("r1", "t1_1", "p5", 14, 14),
      c("r1", "t1_1", "p6", 15, 15),
      c("r1", "t1_1", "p7", 16, 16),
      c("r1", "t1_1", "p8", 17, 17),
      c("r1", "t1_1", "p9", 18, 18),
      c("r1", "t1_1", "p10", 19, 19),
      c("r2", "t2", "p11", 9, 9),
      c("r2", "t2", "p12", 10, 10),
      c("r2", "t2", "p3", 11, 11),
      c("r2", "t2", "p13", 12, 12),
      c("r2", "t2", "p14", 13, 13),
      c("r2", "t2", "p15", 14, 14),
      c("r2", "t2", "p16", 15, 15),
      c("r2", "t2", "p17", 16, 16),
      c("r2", "t2", "p18", 17, 17)
    ))
    setnames(timetable, c("ROUTE", "TRIP", "STOP", "ARRIVAL", "DEPARTURE"))
    timetable[, `:=`(ARRIVAL = as.integer(ARRIVAL), DEPARTURE = as.integer(DEPARTURE))]

    # Input
    startStation <- "p3"
    startTime <- 8

    setorder(timetable, TRIP, ARRIVAL)
    timetable[, ID := .I]
    timetable[, `:=`(ARR_ROUND_PREV = Inf, ARR_ROUND = Inf, ARR_BEST = Inf, MARKED = F, CURRENT_TRIP = F)]
    timetable[STOP == startStation, `:=`(ARR_ROUND_PREV = startTime, ARR_ROUND = startTime, ARR_BEST = startTime, MARKED = T)]

    routes <- timetable[MARKED == T, unique(ROUTE)]
    ids <- timetable[MARKED == T & DEPARTURE > ARR_ROUND, .(ID = ID[DEPARTURE == min(DEPARTURE)]), by = ROUTE][, ID]
    timetable[ID %in% ids, CURRENT_TRIP := T]
    timetable[, MARKED := F]
    trips <- timetable[CURRENT_TRIP == T, unique(TRIP)]
    timetable[TRIP %in% trips, CURRENT_TRIP := as.logical(cumsum(CURRENT_TRIP)), by = TRIP]

    # ?
    timetable
    nrow(timetable[CURRENT_TRIP == T])   # 8
    sum(timetable$CURRENT_TRIP == T)     # 15

    # but
    nrow(timetable[CURRENT_TRIP > 0])    # 15
    nrow(timetable[CURRENT_TRIP == 1L])  # 15

Any ideas?

The problem occurs with the latest data.table versions, 1.9.7 and 1.9.6, on R 3.2.3 (64-bit Windows).

Fab

1 answer

You have run into exactly the same problem I did!

Strange problem with finding rows in data.table

I could not reproduce it with minimal code either!

My workaround for your code is to change how the CURRENT_TRIP column is set.

    timetable[ID %in% ids]$CURRENT_TRIP <- T
    timetable[, MARKED := F]
    trips <- timetable[CURRENT_TRIP == T, unique(TRIP)]
    timetable[TRIP %in% trips]$CURRENT_TRIP <- timetable[, as.logical(cumsum(CURRENT_TRIP)), by = TRIP]$V1

    # ?
    timetable
    nrow(timetable[CURRENT_TRIP == T])   # 8
    sum(timetable$CURRENT_TRIP == T)     # 15

    # but
    nrow(timetable[CURRENT_TRIP > 0])    # 15
    nrow(timetable[CURRENT_TRIP == 1L])  # 15

Using the DT[, Column := T] notation to set columns also caused the same problem for me! I am not sure why, and I am in contact with the data.table author to get this fixed.
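One way to narrow this down is to compare a subset in i against a plain vector scan and to look at the secondary indices attached to the table. The sketch below assumes, without confirmation from this thread, that a stale automatic index on the updated column could explain the mismatch. The table `dt` and its columns are hypothetical toy data, not the timetable from the question.

```r
library(data.table)

# Hypothetical toy table standing in for the timetable
dt <- data.table(
  TRIP = rep(c("a", "b"), each = 4),
  FLAG = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE)
)

# A `==` subset in i may create a secondary index on FLAG as a side effect
invisible(dt[FLAG == TRUE])
idx_before <- indices(dt)  # names of attached secondary indices, or NULL

# Update the same column by reference within groups, as in the question
dt[, FLAG := as.logical(cumsum(FLAG)), by = TRIP]

n_subset <- nrow(dt[FLAG == TRUE])  # subset in i, may use an index
n_scan   <- sum(dt$FLAG)            # plain vector scan, never uses an index

# If the two counts disagree, dropping all indices and retrying shows
# whether an index is involved
setindex(dt, NULL)
n_after_drop <- nrow(dt[FLAG == TRUE])

print(c(subset = n_subset, scan = n_scan, after_drop = n_after_drop))
```

If the counts only agree after the indices are dropped, setting `options(datatable.auto.index = FALSE)` before running the original code is a possible way to check the same thing on the full timetable.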


Source: https://habr.com/ru/post/1239021/

