Select a row from the data.table with the minimum value

I have a data.table, and I need to compute a new value on it and select the row where that value is smallest.

tb <- data.table(g_id=c(1, 1, 1, 2, 2, 2, 3),
          item_no=c(24,25,26,27,28,29,30),
          time_no=c(100, 110, 120, 130, 140, 160, 160),
          key="g_id")

#    g_id item_no time_no
# 1:    1      24     100
# 2:    1      25     110
# 3:    1      26     120
# 4:    2      27     130
# 5:    2      28     140
# 6:    2      29     160
# 7:    3      30     160

ts  <- 118
gId <- 2

tb[.(gId), list(item_no, tdiff={z=abs(time_no - ts)})]

#    g_id item_no tdiff
# 1:    2      27    12
# 2:    2      28    22
# 3:    2      29    42

Now I need to get the row (actually only the item_no of that row) with the minimum tdiff:

#    g_id item_no tdiff
# 1:    2      27    12

Is it possible to do this in one operation on tb? What is the fastest way to do it (I need to run this operation on around 500,000 rows)?

2 answers

You can try .SD together with [][] chaining.

The key point, in my opinion, is that you should first add the new column and only then find the minimum tdiff.

library(data.table)
tb <- data.table(g_id=c(1, 1, 1, 2, 2, 2, 3),
             item_no=c(24,25,26,27,28,29,30),
             time_no=c(100, 110, 120, 130, 140, 160, 160),
             key="g_id")

ts <- 118

# My solution is quite simple: add tdiff by reference,
# then keep the row with the smallest tdiff in each group
tb[, tdiff := abs(time_no - ts)][, .SD[which.min(tdiff)], by = key(tb)]

Here .SD is the subset of the data for each group, and := adds or updates a column by reference.

Result:

   g_id item_no time_no tdiff
1:    1      26     120     2
2:    2      27     130    12
3:    3      30     160    42
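If only one group is needed and tb should stay unmodified, the same idea can be written without := at all. This is just a sketch built on the question's own data; it subsets by key first and then picks the closest row, so no tdiff column is ever stored:

```r
library(data.table)

tb <- data.table(g_id    = c(1, 1, 1, 2, 2, 2, 3),
                 item_no = c(24, 25, 26, 27, 28, 29, 30),
                 time_no = c(100, 110, 120, 130, 140, 160, 160),
                 key = "g_id")
ts  <- 118
gId <- 2

# Keyed subset to the requested group, then keep the single row
# whose time_no is closest to ts. tb itself is not changed.
res <- tb[.(gId)][which.min(abs(time_no - ts))]

# Only the item number of that row
res$item_no
```

Whether this is faster than the := version on 500,000 rows probably depends on the data, so it is worth timing both.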

With data.table you can chain [][][]: subset the group, compute tdiff, then take the minimum grouped by g_id:

tb[.(gId), list(item_no, tdiff={z=abs(time_no - ts)})][,item_no[which.min(tdiff)],by=g_id]
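Note that the chained call above returns the item number in a column named V1. For many groups, a commonly suggested alternative is to collect row numbers with .I and subset once, which avoids building .SD per group. The following is only a sketch on the question's data (the names idx and best are made up here), and any speed gain is an assumption that should be benchmarked on the real table:

```r
library(data.table)

tb <- data.table(g_id    = c(1, 1, 1, 2, 2, 2, 3),
                 item_no = c(24, 25, 26, 27, 28, 29, 30),
                 time_no = c(100, 110, 120, 130, 140, 160, 160),
                 key = "g_id")
ts <- 118

# .I holds the row numbers of tb. Per group, keep the row number
# of the row with the smallest |time_no - ts|.
idx <- tb[, .I[which.min(abs(time_no - ts))], by = g_id]$V1

# Subset tb once with those row numbers and name the result columns.
best <- tb[idx, .(g_id, item_no, tdiff = abs(time_no - ts))]
best
```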


Source: https://habr.com/ru/post/1534008/

