R data.table subsetting with comparison

I am using data.table for fast subsetting. However, when I subset not on key values equal to some value but on key values smaller than it, it takes a lot of time. For instance:

DT["2"]

is fast, whereas

DT[key<2]

is slow.

I assume that the first is a binary search and the second is a vector scan, but how can I make the second one fast?

Thanks for any answers.

1 answer

Usually, when you subset on a key column, to get a fast subset based on binary search you have to write:

DT[J(values)] # assuming subset here is on the first key column.
# (or)
DT[.(values)] # idem

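That is, J or . is required. For convenience, when the key is of character type, which is unambiguous, data.table lets you do a binary-search subset without J or ., like this: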

DT["a"]       # subset on the first key column if one exists
# (or)
DT[J("a")]    # idem
# (or)
DT[.("a")]    # idem

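This is only a convenience. However, data.table's i argument is also designed to subset by row number. That is, with DT[2], where 2 is numeric, data.table returns the second row rather than a binary-search subset on the key. This is intended behavior.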
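To make the difference concrete, here is a minimal sketch. The table and its column names (id, val) are made up for illustration and are not from the question.

```r
library(data.table)

# Hypothetical toy table; id and val are illustrative names only.
DT <- data.table(id = c(10, 20, 30), val = c("a", "b", "c"))
setkey(DT, id)          # sort by id and mark it as the key

by_row   <- DT[2]       # numeric i: returns the second row
by_key   <- DT[J(20)]   # keyed lookup: rows whose id equals 20
no_match <- DT[J(2)]    # no id equals 2, so val comes back as NA
```

Here DT[2] and DT[J(20)] happen to return the same row only because 20 is the second key value, while DT[J(2)] finds no match at all.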

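DT[J(.)], as we have seen, performs a binary search on the key, which takes O(log n) time rather than scanning the whole column. DT[x < .] is a vector-scan subset: x < a is evaluated for every value of x, and the rows for which it is true are returned.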
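One possible workaround, sketched here as an assumption rather than as something the answer prescribes, relies on the fact that setkey sorts the table by the key. The rows below a threshold are then a prefix of the table, and base R's findInterval can locate where that prefix ends. The table, the column names (id, val), and the threshold are hypothetical. The left.open argument needs R 3.3.0 or later, and findInterval may still check that the vector is sorted, so this is not guaranteed to be strictly logarithmic.

```r
library(data.table)

# Hypothetical toy table; id and val are illustrative names only.
DT <- data.table(id = c(5, 1, 3, 1, 2, 4), val = letters[1:6])
setkey(DT, id)  # DT is now sorted by id

threshold <- 2

# Number of key values strictly below the threshold. Because DT is
# sorted by id, those are exactly the first n_below rows.
n_below <- findInterval(threshold, DT$id, left.open = TRUE)
fast <- DT[seq_len(n_below)]

# Vector-scan version, for comparison.
slow <- DT[id < threshold]
```

Both subsets contain the same rows in the same order. The first avoids building a logical vector over the whole column, which is where the vector scan spends its time.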

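PS: Note that DT["2"] is not the same as DT[key < 2], because the key is of character type there. With a numeric (or integer) key, the keyed lookup would be written DT[J(2)].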

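Also, the two are not equivalent operations. DT[J(2)] looks for rows whose key is exactly 2 in DT, whereas DT[key < 2] returns every row whose key lies in the range [min(key), 2).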
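A small sketch of that difference, using a made-up table (the names id and val are illustrative only):

```r
library(data.table)

# Hypothetical toy table with a numeric key.
DT <- data.table(id = c(1, 1, 2, 3), val = c("a", "b", "c", "d"))
setkey(DT, id)

exact <- DT[J(2)]     # rows whose id is exactly 2
below <- DT[id < 2]   # rows whose id lies in [min(id), 2)
```

The first returns the single row with id equal to 2, and the second returns the two rows with id equal to 1.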

Source: https://habr.com/ru/post/1532453/

