Mark the beginning and end of groups

Consider a data.table of the form

     seller    buyer      month  
1: 50536344 61961225 1993-01-01  
2: 50536344 61961225 1993-02-01 
3: 50536344 61961225 1993-04-01 
4: 50536344 61961225 1993-05-01 
5: 50536344 61961225 1993-06-01

where I have (buyer, seller) pairs over time. I want to mark the beginning and end of each run of consecutive months for a pair. For example, the pair above is present from January to February, absent in March, and present again from April to June. The expected result is therefore:

     seller    buyer      month  start    end
1: 50536344 61961225 1993-01-01   True  False
2: 50536344 61961225 1993-02-01  False   True
3: 50536344 61961225 1993-04-01   True  False
4: 50536344 61961225 1993-05-01  False  False
5: 50536344 61961225 1993-06-01  False   True
1 answer

Assuming that month is of class Date (or similarly POSIXt, IDateTime, or another class with a diff method), you can use the diff function to do this.

# sort data.table
setkeyv(dt, c("seller", "buyer", "month"))
# define start
dt[, start := c(TRUE, diff(month) > 31), by = list(seller, buyer)]
# define end
dt[, end := c(diff(month) > 31, TRUE), by = list(seller, buyer)]
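For anyone who wants to try this, here is a minimal reproducible sketch. It assumes the data.table package is installed and rebuilds the five sample rows from the question by hand; the object name dt and the column types (integer IDs, Date months) are assumptions, since the question does not state them.

```r
library(data.table)

# Sample rows from the question, rebuilt by hand (types are an assumption)
dt <- data.table(
  seller = 50536344L,
  buyer  = 61961225L,
  month  = as.Date(c("1993-01-01", "1993-02-01", "1993-04-01",
                     "1993-05-01", "1993-06-01"))
)

# Sort by pair and month so that diff() compares neighbouring months
setkeyv(dt, c("seller", "buyer", "month"))

# A row starts a run if it is the first row of the pair
# or comes more than 31 days after the previous row
dt[, start := c(TRUE, diff(month) > 31), by = list(seller, buyer)]

# A row ends a run if it is the last row of the pair
# or the next row comes more than 31 days later
dt[, end := c(diff(month) > 31, TRUE), by = list(seller, buyer)]

print(dt)
```

On this sample the start and end columns match the expected result shown in the question.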

EDIT: As suggested in a comment, both columns can also be assigned in a single call:

dt[, ":=" (start = c(TRUE, diff(month) > 31),
           end = c(diff(month) > 31, TRUE)), 
   by = list(seller, buyer)]

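EDIT2: To explain the logic: the first row of each group is always a start, hence start = c(TRUE, ...). Any later row is a start if it comes more than one month (31 days) after the previous row, that is, if diff(month) > 31. Putting the two together gives c(TRUE, diff(month) > 31). The end column follows the same reasoning in reverse: the last row of each group is always an end, and any earlier row is an end if the next row comes more than 31 days later.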
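To make the gap logic concrete, here is a small base R sketch on the five months from the question, outside of data.table. The variable names gap and is_break are only illustrative and do not appear in the answer's code.

```r
# The five months of the sample pair
month <- as.Date(c("1993-01-01", "1993-02-01", "1993-04-01",
                   "1993-05-01", "1993-06-01"))

# Days between each row and the previous one; one element shorter than month
gap <- as.numeric(diff(month))
print(gap)       # 31 59 30 31

# Only the February -> April step exceeds 31 days
is_break <- gap > 31
print(is_break)  # FALSE TRUE FALSE FALSE

# Pad with TRUE at the front (first row always starts a run)
# or at the back (last row always ends a run)
start <- c(TRUE, is_break)
end   <- c(is_break, TRUE)
```

Consecutive calendar months are never more than 31 days apart, while skipping a single month gives at least 59 days, so the threshold of 31 separates the two cases cleanly for data recorded on the first of each month.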


Source: https://habr.com/ru/post/1621046/

