I have the following data. table
set.seed(1) DT <- data.table(VAL = sample(c(1, 2, 3), 10, replace = TRUE)) VAL 1: 1 2: 2 3: 2 4: 3 5: 1 6: 3 7: 3 8: 2 9: 2 10: 1
Inside each VAL number I want:
- Count the number of records / rows
- Create a row index (counter) of the first, second, third occurrence, etc.
In the end I want to get the result
VAL COUNT IDX 1: 1 3 1 2: 2 4 1 3: 2 4 2 4: 3 3 1 5: 1 3 2 6: 3 3 2 7: 3 3 3 8: 2 4 3 9: 2 4 4 10: 1 3 3
where "COUNT" is the number of records / rows for each "VAL", and "IDX" is the row index in each "VAL".
I tried working with which and length with .I :
dt[, list(COUNT = length(VAL == VAL[.I]), IDX = which(which(VAL == VAL[.I]) == .I))]
but this does not work, since .I refers to a vector with an index, so I assume that I need to use .I[] . Although inside .I[] I again run into the problem that I don't have a row index, and I know (from reading the data.table frequently asked questions and after the messages here) that line alternation should be avoided whenever possible.
So what is the data.table style?
r data.table
Simon Z. Nov 08 '13 at 21:50 2013-11-08 21:50
source share