Count the number of records and create a row number in each group in the data table.

I have the following data. table

set.seed(1) DT <- data.table(VAL = sample(c(1, 2, 3), 10, replace = TRUE)) VAL 1: 1 2: 2 3: 2 4: 3 5: 1 6: 3 7: 3 8: 2 9: 2 10: 1 

Inside each VAL number I want:

  • Count the number of records / rows
  • Create a row index (counter) of the first, second, third occurrence, etc.

In the end I want to get the result

  VAL COUNT IDX 1: 1 3 1 2: 2 4 1 3: 2 4 2 4: 3 3 1 5: 1 3 2 6: 3 3 2 7: 3 3 3 8: 2 4 3 9: 2 4 4 10: 1 3 3 

where "COUNT" is the number of records / rows for each "VAL", and "IDX" is the row index in each "VAL".

I tried working with which and length with .I :

  dt[, list(COUNT = length(VAL == VAL[.I]), IDX = which(which(VAL == VAL[.I]) == .I))] 

but this does not work, since .I refers to a vector with an index, so I assume that I need to use .I[] . Although inside .I[] I again run into the problem that I don't have a row index, and I know (from reading the data.table frequently asked questions and after the messages here) that line alternation should be avoided whenever possible.

So what is the data.table style?

+42
r data.table
Nov 08 '13 at 21:50
source share
1 answer

Using .N ...

 DT[ , `:=`( COUNT = .N , IDX = 1:.N ) , by = VAL ] # VAL COUNT IDX # 1: 1 3 1 # 2: 2 4 1 # 3: 2 4 2 # 4: 3 3 1 # 5: 1 3 2 # 6: 3 3 2 # 7: 3 3 3 # 8: 2 4 3 # 9: 2 4 4 #10: 1 3 3 

.N - the number of records in each group with groups defined by "VAL" .

+65
Nov 08 '13 at 21:53
source share



All Articles