You need two things here. First, this is how to get the most frequent element of a vector:
    > v = c(1,1,1,2,2)
    > names(which.max(table(v)))
    [1] "1"
The result is a character string, but we can easily apply as.numeric to it if necessary.
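As a minimal sketch, the two steps can be wrapped in a small helper. The name `most_frequent` is made up for illustration and is not part of base R; the sketch also assumes the input is numeric, since it converts the name back with as.numeric. It shows what happens on a tie as well:

```r
# Hypothetical helper (illustrative name): return the most frequent value
# of a numeric vector as a number rather than as a character string.
most_frequent <- function(x) {
  counts <- table(x)                     # one count per distinct value, sorted by value
  as.numeric(names(which.max(counts)))   # which.max() returns the first maximum
}

v <- c(1, 1, 1, 2, 2)
most_frequent(v)              # 1

# On a tie the smallest value wins, because table() sorts its levels
# and which.max() keeps the first maximum it sees.
most_frequent(c(3, 3, 1, 1))  # 1
```

If ties matter for your data, you may want to handle them explicitly instead of relying on this ordering.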
Once we know how to do that, we can use the grouping functionality of the data.table package to find the most common category for each item. Here is the code for your example above:
    > library(data.table)
    > dt = data.table(item=c(1,1,1,1,2,2,2,2), category=c(2,3,2,2,2,3,1,1))
    > dt
       item category
    1:    1        2
    2:    1        3
    3:    1        2
    4:    1        2
    5:    2        2
    6:    2        3
    7:    2        1
    8:    2        1
    > dt[,as.numeric(names(which.max(table(category)))),by=item]
       item V1
    1:    1  2
    2:    2  1
The new column V1 contains the most common category for each item, converted to numeric. If you want to give it a proper name, the syntax is a little uglier:
    > dt[,list(mostFreqCat=as.numeric(names(which.max(table(category))))),by=item]
       item mostFreqCat
    1:    1           2
    2:    2           1
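If the round trip through character names bothers you, one possible alternative is to count the rows per (item, category) pair with `.N` and then keep the category with the largest count for each item. This is only a sketch of a different approach to the same task, not the method used above. The intermediate name `counts` is chosen for illustration, and `.()` is data.table's shorthand for `list()`.

```r
library(data.table)

dt <- data.table(item     = c(1, 1, 1, 1, 2, 2, 2, 2),
                 category = c(2, 3, 2, 2, 2, 3, 1, 1))

# Step 1: count how often each category occurs within each item.
counts <- dt[, .N, by = .(item, category)]

# Step 2: for each item, keep the category whose count is largest.
# The category column keeps its original type, so no as.numeric() is needed.
res <- counts[, .(mostFreqCat = category[which.max(N)]), by = item]
res
```

Note that ties are resolved differently here: the category that appears first in the data for that item wins, rather than the smallest one.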