Consider using the data processing tools in the plyr package:
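    library(plyr)

    # Simulated data: 31 consecutive days, three records per day, random user IDs.
    # A Date is used because adding integers to the POSIXct value returned by
    # ISOdate() would add seconds, not days.
    startdate <- as.Date("2011-01-01")
    userdata <- data.frame(
      date   = startdate + rep(1:31, each = 3),
      userID = 1 + round(9 * runif(93)),
      x      = round(100 * runif(93))
    )

    # Count the distinct days on which each user appears
    summary <- ddply(userdata, .(userID), summarize,
                     activedays = length(unique(date)))

    # Keep only users active on at least 30 days
    summary[summary$activedays >= 30, ]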
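If it helps to see what the split-apply-combine step is doing, here is a rough base R sketch of the same idea without plyr. The toy data, the name `activity`, and the threshold of 2 days are made up for illustration; they are not from the question.

```r
# Toy data (hypothetical values): one row per activity record
userdata <- data.frame(
  userID = c(1, 1, 1, 2, 2, 3),
  date   = as.Date("2011-01-01") + c(0, 0, 1, 0, 1, 2)
)

# Split the dates by user, then count the distinct dates in each group
activedays <- tapply(userdata$date, userdata$userID,
                     function(d) length(unique(d)))

# Combine the per-user counts back into a data frame
activity <- data.frame(
  userID     = as.numeric(names(activedays)),
  activedays = as.vector(activedays)
)

# Keep users active on at least 2 distinct days (threshold picked for the toy data)
activity[activity$activedays >= 2, ]
```

ddply wraps these three steps (split by `userID`, apply the summary, combine the results) into a single call, which is the main reason to prefer it once there are several grouping variables or summary columns.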
You can learn more about plyr on Hadley Wickham's website: http://had.co.nz/plyr/