First make data.tables of them:
dt.lst = lapply(data.lst, as.data.table)
Stacking. For comparison, here is the fast way, which involves stacking all the tables:
res0 = rbindlist(dt.lst)[, .(n = .N), by=V1:V2]
The OP said this was not possible because the intermediate result made by rbindlist would be too large.
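To make the stacking approach concrete, here is a minimal sketch on toy input. The two small matrices in `data.lst` are hypothetical and only stand in for the OP's data; the rest follows the same two steps as above.

```r
library(data.table)

# Hypothetical toy input: a list of two-column integer matrices
data.lst = list(
  matrix(c(1L, 1L, 2L,
           1L, 1L, 3L), ncol = 2),  # rows: (1,1), (1,1), (2,3)
  matrix(c(1L, 2L,
           1L, 3L), ncol = 2)       # rows: (1,1), (2,3)
)

# Matrices without column names become data.tables with columns V1, V2
dt.lst = lapply(data.lst, as.data.table)

# Stack all tables into one, then count each (V1, V2) pair
res0 = rbindlist(dt.lst)[, .(n = .N), by = V1:V2]
print(res0)
```

The stacked table has as many rows as all inputs combined, which is exactly the intermediate result the OP wants to avoid.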
Enumerating first. With a small range of values, I'd suggest enumerating them all up front:
res1 = CJ(V1 = 1:1000, V2 = 1:1000)[, n := 0L]
for (k in seq_along(dt.lst)) res1[ dt.lst[[k]], n := n + .N, by=.EACHI ]
fsetequal(res0, res1[n>0]) # TRUE
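The grid built by `CJ` has one row per combination of its arguments, so its size is the product of the range lengths. A small sketch with hypothetical ranges (the 3 and 4 are chosen only for illustration), plus a rough memory estimate that assumes three 4-byte integer columns:

```r
library(data.table)

# Hypothetical small ranges, chosen only for illustration
grid = CJ(V1 = 1:3, V2 = 1:4)
nrow(grid)   # one row per (V1, V2) combination: 3 * 4
key(grid)    # CJ returns a keyed table, so the joins in the loop need no `on=`

# Rough size of the 1000 x 1000 grid with its integer count column:
# rows * 3 integer columns * 4 bytes each
n_rows = 1000 * 1000
approx_bytes = n_rows * 3 * 4
approx_bytes / 1e6   # size in megabytes
```

The estimate grows linearly with the number of combinations, which is what makes this approach sensitive to the range of values.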
The OP pointed out that there are 1e12 possible values, so this does not seem to be a good idea. Instead, we can use
res2 = dt.lst[[1L]][0L]
for (k in seq_along(dt.lst)) res2 = funion(res2, dt.lst[[k]])
res2[, n := 0L]
setkey(res2, V1, V2)
for (k in seq_along(dt.lst)) res2[ dt.lst[[k]], n := n + .N, by=.EACHI ]
fsetequal(res0, res2) # TRUE
This is the slowest of the three options so far for the example given, but it seems like the best one to me in light of the OP's concerns.
Growing inside the loop. Finally...
res3 = dt.lst[[1L]][0L][, n := NA_integer_][]
for (k in seq_along(dt.lst)){
  setkey(res3, V1, V2)
  res3[dt.lst[[k]], n := n + .N, by=.EACHI ]
  res3 = rbind(
    res3,
    fsetdiff(dt.lst[[k]], res3[, !"n", with=FALSE], all=TRUE)[, .(n = .N), by=V1:V2]
  )
}
fsetequal(res0, res3) # TRUE
Growing objects inside a loop is strongly discouraged and inefficient in R, but this lets you do everything in one loop instead of two.
Other options and notes. I suspect you'd be best off using a hash. Those are available in the hash package and probably also through the Rcpp package.
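As a rough illustration of the hash idea, here is a sketch that uses a base R environment as a string-keyed hash table instead of the hash package or Rcpp. The toy `data.lst`, the `"V1_V2"` key format and the name `res_hash` are all assumptions made for this example, and I have not benchmarked it against the data.table variants.

```r
# Hypothetical toy input: a list of two-column integer matrices
data.lst = list(
  matrix(c(1L, 1L, 2L, 1L, 1L, 3L), ncol = 2),  # rows: (1,1), (1,1), (2,3)
  matrix(c(1L, 2L, 1L, 3L), ncol = 2)           # rows: (1,1), (2,3)
)

counts = new.env(hash = TRUE)  # environment used as a hash table

for (m in data.lst) {
  keys = paste(m[, 1], m[, 2], sep = "_")        # one string key per row
  tab  = table(keys)                             # counts within this matrix
  cnt  = setNames(as.integer(tab), names(tab))   # plain named integer vector
  for (key in names(cnt)) {
    # Start from 0 if the key has not been seen in an earlier matrix
    old = if (exists(key, envir = counts, inherits = FALSE)) counts[[key]] else 0L
    counts[[key]] = old + cnt[[key]]
  }
}

# Collect the totals into a named integer vector
res_hash = unlist(mget(ls(counts), envir = counts))
print(res_hash)
```

Only one matrix is processed at a time, so memory use scales with the number of distinct pairs rather than with the total number of rows.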
fsetequal, fsetdiff and funion are recent additions to the development version of the package. You can find out more on the data.table project's official site.
By the way, if the rows within each matrix are distinct, you can replace .N with 1L everywhere above and drop by=.EACHI and all=TRUE.
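A minimal sketch of that simplification, assuming rows really are distinct within each table. The two toy tables here are hypothetical; the structure mirrors the funion-based variant above.

```r
library(data.table)

# Hypothetical toy tables; rows are distinct within each table
dt.lst = list(
  data.table(V1 = c(1L, 2L), V2 = c(1L, 3L)),
  data.table(V1 = c(1L, 4L), V2 = c(1L, 4L))
)

# Collect the distinct (V1, V2) pairs seen in any table
res = dt.lst[[1L]][0L]
for (k in seq_along(dt.lst)) res = funion(res, dt.lst[[k]])
res[, n := 0L]
setkey(res, V1, V2)

# Each table matches a given row of res at most once,
# so a plain update join that adds 1L is enough
for (k in seq_along(dt.lst)) res[dt.lst[[k]], n := n + 1L]
print(res)
```

If a table did contain repeated rows, this version would undercount them, so it is only safe when the distinctness assumption holds.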