1. zoo. The zoo package has a multi-way merge function that can do this compactly. lapply converts each component of myList into a zoo object, and then we simply merge them all:
    # optionally add nice names to the list
    names(myList) <- paste("t", seq_along(myList), sep = "")

    library(zoo)
    # convert one frequency table into a zoo series indexed by word
    fz <- function(x) with(as.data.frame(x, stringsAsFactors = FALSE), zoo(Freq, Var1))
    out <- do.call(merge, lapply(myList, fz))
The above returns a multivariate zoo series in which the "times" are "a", "ago", etc. If a data frame is desired instead, it is just a matter of as.data.frame(out).
2. Reduce. Here is a second solution. It uses Reduce, which is part of core R.
    # full outer join of two data frames on their first column (the word)
    merge1 <- function(x, y) merge(x, y, by = 1, all = TRUE)
    out <- Reduce(merge1, lapply(myList, as.data.frame, stringsAsFactors = FALSE))

    # optionally add nice names
    colnames(out)[-1] <- paste("t", seq_along(myList), sep = "")
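To make the behaviour of the Reduce approach concrete, here is a minimal sketch on toy data. The names toyList and toyOut and the two tiny tables are made up for illustration and are not part of the original question. It also shows one possible way to turn the NA entries that the outer join produces into zero counts, which is an extra step the solution above does not perform.

```r
# Toy stand-ins for the components of myList (hypothetical data).
t1 <- table(c("a", "b", "b"))
t2 <- table(c("b", "c"))
toyList <- list(t1, t2)

# Same full outer join on the first column as in solution 2.
merge1 <- function(x, y) merge(x, y, by = 1, all = TRUE)
toyOut <- Reduce(merge1, lapply(toyList, as.data.frame, stringsAsFactors = FALSE))
colnames(toyOut)[-1] <- paste("t", seq_along(toyList), sep = "")

# Words missing from a table come back as NA; replace them with 0 if
# plain counts are wanted (an assumption about the desired output).
toyOut[is.na(toyOut)] <- 0
print(toyOut)
```

Each row of toyOut is one word, and each t column holds that word's count in the corresponding table.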
3. xtabs. This adds names to the list and then extracts the frequencies, names and groups as one long vector each, putting them back together using xtabs:
    names(myList) <- paste("t", seq_along(myList))

    xtabs(Freq ~ Names + Group, data.frame(
        Freq = unlist(lapply(myList, unname)),
        Names = unlist(lapply(myList, names)),
        Group = rep(names(myList), sapply(myList, length))
    ))
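As a rough illustration of what the xtabs approach builds, the sketch below runs it on two tiny tables. The names toyList, long and toyTab and the data are hypothetical, and sep = "" is used here so the group labels read "t1" and "t2" rather than "t 1" and "t 2".

```r
# Toy stand-ins for the components of myList (hypothetical data).
t1 <- table(c("a", "b", "b"))
t2 <- table(c("b", "c"))
toyList <- list(t1, t2)
names(toyList) <- paste("t", seq_along(toyList), sep = "")

# One row per (word, table) pair: the "long" form that xtabs expects.
long <- data.frame(
  Freq  = unlist(lapply(toyList, unname)),
  Names = unlist(lapply(toyList, names)),
  Group = rep(names(toyList), sapply(toyList, length))
)

# Cross-tabulate: words become rows, tables become columns, and words
# missing from a table get a count of 0 rather than NA.
toyTab <- xtabs(Freq ~ Names + Group, long)
print(toyTab)
```

Unlike the merge-based solutions, the result is a contingency table, so absent words show up as 0 without any extra step.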
Benchmark
Comparing several solutions with the rbenchmark package gives the following, which indicates that the zoo solution is the fastest on the sample data, although the spread is small (at most about 1.27 times slower), and it is arguably the simplest as well.
    > t1<-table(strsplit(tolower("this is a test in the event of a real word file you would see many more words here"), "\\W"))
    > t2<-table(strsplit(tolower("Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal"), "\\W"))
    > t3<-table(strsplit(tolower("Ask not what your country can do for you - ask what you can do for your country"), "\\W"))
    > myList <- list(t1, t2, t3)
    >
    > library(rbenchmark)
    > library(zoo)
    > names(myList) <- paste("t", seq_along(myList), sep = "")
    >
    > benchmark(xtabs = {
    +     names(myList) <- paste("t", seq_along(myList))
    +     xtabs(Freq ~ Names + Group, data.frame(
    +         Freq = unlist(lapply(myList, unname)),
    +         Names = unlist(lapply(myList, names)),
    +         Group = rep(names(myList), sapply(myList, length))
    +     ))
    + },
    + zoo = {
    +     fz <- function(x) with(as.data.frame(x, stringsAsFactors=FALSE), zoo(Freq, Var1))
    +     do.call(merge, lapply(myList, fz))
    + },
    + Reduce = {
    +     merge1 <- function(x, y) merge(x, y, by = 1, all = TRUE)
    +     Reduce(merge1, lapply(myList, as.data.frame, stringsAsFactors = FALSE))
    + },
    + reshape = {
    +     freqs.list <- mapply(data.frame,Words=seq_along(myList),myList,SIMPLIFY=FALSE,MoreArgs=list(stringsAsFactors=FALSE))
    +     freqs.df <- do.call(rbind,freqs.list)
    +     reshape(freqs.df,timevar="Words",idvar="Var1",direction="wide")
    + }, replications = 10, order = "relative", columns = c("test", "replications", "relative"))
         test replications relative
    2     zoo           10 1.000000
    4 reshape           10 1.090909
    1   xtabs           10 1.272727
    3  Reduce           10 1.272727