In R, I am looking for an efficient way to create a tabular summary of data as follows.
Take, for example, the data.frame foo, which I summarize with table() and then convert with as.data.frame() to get the frequencies.
foo <- data.frame(x = c('a', 'a', 'a', 'b', 'b', 'b'), y = c('ab', 'ac', 'ad', 'ae', 'fx', 'fy'))
bar <- as.data.frame(table(foo), stringsAsFactors = FALSE)
This gives the following frequencies in bar:
x y Freq
1 a ab 1
2 b ab 0
3 a ac 1
4 b ac 0
5 a ad 1
6 b ad 0
7 a ae 0
8 b ae 1
9 a fx 0
10 b fx 1
11 a fy 0
12 b fy 1
The problem I am facing is that when x and y have many levels, this starts to use a significant amount of memory (> 64 GB). I was wondering if there is an alternative way to compute these frequencies. As a first step I set stringsAsFactors=FALSE, but this does not completely solve the problem.
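To make the scale of the problem concrete, here is a minimal sketch on the same toy data. table() allocates one cell for every combination of x and y levels, so the result grows with the product of the two level counts, even though most combinations never occur. The names n_cells, n_observed and observed are only illustrative, and the aggregate() call is one possible way to count just the observed pairs; whether it scales to the full data is untested here.

```r
# Same toy data as in the question.
foo <- data.frame(x = c('a', 'a', 'a', 'b', 'b', 'b'),
                  y = c('ab', 'ac', 'ad', 'ae', 'fx', 'fy'))

# Cells that table(foo) has to allocate: one per (x, y) level combination.
n_cells <- length(unique(foo$x)) * length(unique(foo$y))

# Combinations that actually occur in the data.
n_observed <- nrow(unique(foo))

# Count only the observed combinations by summing a column of ones per
# (x, y) group; zero-count rows are never created.
observed <- aggregate(Freq ~ x + y, data = cbind(foo, Freq = 1L), FUN = sum)
```

Here n_cells is 12 while n_observed is 6, so half of the rows in bar are zeros. With thousands of levels on each side the gap becomes far larger, which would be consistent with the memory usage described above.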