as.data.frame() of table() for summarizing frequencies

In R, I am looking for an efficient way to create a tabular summary of data as follows.

Take, for example, the data.frame foo, which I summarized with table() and then passed to as.data.frame() to get the frequencies.

foo <- data.frame(x = c('a', 'a', 'a', 'b', 'b', 'b'), y = c('ab', 'ac', 'ad', 'ae', 'fx', 'fy'))
bar <- as.data.frame(table(foo), stringsAsFactors = FALSE)

This gives the following frequencies in bar:

   x  y Freq
1  a ab    1
2  b ab    0
3  a ac    1
4  b ac    0
5  a ad    1
6  b ad    0
7  a ae    0
8  b ae    1
9  a fx    0
10 b fx    1
11 a fy    0
12 b fy    1

The problem I am facing is that when x and y have many levels, this starts to use a significant amount of memory (more than 64 GB). I was wondering if there is an alternative way to compute this many frequencies. As a first step I set stringsAsFactors=FALSE, but this does not completely solve the problem.
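
To illustrate where the memory goes, here is a small sketch in base R. It compares the number of rows that as.data.frame(table(foo)) produces (the product of the number of distinct values in each column) with the number of combinations that actually occur. The names n_cells and n_observed are made up for this illustration.

```r
foo <- data.frame(x = c("a", "a", "a", "b", "b", "b"),
                  y = c("ab", "ac", "ad", "ae", "fx", "fy"))

# table() allocates one cell per combination of distinct values,
# whether or not that combination occurs in the data
n_cells <- prod(sapply(foo, function(col) length(unique(col))))

# Combinations that actually occur in foo
n_observed <- nrow(unique(foo))

c(cells = n_cells, observed = n_observed)
```

For foo this is 12 cells against 6 observed combinations, which matches the 12 rows shown above. With many levels in x and y the gap grows multiplicatively.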


One alternative to table() is the following function, which uses ninteraction from plyr.

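tab <- function(df, drop = TRUE) {
  # One integer id per distinct combination of column values
  id <- plyr::ninteraction(df)
  ord <- order(id)

  # Sort rows so that identical combinations are adjacent
  df <- df[ord, , drop = FALSE]
  id <- id[ord]

  # Run lengths of the sorted ids are the frequencies
  freq <- rle(id)$lengths
  # The last row of each run supplies the labels
  labels <- plyr::unrowname(df[cumsum(freq), , drop = FALSE])

  data.frame(labels, freq)
}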
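
For readers without plyr, the same sort-and-run-length idea can be sketched in base R. This is only an illustration, not the function above: tab_base is a hypothetical name, the row ids are built by pasting the columns together, and the "\r" separator is an assumption that this character does not appear in the data.

```r
# Hypothetical base-R variant: counts only combinations that occur in df
tab_base <- function(df) {
  # One string key per row; assumes "\r" never appears in the values
  key <- do.call(paste, c(unname(as.list(df)), sep = "\r"))
  id <- match(key, unique(key))
  ord <- order(id)

  # Sort rows so that identical combinations are adjacent
  df <- df[ord, , drop = FALSE]
  freq <- rle(id[ord])$lengths

  # The last row of each run supplies the labels
  labels <- df[cumsum(freq), , drop = FALSE]
  rownames(labels) <- NULL

  data.frame(labels, freq)
}

foo <- data.frame(x = c("a", "a", "a", "b", "b", "b"),
                  y = c("ab", "ac", "ad", "ae", "fx", "fy"))
foo_dup <- rbind(foo, foo[1, ])  # repeat the first row to get a count of 2
res <- tab_base(foo_dup)
```

The result has one row per observed combination, so zero-count combinations are never materialized.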

Another option is xtabs, which can return a sparse matrix via the Matrix package.
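
A minimal sketch of the xtabs and Matrix route, assuming that the sparse = TRUE argument of xtabs is what is meant here. It works for two factors only; bar_sparse is a name made up for this example.

```r
library(Matrix)  # xtabs(sparse = TRUE) returns a sparse matrix from Matrix

foo <- data.frame(x = factor(c("a", "a", "a", "b", "b", "b")),
                  y = factor(c("ab", "ac", "ad", "ae", "fx", "fy")))

# Sparse contingency table: only nonzero counts are stored
m <- xtabs(~ x + y, data = foo, sparse = TRUE)

# Triplet form of the nonzero cells, with columns i, j and x
nz <- summary(m)

# Convert the nonzero cells back to a long data frame
bar_sparse <- data.frame(x = rownames(m)[nz$i],
                         y = colnames(m)[nz$j],
                         Freq = nz$x)
```

Only the observed combinations end up in bar_sparse, so its size tracks the data rather than the product of the level counts.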

library(plyr)
ddply(foo, ~ x + y, nrow, .drop = FALSE)

Source: https://habr.com/ru/post/1742848/

