How can I efficiently subset multiple data.frame objects in a list?

Question

I have several data.frame objects in a list, and I want to filter each of them on its last column (the score). Subsetting each list element is straightforward, but I want two sets (pass and fail) as the result of filtering each data.frame object. The approach I used does not feel elegant, and I am looking for a cleaner and more efficient solution. Can someone suggest a better way to do this? Many thanks!

Toy data:

mylist <- list(df1=data.frame( from=seq(1, by=4, len=16), to=seq(3, by=4, len=16), score=sample(30, 16)),
               df2=data.frame( from=seq(3, by=7, len=20), to=seq(6, by=7, len=20), score=sample(30, 20)),
               df3=data.frame( from=seq(4, by=8, len=25), to=seq(7, by=8, len=25), score=sample(30, 25)))

My initial attempt:

pass <- lapply(mylist, function(ele_) {
  ans <- subset(ele_, ele_$score > 20)
  ans
})

I also want the complementary set, that is, the rows that do not satisfy the filter condition, and I want to keep both pass and fail for each data.frame object in one list.

list r dataframe subset

Andy.Jian 21 . '16 17:09

akrun · Accepted Answer · 2016-09-21T17:14:05+0000

One option is data.table:

library(data.table)
lapply(mylist, function(x) setDT(x)[score > 20])

Or use filter from dplyr together with map from purrr:

library(dplyr)
library(purrr)
mylist %>% 
      map(filter, score > 20)

Alternatively, rbind the list into a single data set (rbindlist from data.table or bind_rows from dplyr) and then filter by group.

rbindlist(mylist, idcol= 'grp')[score > 20, .SD , by = .(grp)]

Or with dplyr:

mylist %>% 
    bind_rows(., .id = 'grp') %>%
    group_by(grp) %>%
    filter(score > 20)
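
Filtering the combined table keeps only the passing rows. If both outcomes are needed, one alternative is to label every row instead of dropping the failing ones. The following is a minimal base R sketch of that idea, not the method from the answer above; the status column name and the seed are assumptions added for illustration, and the toy data is a shortened copy of the data in the question.

```r
# Toy data in the same shape as the question (seed added only for reproducibility)
set.seed(1)
mylist <- list(
  df1 = data.frame(from = seq(1, by = 4, length.out = 16),
                   to = seq(3, by = 4, length.out = 16),
                   score = sample(30, 16)),
  df2 = data.frame(from = seq(3, by = 7, length.out = 20),
                   to = seq(6, by = 7, length.out = 20),
                   score = sample(30, 20))
)

# Stack the data.frames, recording the originating list name in 'grp'
combined <- do.call(
  rbind,
  Map(function(x, nm) cbind(grp = nm, x), mylist, names(mylist))
)

# Label each row rather than filtering; 'status' is a hypothetical column name
combined$status <- ifelse(combined$score > 20, "PASS", "FAIL")

# Count of passing and failing rows per original data.frame
table(combined$grp, combined$status)
```

Either set can then be pulled out on demand, for example with combined[combined$status == "FAIL", ].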

If each data.frame should instead be split into a list of 2 (rows with 'score' > 20 and the remaining rows):

lapply(mylist, function(x) split(x, c("FAIL", "PASS")[(x$score > 20)+1]))
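
When the grouping vector is a plain character vector, a data.frame with no passing (or no failing) rows yields only one element. A variation that may be worth considering is to split on a factor with both levels declared, so that PASS and FAIL are always present, possibly with zero rows. This is a sketch under that assumption; split_pass_fail and its cutoff argument are hypothetical names, and the seed is added only for reproducibility.

```r
# Toy data in the same shape as the question (seed added only for reproducibility)
set.seed(1)
mylist <- list(
  df1 = data.frame(from = seq(1, by = 4, length.out = 16),
                   to = seq(3, by = 4, length.out = 16),
                   score = sample(30, 16)),
  df2 = data.frame(from = seq(3, by = 7, length.out = 20),
                   to = seq(6, by = 7, length.out = 20),
                   score = sample(30, 20))
)

# Hypothetical helper: split one data.frame into FAIL and PASS parts.
# Declaring both factor levels keeps an empty group as a zero-row data.frame.
split_pass_fail <- function(x, cutoff = 20) {
  status <- factor(ifelse(x$score > cutoff, "PASS", "FAIL"),
                   levels = c("FAIL", "PASS"))
  split(x, status)
}

res <- lapply(mylist, split_pass_fail)

# Each element of 'res' is a list with components FAIL and PASS
res$df1$PASS
res$df2$FAIL
```

The result is a nested list, so res[["df1"]][["PASS"]] and res$df1$PASS refer to the same data.frame.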
