llply operations on multiple data frames

Is there an easy way (i.e. without "for" loops) to do the following?

I have several data frames and want to summarise them with a plyr operation. In this example there are two data frames, east and west, and I want to summarise each of them by total spend and trials per country.

Here are the example data frames:

west <- data.frame(
    spend = sample(50:100,50,replace=T),
    trials = sample(100:200,50,replace=T),
    country = sample(c("usa","canada","uk"),50,replace = T)
    )

east <- data.frame(
    spend = sample(50:100,50,replace=T),
    trials = sample(100:200,50,replace=T),
    country = sample(c("china","japan","skorea"),50,replace = T)
    )

and a combined list of both data frames:

combined <- c(west,east)

What I want is to run something like ddply on both data frames at once and get a list back (that seems the simplest output). For a single data frame, it would look like this:

country.df <- ddply(west, .(country), summarise,
    spend = sum(spend),
    trials = sum(trials)
)

With llply, I tried something like this:

countries.list <- llply(combined, .(country), summarise,
    spend = sum(spend),
    trials = sum(trials)
)

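But this fails with an error raised in FUN(X[[1L]], ...).

I realise this could probably be done with something from the apply family, but as I understand it llply is meant to take a list as input, so I expected it to work.

What am I doing wrong?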


Build the list with list() rather than c(), then let lapply pass each data frame to ddply:

combined <- list(east, west)
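
To see why c(west, east) did not work, it helps to compare what the two calls build. A data frame is itself a list of columns, so c() concatenates the columns of both data frames into one flat list, while list() keeps each data frame as a single element. A minimal base-R sketch (the seed is an assumption added only for reproducibility, and flattened is a hypothetical name):

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

west <- data.frame(
  spend   = sample(50:100, 50, replace = TRUE),
  trials  = sample(100:200, 50, replace = TRUE),
  country = sample(c("usa", "canada", "uk"), 50, replace = TRUE)
)
east <- data.frame(
  spend   = sample(50:100, 50, replace = TRUE),
  trials  = sample(100:200, 50, replace = TRUE),
  country = sample(c("china", "japan", "skorea"), 50, replace = TRUE)
)

# c() treats each data frame as a list of columns and concatenates them.
flattened <- c(west, east)
length(flattened)               # 6: three columns from each data frame
is.data.frame(flattened[[1]])   # FALSE: the first element is a bare column

# list() keeps each data frame whole.
combined <- list(east, west)
length(combined)                # 2
sapply(combined, is.data.frame) # TRUE TRUE
```

With the flattened version, any function applied per element receives a plain vector rather than a data frame, which is why a per-data-frame summary cannot work on it.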

lapply(combined, ddply, .(country), summarise, spend  = sum(spend),
                                               trials = sum(trials))

# [[1]]
#   country spend trials
# 1   china  1572   2976
# 2   japan  1075   1989
# 3  skorea  1262   2526
# 
# [[2]]
#   country spend trials
# 1  canada  1459   3117
# 2      uk   910   1967
# 3     usa  1248   2660
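
If you would rather stay with llply, as in the question, the same idea should carry over: llply applies a function to each element of a list, so one option is to wrap the ddply call in a small function. This is a sketch rather than part of the original answer; the seed and the helper name summarise_by_country are assumptions made for illustration.

```r
library(plyr)

set.seed(42)  # assumed seed, only so the sketch is reproducible

west <- data.frame(
  spend   = sample(50:100, 50, replace = TRUE),
  trials  = sample(100:200, 50, replace = TRUE),
  country = sample(c("usa", "canada", "uk"), 50, replace = TRUE)
)
east <- data.frame(
  spend   = sample(50:100, 50, replace = TRUE),
  trials  = sample(100:200, 50, replace = TRUE),
  country = sample(c("china", "japan", "skorea"), 50, replace = TRUE)
)

# A list of data frames, not c(west, east).
combined <- list(west = west, east = east)

# Hypothetical helper: the per-data-frame summary from the question.
summarise_by_country <- function(df) {
  ddply(df, .(country), summarise,
        spend  = sum(spend),
        trials = sum(trials))
}

# llply calls the helper once per data frame and returns a list.
countries.list <- llply(combined, summarise_by_country)
str(countries.list, max.level = 1)
```

Wrapping the call in a named function keeps the grouping and summary arguments out of the llply call itself, so llply only has to deal with the list and the function.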

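I would recommend dplyr, the successor to plyr. Its syntax is, IMHO, much easier to read than plyr's. For example (with an extra status column added, just for fun :))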

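# Name the list elements so the results can be told apart.
combined <- list(west = west, east = east)
library(dplyr)
lapply(combined, function(dat) {
  # %>% replaces %.%, the pipe used by early dplyr releases.
  dat %>%
    group_by(country) %>%
    summarise(
      trials = sum(trials),
      spend = sum(spend)
    ) %>%
    # Extra column for illustration: flag each country by total trials.
    mutate(
      status = ifelse(trials < 1000, "Good", "Bad")
    )
})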

Edit: for completeness, here is the same thing in data.table. Note that both dplyr and data.table are considerably faster than plyr :)

library(data.table)
lapply(combined, function(dat) {
  # Sum trials and spend per country, then add the status column by reference.
  as.data.table(dat)[
    , list(trials = sum(trials), spend = sum(spend)), by = country][
    , status := ifelse(trials < 1000, "Good", "Bad")]
})

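Edit 2: dplyr (in its early releases, before chain() was retired in favour of %>%) also lets you skip the anonymous function: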

lapply(combined, chain, group_by(country),
  summarise(trials = sum(trials), spend = sum(spend)),
  mutate(status = ifelse(trials < 1000, "Good", "Bad"))
)
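
On current dplyr, where chain() is no longer available, a similar effect can be had with a magrittr functional sequence: starting a pipeline with the `.` placeholder produces a function instead of a value, and that function can be handed straight to lapply. This is a sketch and not part of the original answer; the seed and the name summarise_region are assumptions.

```r
library(dplyr)

set.seed(42)  # assumed seed, only so the sketch is reproducible

west <- data.frame(
  spend   = sample(50:100, 50, replace = TRUE),
  trials  = sample(100:200, 50, replace = TRUE),
  country = sample(c("usa", "canada", "uk"), 50, replace = TRUE)
)
east <- data.frame(
  spend   = sample(50:100, 50, replace = TRUE),
  trials  = sample(100:200, 50, replace = TRUE),
  country = sample(c("china", "japan", "skorea"), 50, replace = TRUE)
)
combined <- list(west = west, east = east)

# A pipeline that starts with `.` is a function of one argument
# (hypothetical name: summarise_region).
summarise_region <- . %>%
  group_by(country) %>%
  summarise(trials = sum(trials), spend = sum(spend)) %>%
  mutate(status = ifelse(trials < 1000, "Good", "Bad"))

# Apply the same pipeline to every data frame in the list.
result <- lapply(combined, summarise_region)
result$west
```

Defining the pipeline once and reusing it keeps the lapply call as short as the chain() version while relying only on `%>%`.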

Source: https://habr.com/ru/post/1523735/

