Combining list items according to a common data frame value

The following question, although it uses a specific example, seems like a generic application, so I believe it warrants its own thread:

General question: how do I take the elements of a list that correspond to a value in the original data frame and combine them according to that value, especially when the list elements have different lengths?

In this example, I have a data frame with two segments, each sorted by date. What I ultimately want is a data frame organized by date that contains only the relevant metric for each segment. If a segment has no data for a given date, it gets 0.

Here is some sample data:

test <- structure(list(date = structure(c(15706, 15707, 15708, 15709, 
15710, 15706, 15707, 15708), class = "Date"), segment = structure(c(1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("abc", "xyz"), class = "factor"), 
    a = c(76L, 92L, 96L, 76L, 80L, 91L, 54L, 62L), x = c(964L, 
    505L, 968L, 564L, 725L, 929L, 748L, 932L), k = c(27L, 47L, 
    36L, 40L, 33L, 46L, 30L, 36L), value = c(6872L, 5993L, 5498L, 
    5287L, 6835L, 6622L, 5736L, 7218L)), .Names = c("date", "segment", 
"a", "x", "k", "value"), row.names = c(NA, -8L), class = "data.frame")

So, for the "abc" segment, I ONLY care about (value / a) relative to my benchmark of 75, and for the "xyz" segment, I ONLY care about (k / x) relative to my benchmark of 0.04.

Ultimately, I want the data frame to look like this:

        date   abc   xyz
1 2013-01-01  0.21  0.24
2 2013-01-02 -0.13  0.00
3 2013-01-03 -0.24 -0.03
4 2013-01-04 -0.07  0.00
5 2013-01-05  0.14  0.00

Where, since "xyz" had only information for 2013-01-01 to 2013-01-03, it gets 0 for everything after.
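
The figures in the table are consistent with computing (metric / benchmark) - 1 for each segment. A minimal sketch that checks this against the sample data, rebuilt compactly here so the snippet runs on its own (the names abc_perf and xyz_perf are only illustrative):

```r
# Sample data from the question, rebuilt compactly.
test <- data.frame(
  date    = as.Date("2013-01-01") + c(0:4, 0:2),
  segment = factor(rep(c("abc", "xyz"), times = c(5, 3))),
  a       = c(76L, 92L, 96L, 76L, 80L, 91L, 54L, 62L),
  x       = c(964L, 505L, 968L, 564L, 725L, 929L, 748L, 932L),
  k       = c(27L, 47L, 36L, 40L, 33L, 46L, 30L, 36L),
  value   = c(6872L, 5993L, 5498L, 5287L, 6835L, 6622L, 5736L, 7218L)
)

abc <- test[test$segment == "abc", ]
xyz <- test[test$segment == "xyz", ]

# "abc": (value / a) relative to the benchmark of 75
abc_perf <- round((abc$value / abc$a) / 75 - 1, 2)

# "xyz": (k / x) relative to the benchmark of 0.04
xyz_perf <- round((xyz$k / xyz$x) / 0.04 - 1, 2)

print(abc_perf)  # 0.21 -0.13 -0.24 -0.07 0.14
print(xyz_perf)  # 0.24 0.00 -0.03
```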

How I got to this point:

Define the arguments to pass to mapply:

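# Note: "ametric" and "xmetric" are not columns of the sample data above;
# presumably they hold value / a and k / x respectively.
splits <- split(test, test$segment)
metrics <- c("ametric", "xmetric")
benchmarks <- c(75, 0.04)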

and a function to compute performance relative to the benchmark:

performance <- function(splits,metrics,benchmarks){
    (splits[,metrics]/benchmarks)-1
}

Pass them to mapply:

temp <- mapply(performance, splits, metrics, benchmarks)

Now the problem is that, since the splits have different lengths, the result is a list with elements of unequal length:

summary(temp)

    Length Class  Mode   
abc 5      -none- numeric
xyz 3      -none- numeric
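
For context, mapply only simplifies its result to a matrix when every call returns a vector of the same length; otherwise it returns a list, which is what the summary above shows. A small illustration with made-up inputs (same and mixed are hypothetical names, unrelated to the data in the question):

```r
# Every call returns a vector of length 3, so the result is a 3 x 2 matrix.
same <- mapply(function(n) seq_len(3), 1:2)

# The calls return vectors of length 5 and 3, so the result stays a list.
mixed <- mapply(function(n) seq_len(n), c(5, 3))

print(is.matrix(same))  # TRUE
print(is.list(mixed))   # TRUE
print(lengths(mixed))   # 5 3
```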

Is there a way to carry the dates from the original data frame into each split and then combine the results by date (with 0 where there is no data)?


Use mapply with SIMPLIFY=FALSE to cbind each result onto its split, then do.call with rbind:

> temp <- mapply(performance, splits, metrics, benchmarks)
> do.call('rbind',mapply(cbind, splits, performance=temp, SIMPLIFY=FALSE))
            date segment  a   x  k value  performance
abc.1 2013-01-01     abc 76 964 27  6872 1.333333e-02
abc.2 2013-01-02     abc 92 505 47  5993 2.266667e-01
abc.3 2013-01-03     abc 96 968 36  5498 2.800000e-01
abc.4 2013-01-04     abc 76 564 40  5287 1.333333e-02
abc.5 2013-01-05     abc 80 725 33  6835 6.666667e-02
xyz.6 2013-01-01     xyz 91 929 46  6622 2.322400e+04
xyz.7 2013-01-02     xyz 54 748 30  5736 1.869900e+04
xyz.8 2013-01-03     xyz 62 932 36  7218 2.329900e+04
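
Note that the performance column in the output above equals a / 75 - 1 and x / 0.04 - 1, so it was computed from the raw a and x columns, and the result is still in long format. One possible way to get to the wide, zero-filled table the question asks for is sketched below. It assumes that ametric and xmetric stand for value / a and k / x (they are not in the sample data, so they are derived here), and it changes performance to return the dates alongside the values; pieces, wide and metric_cols are illustrative names.

```r
# Sample data from the question, rebuilt compactly.
test <- data.frame(
  date    = as.Date("2013-01-01") + c(0:4, 0:2),
  segment = factor(rep(c("abc", "xyz"), times = c(5, 3))),
  a       = c(76L, 92L, 96L, 76L, 80L, 91L, 54L, 62L),
  x       = c(964L, 505L, 968L, 564L, 725L, 929L, 748L, 932L),
  k       = c(27L, 47L, 36L, 40L, 33L, 46L, 30L, 36L),
  value   = c(6872L, 5993L, 5498L, 5287L, 6835L, 6622L, 5736L, 7218L)
)

# Assumption: these derived columns are what the question calls
# "ametric" and "xmetric".
test$ametric <- test$value / test$a
test$xmetric <- test$k / test$x

splits     <- split(test, test$segment)
metrics    <- c("ametric", "xmetric")
benchmarks <- c(75, 0.04)

# Return the dates together with the performance so the pieces can be
# merged by date afterwards.
performance <- function(split, metric, benchmark) {
  data.frame(date = split$date, perf = split[[metric]] / benchmark - 1)
}

# SIMPLIFY = FALSE keeps one data frame per segment, named "abc" and "xyz".
pieces <- mapply(performance, splits, metrics, benchmarks, SIMPLIFY = FALSE)

# Name each performance column after its segment.
pieces <- Map(function(df, nm) setNames(df, c("date", nm)),
              pieces, names(pieces))

# Full outer join on date: dates missing from a segment become NA.
wide <- Reduce(function(l, r) merge(l, r, by = "date", all = TRUE), pieces)

# Replace the missing values with 0 and round for display.
metric_cols <- setdiff(names(wide), "date")
wide[metric_cols][is.na(wide[metric_cols])] <- 0
wide[metric_cols] <- round(wide[metric_cols], 2)

print(wide)
```

Because the join is done with Reduce over a list, the same sketch should extend to more than two segments, as long as metrics and benchmarks are given in the same order as the levels of segment.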

Source: https://habr.com/ru/post/1529569/

