Merging a data frame from a list of data frames

I have a list of data frames that looks like this:

ls[[1]]
[[1]]

 month year   oracle
    1 2004 356.0000
    2 2004 390.0000
    3 2004 394.4286
    4 2004 391.8571

ls[[2]]
[[2]]
 month year microsoft
    1 2004  339.0000
    2 2004  357.7143
    3 2004  347.1429
    4 2004  333.2857

How can I create a single data frame that looks like this:

 month year   oracle   microsoft
    1 2004 356.0000    339.0000
    2 2004 390.0000    357.7143
    3 2004 394.4286    347.1429
    4 2004 391.8571    333.2857
3 answers

We could also use Reduce

Reduce(function(...) merge(..., by = c('month', 'year')), ls)

Using @Jaap's example data: if the month and year values do not match across the data frames, use the all=TRUE argument of merge.

Reduce(function(...) merge(..., by = c('month', 'year'), all=TRUE), ls)
#     month year   oracle microsoft   google
#1     1 2004 356.0000        NA       NA
#2     2 2004 390.0000  339.0000       NA
#3     3 2004 394.4286  357.7143 390.0000
#4     4 2004 391.8571  347.1429 391.8571
#5     5 2004       NA  333.2857 357.7143
#6     6 2004       NA        NA 333.2857
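
To make it clearer what Reduce does in the calls above, here is a minimal sketch in base R. The data frames df_a, df_b and df_c and the keys vector are hypothetical names introduced only for this illustration; they are not part of the original answers.

```r
# Three small data frames sharing the key columns 'month' and 'year'
# (hypothetical illustration data, shaped like the data in the question).
df_a <- data.frame(month = 1:2, year = 2004L, oracle = c(356, 390))
df_b <- data.frame(month = 1:2, year = 2004L, microsoft = c(339, 357.7143))
df_c <- data.frame(month = 1:2, year = 2004L, google = c(390, 391.8571))

keys <- c("month", "year")

# Reduce folds the list from left to right, merging two data frames at a time.
folded <- Reduce(function(x, y) merge(x, y, by = keys), list(df_a, df_b, df_c))

# The same result, written as nested merge calls by hand.
nested <- merge(merge(df_a, df_b, by = keys), df_c, by = keys)

# Both approaches should give the same data frame.
identical(folded, nested)
```

Because merge only ever sees two data frames at a time, this pattern should work for a list of any length.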

Using the Reduce/merge code from @akrun's answer works fine if the values in the month and year columns are the same for each data frame. However, when they do not match (sample data at the end of this answer),

Reduce(function(...) merge(..., by = c('month', 'year')), ls)

will only return rows that are common to each data frame:

  month year   oracle microsoft   google
1     3 2004 394.4286  357.7143 390.0000
2     4 2004 391.8571  347.1429 391.8571

To include all rows, you can use all=TRUE (as shown by @akrun) or, as an alternative, full_join from the dplyr package:

library(dplyr)
Reduce(function(...) full_join(..., by = c('month', 'year')), ls) 
# or just:
Reduce(full_join, ls)

This gives:

  month year   oracle microsoft   google
1     1 2004 356.0000        NA       NA
2     2 2004 390.0000  339.0000       NA
3     3 2004 394.4286  357.7143 390.0000
4     4 2004 391.8571  347.1429 391.8571
5     5 2004       NA  333.2857 357.7143
6     6 2004       NA        NA 333.2857

Sample data:

ls <- list(structure(list(month = 1:4, year = c(2004L, 2004L, 2004L, 2004L), oracle = c(356, 390, 394.4286, 391.8571)), .Names = c("month", "year", "oracle"), class = "data.frame", row.names = c(NA, -4L)), 
           structure(list(month = 2:5, year = c(2004L, 2004L, 2004L, 2004L), microsoft = c(339, 357.7143, 347.1429, 333.2857)), .Names = c("month", "year", "microsoft"), class = "data.frame", row.names = c(NA,-4L)),
           structure(list(month = 3:6, year = c(2004L, 2004L, 2004L, 2004L), google = c(390, 391.8571, 357.7143, 333.2857)), .Names = c("month", "year", "google"), class = "data.frame", row.names = c(NA,-4L)))

You can also use do.call() as follows:

do.call(merge, ls)
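
One caveat, illustrated with a small sketch: merge() takes exactly two data frames (x and y), so the do.call approach assumes a two-element list, as in the question. With a longer list, the third element would be passed on as a further argument to merge rather than being merged, so Reduce is the safer choice there. The names x, y and two below are hypothetical and used only for this example.

```r
# Two data frames shaped like those in the question (hypothetical names x, y).
x <- data.frame(month = 1:4, year = 2004L,
                oracle = c(356, 390, 394.4286, 391.8571))
y <- data.frame(month = 1:4, year = 2004L,
                microsoft = c(339, 357.7143, 347.1429, 333.2857))
two <- list(x, y)

# do.call passes the list elements as positional arguments,
# so this call is equivalent to merge(x, y).
via_do_call <- do.call(merge, two)
direct <- merge(x, y)
identical(via_do_call, direct)

# Extra merge arguments can be supplied by appending named elements
# to the argument list.
via_do_call_all <- do.call(merge, c(two, list(by = c("month", "year"), all = TRUE)))
```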


Source: https://habr.com/ru/post/1612354/

