R: combining three data frames without forming a Cartesian product

I have the following data frames a, b and c:

Year<-rep(c("2002","2003"),1)
Crop<-c("TTT","RRR")
a<-data.frame(Year,Crop)

Year<-rep(c("2002","2003"),2)
ProductB<-c("A","A","B","B")
b<-data.frame(Year,ProductB)

Year<-rep(c("2002","2003"),3)
Location<-c("XX","XX","YY","YY","ZZ","ZZ")
c<-data.frame(Year,Location)

and want to combine them. When I use merge, I get a Cartesian product, which is not what I want.

d<-merge(a,b,by="Year")
e<-merge(d,c,by="Year")

I would like the dataframe to look like

Year   Crop   ProductB   Location
2002   TTT    A          XX
2002   NA     B          YY
2002   NA     NA         ZZ
2003   RRR    A          XX
2003   NA     B          YY
2003   NA     NA         ZZ

Is it possible? Thank you for your help.

2 answers

Here is one way, using data.table.

library(data.table) ## 1.9.2
# (1) add a per-Year row counter GRP to each table
setDT(a)[, GRP := seq_len(.N), by=Year]
setDT(b)[, GRP := seq_len(.N), by=Year]
setDT(c)[, GRP := seq_len(.N), by=Year]
# (2) full outer join on Year and GRP
merge(a, merge(b, c, by=c("Year", "GRP"), all=TRUE),
      by=c("Year", "GRP"), all=TRUE)

#    Year GRP Crop ProductB Location
# 1: 2002   1  TTT        A       XX
# 2: 2002   2   NA        B       YY
# 3: 2002   3   NA       NA       ZZ
# 4: 2003   1  RRR        A       XX
# 5: 2003   2   NA        B       YY
# 6: 2003   3   NA       NA       ZZ
  • (1) - setDT converts a data.frame to a data.table, and then we create a new column GRP, a row counter within each Year. This gives us unique Year, GRP combinations.
  • (2) - we merge on the two columns Year and GRP.

.N is a special data.table variable that holds the number of rows in each group.
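
To make the role of .N concrete, here is a small sketch on a copy of the b table from the question. It assumes the data.table package is installed; the table is rebuilt here only so the snippet is self-contained.

```r
library(data.table)

# Same contents as b in the question, built directly as a data.table
b <- data.table(Year = rep(c("2002", "2003"), 2),
                ProductB = c("A", "A", "B", "B"))

# .N evaluated per group: the number of rows for each Year
b[, .N, by = Year]

# seq_len(.N) numbers the rows within each Year: 1, 2, ...
b[, GRP := seq_len(.N), by = Year]
b$GRP  # 1 1 2 2
```

Because GRP restarts at 1 for every Year, the pair (Year, GRP) identifies one row per table, which is what lets the merge avoid multiplying rows.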


A few comments in addition to Arun's answer.

Note that merging on Year does not produce a full Cartesian product. To get one, use by = NULL; compare:

merge(a, b, by = "Year")
merge(a, b, by = NULL)
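
The difference between the two calls shows up in the row counts. A minimal sketch using the a and b tables from the question (rebuilt here so it runs on its own):

```r
a <- data.frame(Year = c("2002", "2003"), Crop = c("TTT", "RRR"))
b <- data.frame(Year = rep(c("2002", "2003"), 2),
                ProductB = c("A", "A", "B", "B"))

by_year <- merge(a, b, by = "Year")  # rows paired only within the same Year
cross   <- merge(a, b, by = NULL)    # every row of a paired with every row of b

nrow(by_year)  # 4: one row of a times two rows of b, for each Year
nrow(cross)    # 8: two rows of a times four rows of b
names(cross)   # Year appears twice, suffixed .x and .y
```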

Also, the rule behind the desired output is unclear. Why should TTT be paired with A and XX, and NA with ZZ? Why does B go with YY rather than ZZ?

EDIT: Here is the same approach as in Arun's answer, using base merge instead of data.table.

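# Row counter for each table; the counters line up across tables here
# because every table repeats the years in the same order
a$Grp <- seq_len(nrow(a))
b$Grp <- seq_len(nrow(b))
c$Grp <- seq_len(nrow(c))

# Full outer joins on Year and Grp
d <- merge(a, b, by = c("Year", "Grp"), all = TRUE)
e <- merge(d, c, by = c("Year", "Grp"), all = TRUE)
e[,-2]  # drop the Grp column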
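
A plain row number only works when every table lists the years in the same repeating order, as the sample data does. A minimal base R sketch, assuming what is wanted is a counter within each Year regardless of row order: the helpers add_grp and merge_grp, and the use of ave and Reduce, are illustrative additions rather than part of either answer.

```r
# Sample data from the question
a <- data.frame(Year = c("2002", "2003"), Crop = c("TTT", "RRR"))
b <- data.frame(Year = rep(c("2002", "2003"), 2),
                ProductB = c("A", "A", "B", "B"))
c <- data.frame(Year = rep(c("2002", "2003"), 3),
                Location = c("XX", "XX", "YY", "YY", "ZZ", "ZZ"))

# Hypothetical helper: number the rows within each Year (1, 2, ...)
add_grp <- function(df) {
  df$Grp <- ave(seq_len(nrow(df)), df$Year, FUN = seq_along)
  df
}

# Hypothetical helper: full outer join on Year and the per-Year counter
merge_grp <- function(x, y) merge(x, y, by = c("Year", "Grp"), all = TRUE)

# Apply the join pairwise across all three tables
res <- Reduce(merge_grp, lapply(list(a, b, c), add_grp))

# Drop the helper column
res$Grp <- NULL
res
```

Reduce keeps the code the same length if more tables are added to the list, at the cost of being a little less explicit than two merge calls.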

Source: https://habr.com/ru/post/1542234/

