Count unique column values in pairs by combinations of another column, grouped by a third column, in R

Question


Quite a challenge, to be honest. This is basically an extension of a question I asked earlier: "Count unique column values in pairs by combinations of another column in R".

Let's say this time I have the following data frame in R:

data.frame(Reg.ID = c(1,1,2,2,2,3,3), Location = c("X","X","Y","Y","Y","X","X"), Product = c("A","B","A","B","C","B","A"))

The data is as follows:

      Reg.ID Location Product
    1      1        X       A
    2      1        X       B
    3      2        Y       A
    4      2        Y       B
    5      2        Y       C
    6      3        X       B
    7      3        X       A

I would like to count the unique values of the Reg.ID column by pairwise combinations of the values in the Product column, grouped by the Location column. The result should look like this:

      Location Prod.Comb Count
    1        X       A,B     2
    2        Y       A,B     1
    3        Y       A,C     1
    4        Y       B,C     1

I tried to get this result using base R functions, but without success. I assume there is a fairly simple solution using the data.table package?

Any help would be greatly appreciated. Thanks!

+5

r dataframe data.table

sharmanas Feb 13 '17 at 10:33


2 answers

A dplyr solution, adapted from the answers to the question you mentioned:

    library(dplyr)

    df <- data.frame(Reg.ID = c(1,1,2,2,2,3,3),
                     Location = c("X","X","Y","Y","Y","X","X"),
                     Product = c("A","B","A","B","C","B","A"),
                     stringsAsFactors = FALSE)

    df %>%
      # self-join: pair every row with every other row at the same Location
      full_join(df, by = "Location") %>%
      # keep each unordered product pair once (A,B but not B,A or A,A)
      filter(Product.x < Product.y) %>%
      group_by(Location, Product.x, Product.y) %>%
      summarise(Count = length(unique(Reg.ID.x))) %>%
      mutate(Prod.Comb = paste(Product.x, Product.y, sep = ",")) %>%
      ungroup %>%
      select(Location, Prod.Comb, Count) %>%
      arrange(Location, Prod.Comb)

    # # A tibble: 4 × 3
    #   Location Prod.Comb Count
    #      <chr>     <chr> <int>
    # 1        X       A,B     2
    # 2        Y       A,B     1
    # 3        Y       A,C     1
    # 4        Y       B,C     1

+2

Scarabee Feb 13 '17 at 23:48


Bulat · Accepted Answer · 2017-02-13T23:08:18+0000

Not a thoroughly tested idea, but this is the first thing that comes to mind with data.table:

    library(data.table)

    dt <- data.table(Reg.ID = c(1,1,2,2,2,3,3),
                     Location = c("X","X","Y","Y","Y","X","X"),
                     Product = c("A","B","A","B","C","B","A"))

    # self-join on Location; allow.cartesian permits the many-to-many result
    dt.cj <- merge(dt, dt, by = "Location", all = TRUE, allow.cartesian = TRUE)

    # keep each unordered product pair once, then count distinct Reg.ID per group
    dt.res <- dt.cj[Product.x < Product.y,
                    .(cnt = length(unique(Reg.ID.x))),
                    by = .(Location, Product.x, Product.y)]
    dt.res

    #    Location Product.x Product.y cnt
    # 1:        X         A         B   2
    # 2:        Y         A         B   1
    # 3:        Y         A         C   1
    # 4:        Y         B         C   1
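Since the question mentions trying base R, here is a minimal sketch of the same self-join idea using only merge() and aggregate(). It makes one assumption that differs from the two answers above: it joins on both Reg.ID and Location, so a product pair is counted only when a single Reg.ID contains both products. On the sample data this gives the same result as joining on Location alone. The names pairs and res are just illustrative.

```r
df <- data.frame(Reg.ID = c(1,1,2,2,2,3,3),
                 Location = c("X","X","Y","Y","Y","X","X"),
                 Product = c("A","B","A","B","C","B","A"),
                 stringsAsFactors = FALSE)

# Assumption: pair products within the same Reg.ID and Location.
# Product appears in both inputs, so merge() names the copies Product.x / Product.y.
pairs <- merge(df, df, by = c("Reg.ID", "Location"))

# Keep each unordered pair once (A,B but not B,A or A,A).
pairs <- pairs[pairs$Product.x < pairs$Product.y, ]
pairs$Prod.Comb <- paste(pairs$Product.x, pairs$Product.y, sep = ",")

# Count distinct Reg.ID values per Location and product pair.
res <- aggregate(Reg.ID ~ Location + Prod.Comb, data = pairs,
                 FUN = function(x) length(unique(x)))
names(res)[names(res) == "Reg.ID"] <- "Count"

# Sort for a stable, readable layout.
res <- res[order(res$Location, res$Prod.Comb), ]
rownames(res) <- NULL
res
```

If pairs should instead be formed across different Reg.ID values at the same Location, as the answers above do, merging with by = "Location" and counting Reg.ID.x would be the corresponding change.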
