A unique combination of all elements from two (or more) vectors

Question


I am trying to create a unique combination of all elements from two vectors of different sizes in R.

For example, the first vector

> a <- c("ABC", "DEF", "GHI")

and the second contains dates, currently stored as strings:

 > b <- c("2012-05-01", "2012-05-02", "2012-05-03", "2012-05-04", "2012-05-05")

I need to create a data frame with two columns like this

 > data
      a          b
 1  ABC 2012-05-01
 2  ABC 2012-05-02
 3  ABC 2012-05-03
 4  ABC 2012-05-04
 5  ABC 2012-05-05
 6  DEF 2012-05-01
 7  DEF 2012-05-02
 8  DEF 2012-05-03
 9  DEF 2012-05-04
 10 DEF 2012-05-05
 11 GHI 2012-05-01
 12 GHI 2012-05-02
 13 GHI 2012-05-03
 14 GHI 2012-05-04
 15 GHI 2012-05-05

So, basically, I'm looking for a unique combination, considering all the elements of one vector (a), compared with all the elements of the second vector (b).

An ideal solution will generalize to a larger number of input vectors.

See also:
How to create a combination matrix


r r-faq

Godel Jul 09 '12 at 2:10


4 answers

shhhhimhuntingrabbits · Answer 1 · 2012-07-09 02:13

Is it possible that this is what you are after?

 > expand.grid(a,b)
    Var1       Var2
 1   ABC 2012-05-01
 2   DEF 2012-05-01
 3   GHI 2012-05-01
 4   ABC 2012-05-02
 5   DEF 2012-05-02
 6   GHI 2012-05-02
 7   ABC 2012-05-03
 8   DEF 2012-05-03
 9   GHI 2012-05-03
 10  ABC 2012-05-04
 11  DEF 2012-05-04
 12  GHI 2012-05-04
 13  ABC 2012-05-05
 14  DEF 2012-05-05
 15  GHI 2012-05-05

If the resulting order is not what you want, you can sort afterwards. If you name the arguments to expand.grid, those names become the column names:

 df <- expand.grid(a = a, b = b)
 df[order(df$a), ]

And expand.grid generalizes to any number of input columns.
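As a minimal sketch of that generalization, the snippet below crosses three vectors. The third vector, grp, is a hypothetical addition for illustration and is not part of the question; the result has one row per combination, so its row count is the product of the input lengths.

```r
a <- c("ABC", "DEF", "GHI")
b <- c("2012-05-01", "2012-05-02", "2012-05-03", "2012-05-04", "2012-05-05")
grp <- c("x", "y")  # hypothetical third vector, not from the question

# One named argument per input vector; the names become the column names.
df3 <- expand.grid(a = a, b = b, grp = grp)

# Number of rows is the product of the input lengths: 3 * 5 * 2
nrow(df3)
```

Note that the first argument varies fastest in the result, which is why a sort is needed to get the layout asked for in the question.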

hypothesis · Answer 2 · 2018-06-20 21:37

The tidyr package provides a nice alternative, crossing, which works better than the classic expand.grid function because (1) strings are not converted to factors and (2) the sorting is more intuitive:

 library(tidyr)
 a <- c("ABC", "DEF", "GHI")
 b <- c("2012-05-01", "2012-05-02", "2012-05-03", "2012-05-04", "2012-05-05")
 crossing(a, b)
 # A tibble: 15 x 2
    a     b
    <chr> <chr>
  1 ABC   2012-05-01
  2 ABC   2012-05-02
  3 ABC   2012-05-03
  4 ABC   2012-05-04
  5 ABC   2012-05-05
  6 DEF   2012-05-01
  7 DEF   2012-05-02
  8 DEF   2012-05-03
  9 DEF   2012-05-04
 10 DEF   2012-05-05
 11 GHI   2012-05-01
 12 GHI   2012-05-02
 13 GHI   2012-05-03
 14 GHI   2012-05-04
 15 GHI   2012-05-05
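For comparison, the factor conversion mentioned above can be seen, and switched off, in base R alone. This is only a sketch of the expand.grid side of that point, using its stringsAsFactors argument; it does not reproduce crossing itself.

```r
a <- c("ABC", "DEF", "GHI")
b <- c("2012-05-01", "2012-05-02", "2012-05-03", "2012-05-04", "2012-05-05")

# Default behaviour: character inputs come back as factor columns.
g_default <- expand.grid(a = a, b = b)
class(g_default$a)

# Keeping the inputs as character vectors instead.
g_chr <- expand.grid(a = a, b = b, stringsAsFactors = FALSE)
class(g_chr$a)
```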

izan · Answer 3 · 2018-06-03 18:32

You can use the order function to sort by any number of columns. For your example:

 > df <- expand.grid(a,b)
 > df
    Var1       Var2
 1   ABC 2012-05-01
 2   DEF 2012-05-01
 3   GHI 2012-05-01
 4   ABC 2012-05-02
 5   DEF 2012-05-02
 6   GHI 2012-05-02
 7   ABC 2012-05-03
 8   DEF 2012-05-03
 9   GHI 2012-05-03
 10  ABC 2012-05-04
 11  DEF 2012-05-04
 12  GHI 2012-05-04
 13  ABC 2012-05-05
 14  DEF 2012-05-05
 15  GHI 2012-05-05
 > df[order( df[,1], df[,2] ),]
    Var1       Var2
 1   ABC 2012-05-01
 4   ABC 2012-05-02
 7   ABC 2012-05-03
 10  ABC 2012-05-04
 13  ABC 2012-05-05
 2   DEF 2012-05-01
 5   DEF 2012-05-02
 8   DEF 2012-05-03
 11  DEF 2012-05-04
 14  DEF 2012-05-05
 3   GHI 2012-05-01
 6   GHI 2012-05-02
 9   GHI 2012-05-03
 12  GHI 2012-05-04
 15  GHI 2012-05-05
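When the number of columns is not known in advance, listing each one in the call to order becomes awkward. One possible approach, sketched here rather than taken from the answer above, is to pass all columns to order through do.call; the name sorted is just an illustrative choice.

```r
a <- c("ABC", "DEF", "GHI")
b <- c("2012-05-01", "2012-05-02", "2012-05-03", "2012-05-04", "2012-05-05")

df <- expand.grid(a, b)

# Pass every column to order() as a sort key, leftmost column first.
# unname() keeps column names from clashing with order()'s own arguments.
sorted <- df[do.call(order, unname(as.list(df))), ]

# Renumber the rows 1..n after sorting.
rownames(sorted) <- NULL
sorted
```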

Jaap · Answer 4 · 2019-01-29 08:50

The CJ function from the data.table package is missing from this overview. Using:

 library(data.table)
 CJ(a = a, b = b, unique = TRUE)

gives:

       a          b
  1: ABC 2012-05-01
  2: ABC 2012-05-02
  3: ABC 2012-05-03
  4: ABC 2012-05-04
  5: ABC 2012-05-05
  6: DEF 2012-05-01
  7: DEF 2012-05-02
  8: DEF 2012-05-03
  9: DEF 2012-05-04
 10: DEF 2012-05-05
 11: GHI 2012-05-01
 12: GHI 2012-05-02
 13: GHI 2012-05-03
 14: GHI 2012-05-04
 15: GHI 2012-05-05

In recent versions of data.table, CJ names the resulting columns automatically, so you can simply use: CJ(a, b, unique = TRUE)
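To illustrate why de-duplication matters when the inputs contain repeated values, here is a base R sketch that uses expand.grid rather than CJ. The vectors a_dup and b_dup are hypothetical inputs made up for this example.

```r
# Hypothetical inputs with repeated values (not from the question).
a_dup <- c("ABC", "ABC", "DEF")
b_dup <- c("2012-05-01", "2012-05-02", "2012-05-02")

# Crossing the raw vectors repeats combinations.
raw <- expand.grid(a = a_dup, b = b_dup, stringsAsFactors = FALSE)
nrow(raw)                # 3 * 3 rows
anyDuplicated(raw) > 0   # repeated combinations are present

# De-duplicating each input first yields each combination exactly once.
dedup <- expand.grid(a = unique(a_dup), b = unique(b_dup),
                     stringsAsFactors = FALSE)
nrow(dedup)              # 2 * 2 rows
```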
