Getting all possible two-column subsets

I am a relative newbie to R, and I am now very close to finishing a rather long script, thanks to everyone who has helped me at different stages. There is one more thing I'm stuck on. I have simplified the problem to this:

    Dataset1
    ax ay
    1  3
    2  4

    Dataset2
    bx by
    5  7
    6  8

    A <- dataset1
    B <- dataset2
    a <- 2 #number of columns
    b <- 1:2

(My data sets will vary in the number of columns, so I need to be able to change this value.)

I need the following result, in any order and in this or an equivalent form: all possible combinations of two columns, one from each of the two datasets.

    [[1]]
    1 5
    2 6

    [[2]]
    1 7
    2 8

    [[3]]
    3 5
    4 6

    [[4]]
    3 7
    4 8

But I can't work out how to get it. I tried a bunch of things, and the closest I got to what I want was this:

    i <- 1
    for( i in 1:a ) {
      e <- lapply(B, function(x) as.data.frame(cbind(A, x)))
      print(e)
      i <- i+1
    }

Close, yes. I could take that result and hack it into shape with some subsetting, but that isn't right, and there should be an easy way to do this. I have not seen anything like it in my searches. Any help is greatly appreciated.

+4
3 answers

I think the easiest way to do this is very similar to what you tried, using two explicit loops. However, there are a few things I would do differently:

  • Preallocate space for the list
  • Use an explicit counter
  • Use drop = FALSE so that a single selected column stays a data frame

Then you can do the following.

    A <- read.table(text = "ax ay
    1 3
    2 4", header = TRUE)

    B <- read.table(text = "bx by
    5 7
    6 8", header = TRUE)

    out <- vector("list", length = ncol(A) * ncol(B))
    counter <- 1
    for (i in 1:ncol(A)) {
      for (j in 1:ncol(B)) {
        out[[counter]] <- cbind(A[,i, drop = FALSE], B[,j, drop = FALSE])
        counter <- counter + 1
      }
    }
    out
    ## [[1]]
    ##   ax bx
    ## 1  1  5
    ## 2  2  6
    ##
    ## [[2]]
    ##   ax by
    ## 1  1  7
    ## 2  2  8
    ##
    ## [[3]]
    ##   ay bx
    ## 1  3  5
    ## 2  4  6
    ##
    ## [[4]]
    ##   ay by
    ## 1  3  7
    ## 2  4  8
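Since the number of columns varies between data sets, the same two loops can be wrapped in a function. This is only a minimal sketch: the helper name `pair_columns` is hypothetical and does not come from the answer, and `seq_len()` is swapped in for `1:ncol()` so that a data frame with zero columns does not trip the loop.

```r
# Hypothetical helper (not from the original answer):
# pairs every column of A with every column of B.
pair_columns <- function(A, B) {
  out <- vector("list", length = ncol(A) * ncol(B))
  counter <- 1
  # seq_len() yields an empty sequence when there are no columns
  for (i in seq_len(ncol(A))) {
    for (j in seq_len(ncol(B))) {
      # drop = FALSE keeps each single column as a data frame
      out[[counter]] <- cbind(A[, i, drop = FALSE], B[, j, drop = FALSE])
      counter <- counter + 1
    }
  }
  out
}

A <- data.frame(ax = 1:2, ay = 3:4)
B <- data.frame(bx = 5:6, by = 7:8)
res <- pair_columns(A, B)
length(res)      # one element per (A column, B column) pair
names(res[[2]])  # columns in the second pair
```

With the two-column example data, `res` should hold four data frames in the same order as the loop output above.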
+1

Does something like this work for you?

    Dataset1 <- data.frame(ax=1:2,ay=3:4)
    Dataset2 <- data.frame(bx=5:6,by=7:8)

    apply(
      expand.grid(seq_along(Dataset1),seq_along(Dataset2)),
      1,
      function(x) cbind(Dataset1[x[1]],Dataset2[x[2]])
    )

Result:

    [[1]]
      ax bx
    1  1  5
    2  2  6

    [[2]]
      ay bx
    1  3  5
    2  4  6

    [[3]]
      ax by
    1  1  7
    2  2  8

    [[4]]
      ay by
    1  3  7
    2  4  8
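The order of this result follows from how `expand.grid` enumerates the index pairs: its first argument varies fastest. The sketch below prints the grid so the pairing is visible, then builds the same list with `Map`. The `Map` version is a variation of mine, not the answerer's code, and the names `idx`, `i`, `j` and `pairs` are illustrative.

```r
Dataset1 <- data.frame(ax = 1:2, ay = 3:4)
Dataset2 <- data.frame(bx = 5:6, by = 7:8)

# One row per (column of Dataset1, column of Dataset2) pair;
# the first index cycles fastest
idx <- expand.grid(i = seq_along(Dataset1), j = seq_along(Dataset2))
idx

# Map() walks over idx$i and idx$j in parallel
pairs <- Map(function(i, j) cbind(Dataset1[i], Dataset2[j]), idx$i, idx$j)

# Summarise which columns ended up in each pair
sapply(pairs, function(p) paste(names(p), collapse = "-"))
```

Iterating over the two index columns directly avoids passing through `apply`, which first converts the grid to a matrix.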
+6

If I understand the question, I think you can use combn to select the columns you need. For example, if you want all combinations of 8 columns taken 2 at a time, you can do:

 combn(1:8, 2) 

Which gives (shown partially, for readability):

    combn(1:8,2)[,c(1:5, 15:18)]
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    [1,]    1    1    1    1    1    3    3    3    3
    [2,]    2    3    4    5    6    5    6    7    8

The columns of this matrix can then be used as the column indexes you need.
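To relate this to the question's example, here is a sketch that assumes the two datasets are first combined with `cbind` (my assumption; the answer does not say how the columns are laid out). `combn` also returns pairs taken from within the same dataset, so those are filtered out, leaving only pairs with one column from each.

```r
A <- data.frame(ax = 1:2, ay = 3:4)
B <- data.frame(bx = 5:6, by = 7:8)
AB <- cbind(A, B)

# All pairs of column indexes of the combined data frame
idx <- combn(seq_along(AB), 2)

# Keep only pairs whose first index falls in A and second in B
# (combn always returns each pair in increasing order)
cross <- idx[, idx[1, ] <= ncol(A) & idx[2, ] > ncol(A), drop = FALSE]

# Each remaining column of `cross` selects one two-column subset
out <- lapply(seq_len(ncol(cross)), function(k) AB[, cross[, k]])
length(out)
```

For the 2 x 2 example, `combn` yields six index pairs, of which four cross the two datasets.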

+1

Source: https://habr.com/ru/post/1484534/

