Removing duplicates and small vectors from the list

Question

I have a list of vectors, say:

li <- list( c(1, 2, 3), c(1, 2, 3, 4), c(2, 3, 4), c(5, 6, 7, 8, 9, 10, 11, 12), numeric(0), c(5, 6, 7, 8, 9, 10, 11, 12, 13) )

I would like to remove every vector that is already contained in another vector of greater or equal length, as well as all empty vectors.

In this case, I would be left with only the list

 [[1]]
 [1] 1 2 3 4

 [[2]]
 [1] 5 6 7 8 9 10 11 12 13

Is there a convenient function to achieve this?

Thank you in advance

r

Rugero Jun 16 '15 at 13:31

2 answers

x is contained in y if

 length(setdiff(x, y)) == 0

You can apply it to each pair of vectors using functions such as expand.grid or combn.
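As a rough sketch of that pairwise idea, the check can be run over every ordered pair of list indices built with expand.grid. The helper name is_contained and the tie-breaking rule for vectors with identical contents (keep the earliest one) are assumptions made for this illustration, not part of the answer above.

```r
li <- list(c(1, 2, 3), c(1, 2, 3, 4), c(2, 3, 4),
           c(5, 6, 7, 8, 9, 10, 11, 12), numeric(0),
           c(5, 6, 7, 8, 9, 10, 11, 12, 13))

# Hypothetical helper: TRUE if every element of x also appears in y
is_contained <- function(x, y) length(setdiff(x, y)) == 0

# All ordered pairs (i, j) of distinct list indices
pairs <- expand.grid(i = seq_along(li), j = seq_along(li))
pairs <- pairs[pairs$i != pairs$j, ]

# Mark i for removal if li[[i]] is contained in li[[j]] and either
# li[[j]] is strictly larger as a set, or both hold the same values and
# j comes first (assumed rule so that one copy of a duplicate survives)
pairs$drop <- mapply(function(i, j) {
  x <- li[[i]]
  y <- li[[j]]
  is_contained(x, y) && (!is_contained(y, x) || j < i)
}, pairs$i, pairs$j)

keep <- setdiff(seq_along(li), unique(pairs$i[pairs$drop]))
keep <- keep[lengths(li[keep]) > 0]  # also discard any remaining empty vector
result <- li[keep]
result  # for this input: the vectors 1 2 3 4 and 5 6 ... 13
```

This compares every pair, so the work grows quadratically with the number of vectors; that is fine for short lists but may be slow for long ones.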

Michele usuelli Jun 16 '15 at 13:34

bgoldst · Accepted Answer · 2015-06-16T14:22:32+0000

First, sort the list by vector length. In the removal loop, every vector at a lower index is then guaranteed to be no longer than every vector at a higher index, so a one-directional setdiff() is sufficient.

 l <- list(1:3, 1:4, 2:4, 5:12, double(), 5:13)
 ls <- l[order(sapply(l, length))]  # shortest vectors first
 i <- 1
 while (i <= length(ls) - 1) {
   # remove ls[[i]] if it is empty or contained in any later vector
   if (length(ls[[i]]) == 0 ||
       any(sapply((i + 1):length(ls),
                  function(i2) length(setdiff(ls[[i]], ls[[i2]]))) == 0)) {
     ls[[i]] <- NULL
   } else {
     i <- i + 1
   }
 }
 ls
 ## [[1]]
 ## [1] 1 2 3 4
 ##
 ## [[2]]
 ## [1] 5 6 7 8 9 10 11 12 13

Here is a slight variation that replaces any(sapply(...)) with a second while loop. The advantage is that the inner loop can exit early, as soon as it finds a superset in the rest of the list.

 l <- list(1:3, 1:4, 2:4, 5:12, double(), 5:13)
 ls <- l[order(sapply(l, length))]  # shortest vectors first
 i <- 1
 while (i <= length(ls) - 1) {
   if (length(ls[[i]]) == 0 || {
     # scan the rest of the list, stopping at the first superset found
     j <- i + 1
     res <- FALSE
     while (j <= length(ls)) {
       if (length(setdiff(ls[[i]], ls[[j]])) == 0) {
         res <- TRUE
         break
       } else {
         j <- j + 1
       }
     }
     res
   }) {
     ls[[i]] <- NULL
   } else {
     i <- i + 1
   }
 }
 ls
 ## [[1]]
 ## [1] 1 2 3 4
 ##
 ## [[2]]
 ## [1] 5 6 7 8 9 10 11 12 13
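For reuse, the same sort-then-scan idea could be wrapped in a function. The following is only a sketch: the name remove_contained is hypothetical, empty vectors are dropped up front rather than inside the loop, and, like the code above, it assumes no vector contains repeated values (otherwise a longer vector could hide inside a shorter one and go undetected).

```r
# Hypothetical helper, not part of base R or of the answer above
remove_contained <- function(l) {
  l <- l[lengths(l) > 0]       # drop empty vectors first
  l <- l[order(lengths(l))]    # shortest vectors first
  keep <- rep(TRUE, length(l))
  for (i in seq_along(l)) {
    j <- i + 1
    # look for a superset among the later (equal or longer) vectors,
    # stopping at the first one found
    while (j <= length(l)) {
      if (length(setdiff(l[[i]], l[[j]])) == 0) {
        keep[i] <- FALSE
        break
      }
      j <- j + 1
    }
  }
  l[keep]
}

res <- remove_contained(list(1:3, 1:4, 2:4, 5:12, double(), 5:13))
res  # for this input: the vectors 1 2 3 4 and 5 6 ... 13
```

Marking elements in a logical vector instead of assigning NULL inside the loop keeps the indices stable while scanning, and filtering empty vectors first also covers a list that consists only of empty vectors.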
