Search for duplicates in a list, including permutations

Question


I would like to determine if the list contains any duplicate elements, considering permutations as equivalent. All vectors have the same length.

What is the most efficient way (shortest time) to accomplish this?

    ## SAMPLE DATA
    a <- c(1, 2, 3)
    b <- c(4, 5, 6)
    a.same <- c(3, 1, 2)

    ## BOTH OF THESE LISTS SHOULD BE FLAGGED AS HAVING DUPLICATES
    myList1 <- list(a, b, a)
    myList2 <- list(a, b, a.same)

    # CHECK FOR DUPLICATES
    anyDuplicated(myList1) > 0  # TRUE
    anyDuplicated(myList2) > 0  # FALSE, but would like TRUE.

At the moment, I resort to sorting each member of the list before checking for duplicates:

    anyDuplicated(lapply(myList2, sort)) > 0

I am wondering if there is a more efficient alternative. Additionally, the ?duplicated documentation states that "Using this for lists is potentially slow." Are there other functions better suited to lists?

+4

list r duplicates

Ricardo Saporta Nov 11 '12 at 19:26


3 answers

You can use setequal:

    myList1 <- list(a, b, a)
    myList2 <- list(a, b, a.same)
    myList3 <- list(a, b)

    test1 <- function(mylist) anyDuplicated(lapply(mylist, sort)) > 0

    test1(myList1)
    #[1] TRUE
    test1(myList2)
    #[1] TRUE
    test1(myList3)
    #[1] FALSE

    test2 <- function(mylist) any(combn(length(mylist), 2,
      FUN = function(x) setequal(mylist[[x[1]]], mylist[[x[2]]])))

    test2(myList1)
    #[1] TRUE
    test2(myList2)
    #[1] TRUE
    test2(myList3)
    #[1] FALSE

    library(microbenchmark)
    microbenchmark(test1(myList2), test2(myList2))
    #Unit: microseconds
    #            expr     min       lq   median       uq     max
    #1 test1(myList2) 142.256 150.9235 154.6060 162.8120 247.351
    #2 test2(myList2)  63.306  70.5355  73.8955  79.5685 103.113
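One thing that may be worth checking before swapping the sort-based test for setequal: setequal compares vectors as sets, so it ignores how often a value occurs, while sorting keeps repeated values. The sketch below illustrates the difference on a small input. The helper names has_perm_dup_sort and has_perm_dup_setequal are made up for this example, and the length guard is an added assumption, since combn fails when asked for pairs from fewer than two items.

```r
# Hypothetical helper names, not part of the original answer.

# Sort-based check: two vectors match only if they are true permutations.
has_perm_dup_sort <- function(lst) anyDuplicated(lapply(lst, sort)) > 0

# Pairwise setequal check: two vectors match if they contain the same
# distinct values, regardless of how many times each value appears.
has_perm_dup_setequal <- function(lst) {
  if (length(lst) < 2) return(FALSE)  # combn() needs at least two items
  any(combn(length(lst), 2,
            FUN = function(ix) setequal(lst[[ix[1]]], lst[[ix[2]]])))
}

# Same distinct values {1, 2}, but not permutations of each other.
x <- list(c(1, 1, 2), c(1, 2, 2))

has_perm_dup_sort(x)      # FALSE
has_perm_dup_setequal(x)  # TRUE
```

If the vectors never contain repeated values, as in the sample data, both checks agree. Note also that the pairwise version makes choose(n, 2) comparisons for a list of length n, so the timing on a three-element list may not carry over to long lists.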

+1

Rolling Nov 11 '12 at 19:49


    a = [1, 2, 3]
    b = [4, 5, 6]
    samea = [3, 2, 1]

    list1 = list(a + b + a)
    list2 = list(a + b + samea)

Both of these create a flat list containing the same elements (for example, list2 is [1, 2, 3, 4, 5, 6, 3, 2, 1]), so a function that finds duplicates is enough:

    def findDup(x):
        for i in x:
            if x.count(i) > 1:
                return True
        return False

-3

raton Nov 11 '12 at 20:02


Jilber Urbina · Accepted Answer · 2012-11-11T19:54:55+0000

How about this ...?

    a <- c(1, 2, 3)
    b <- c(4, 5, 6)
    a.same <- c(3, 1, 2)

    myList1 <- list(a, b, a)
    myList2 <- list(a, b, a.same)

    # For exact duplicated values: List1
    DF1 <- do.call(rbind, myList1)     # From list to matrix (one row per vector)
    ind1 <- apply(DF1, 2, duplicated)  # logical matrix for duplicated values
    DF1[ind1]                          # finding duplicated values
    # [1] 1 2 3

    # For permutations: List2
    DF2 <- do.call(rbind, myList2)
    ind2 <- apply(apply(DF2, 1, sort), 1, duplicated)
    DF2[ind2]                          # duplicated values
    # [1] 3 1 2
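The code above marks duplicated values column by column. If the goal is a single TRUE/FALSE for whole vectors, a row-wise variant of the same matrix idea might look like the sketch below. It relies on the MARGIN argument of the matrix methods for duplicated and anyDuplicated in base R. The function name any_perm_dup_rows is hypothetical, and the sketch assumes numeric vectors of equal length, as stated in the question.

```r
# Hypothetical helper, assuming a non-empty list of equal-length numeric vectors.
any_perm_dup_rows <- function(lst) {
  n <- length(lst[[1]])
  # vapply() returns one sorted vector per column; t() puts them in rows.
  sorted_rows <- t(vapply(lst, sort, numeric(n)))
  # MARGIN = 1 compares entire rows instead of individual cells.
  anyDuplicated(sorted_rows, MARGIN = 1) > 0
}

a <- c(1, 2, 3)
b <- c(4, 5, 6)
a.same <- c(3, 1, 2)

any_perm_dup_rows(list(a, b, a))       # TRUE
any_perm_dup_rows(list(a, b, a.same))  # TRUE
any_perm_dup_rows(list(a, b))          # FALSE

# Sharing a single value is not enough to count as a duplicate row.
any_perm_dup_rows(list(c(1, 2, 3), c(1, 5, 6)))  # FALSE
```

Replacing anyDuplicated with which(duplicated(sorted_rows, MARGIN = 1)) inside the function would return the positions of the repeated vectors instead of a flag.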
