You see this problem because you are using sets as your collection type. Sets have two characteristics: they are unordered (which is not important here), and their elements are unique. So you lose the duplicates in your lists as soon as you convert them to sets, before you even take the intersection:
    >>> p = ['1', '2', '3', '3', '3', '3', '3']
    >>> set(p)
    set(['1', '2', '3'])
There are several ways to do what you want to do here, but you'll want to start by looking at the count method of a list. I would do something like this:
    >>> list1 = ['a', 'b', 'c']
    >>> list2 = ['a', 'b', 'c', 'c', 'c']
    >>> results = {}
    >>> for i in list1:
    ...     results[i] = list2.count(i)
    ...
    >>> results
    {'a': 1, 'c': 3, 'b': 1}
This approach creates a dictionary (results). For each element in list1, it counts how many times that element appears in list2 and stores the count in results under that element as the key.
Edit: As Lattyware points out, this approach answers a slightly different question from the one you asked. A basic solution to your actual question would look like this:
    >>> words = ['red', 'blue', 'yellow', 'black']
    >>> list1 = ['the', 'black', 'dog']
    >>> list2 = ['the', 'blue', 'blue', 'dog']
    >>> results1 = 0
    >>> results2 = 0
    >>> for w in words:
    ...     results1 += list1.count(w)
    ...     results2 += list2.count(w)
    ...
    >>> results1
    1
    >>> results2
    2
This works the same way as my first suggestion: it iterates through each word in the main list (here I call it words), adds the number of times that word appears in list1 to the counter results1, and adds the number of times it appears in list2 to results2.
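If you only need the totals, the same loop can be wrapped in a small helper. This is just a sketch: count_matches is a name I made up, and it assumes the entries in words are distinct (if words contained a word twice, the list.count loop would count its matches twice, while this version would not).

```python
def count_matches(words, items):
    """Return how many elements of items appear in words, duplicates included."""
    wanted = set(words)  # a set makes each membership test cheap
    return sum(1 for item in items if item in wanted)


words = ['red', 'blue', 'yellow', 'black']
print(count_matches(words, ['the', 'black', 'dog']))         # 1
print(count_matches(words, ['the', 'blue', 'blue', 'dog']))  # 2
```

Here the set is used only for lookups, so the duplicates in the list being counted are preserved.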
If you need more information than just the number of duplicates, you will want to use a dictionary or, even better, the specialized Counter type in the collections module. Counter is built to simplify everything I did in the examples above.
    >>> from collections import Counter
    >>> results3 = Counter()
    >>> for w in words:
    ...     results3[w] = list2.count(w)
    ...
    >>> results3
    Counter({'blue': 2, 'black': 0, 'yellow': 0, 'red': 0})
    >>> sum(results3.values())
    2
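A Counter can also be built straight from the list, which avoids calling list.count once per word. The sketch below reuses words and list2 from the example above; per_word and total are names I picked for illustration. It relies on the fact that looking up a key a Counter has never seen returns 0 instead of raising KeyError.

```python
from collections import Counter

words = ['red', 'blue', 'yellow', 'black']
list2 = ['the', 'blue', 'blue', 'dog']

counts = Counter(list2)                    # counts every element in one pass
per_word = {w: counts[w] for w in words}   # words absent from list2 map to 0
total = sum(per_word.values())

print(per_word)
print(total)
```

For lists this small the difference does not matter, but with long lists a single counting pass is usually cheaper than repeated list.count calls.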