Removing one list from another

Question

In Python (2.7) we can do:

    >>> a = [1, 2, 3]
    >>> b = [4, 5]
    >>> a + b
    [1, 2, 3, 4, 5]

However, we cannot do a - b.

Since Python seems to have something cool for almost everything, what, in your opinion, is the most Pythonic way to do a - b?

A similar question applies to dictionaries, which support neither a + b nor a - b, where a and b are both dictionaries. Thanks.

+6

python

dublintech Jan 29 '12 at 12:47

8 answers

Rob wouters · Answer 1 · 2012-01-29T12:52:44+0000

You can do this with sets:

    >>> s = set([1,2,3] + [4,5])
    >>> s - set([4, 5])
    {1, 2, 3}

The main difference, of course, is that a set cannot contain duplicate elements.
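
To make that caveat concrete, here is a small sketch (the variable names are illustrative only) comparing a set-based difference with a list comprehension on input that contains repeats:

```python
a = [1, 1, 2, 3, 3, 4]
b = [4]

# Set difference: repeated values in `a` collapse, and order is not guaranteed.
as_set = set(a) - set(b)

# List comprehension: repeated values and the original order are kept.
as_list = [x for x in a if x not in b]

print(sorted(as_set))  # [1, 2, 3]
print(as_list)         # [1, 1, 2, 3, 3]
```

Which of the two is "right" depends on whether repeats in a carry meaning for you.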

Rafał Rawicki · Answer 2 · 2012-01-29T12:52:51+0000

I would do:

    >>> a = [1, 2, 3]
    >>> b = [2, 3]
    >>> filter(lambda x: x not in b, a)
    [1]

or using a list comprehension:

 [x for x in a if x not in b]

The same can be done for dictionaries.
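
For dictionaries, one possible reading of a - b is "the entries of a whose keys do not appear in b". A minimal sketch under that assumption (dict_sub is a hypothetical helper name, not a built-in):

```python
def dict_sub(d1, d2):
    """Return a new dict holding the items of d1 whose keys are not in d2."""
    return {k: v for k, v in d1.items() if k not in d2}

a = {"x": 1, "y": 2, "z": 3}
b = {"y": 20}

result = dict_sub(a, b)
print(result == {"x": 1, "z": 3})  # True
```

Only keys are compared here; the values stored in b are ignored.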

Sets define the - operator as well as the difference and symmetric_difference methods. If you plan to use these operations extensively, use a set instead of a list or dict.
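
As a quick illustration of the set operations named above (the values are made up for the example):

```python
a = {1, 2, 3}
b = {3, 4}

minus = a - b                          # operator form, both operands must be sets
diff = a.difference(b)                 # method form, same result
sym = a.symmetric_difference(b)        # elements in exactly one of the two sets
from_list = a.difference([3, 4])       # the method form accepts any iterable

print(sorted(minus))      # [1, 2]
print(sorted(diff))       # [1, 2]
print(sorted(sym))        # [1, 2, 4]
print(sorted(from_list))  # [1, 2]
```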

phimuemue · Answer 3 · 2012-01-29T12:56:06+0000

I would try [x for x in a if x not in b].

NPE · Answer 4 · 2012-01-29T12:53:38+0000

The answer depends on the desired semantics of a - b.

If you only want the first elements, then slicing is the natural way to do it:

    In [11]: a = [1, 2, 3]
    In [12]: b = [4, 5]
    In [13]: ab = a + b
    In [14]: ab[:len(a)]
    Out[14]: [1, 2, 3]

If, on the other hand, you want to remove from the first list the items that are found in the second list:

    In [15]: [v for v in ab if v not in b]
    Out[15]: [1, 2, 3]

The second type of operation is more naturally expressed using sets:

    In [18]: set(ab) - set(b)
    Out[18]: set([1, 2, 3])

Note that in general this does not preserve the order of the elements (since sets are unordered). If ordering is important and b is likely to be long, converting b to a set can improve performance:

    In [19]: bset = set(b)
    In [20]: [v for v in ab if v not in bset]
    Out[20]: [1, 2, 3]

For dictionaries, an in-place “add” operation already exists. It is called dict.update().
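
A small sketch of how update() behaves: it mutates the dictionary it is called on, and values from the argument win where keys collide. Copying first gives a non-destructive "a + b". The names below are illustrative.

```python
a = {"x": 1, "y": 2}
b = {"y": 20, "z": 30}

merged = dict(a)   # shallow copy, so that `a` itself is left untouched
merged.update(b)   # values from `b` win where keys collide

print(merged == {"x": 1, "y": 20, "z": 30})  # True
print(a == {"x": 1, "y": 2})                 # True
```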

Grady · Answer 5 · 2012-01-29T12:54:28+0000

y = set(b)
aminusb = filter(lambda p: p not in y,a)

Óscar López · Answer 6 · 2012-01-29T13:17:52+0000

Try the following:

    def list_sub(lst1, lst2):
        s = set(lst2)
        return [x for x in lst1 if x not in s]

    list_sub([1, 2, 3, 1, 2, 1, 5], [1, 2])
    > [3, 5]

This solution is O(n+m) because it uses a precomputed set, so membership lookups are fast. It also preserves the order of the original elements and removes every occurrence of the values in lst2, including duplicates.
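
If instead each element of the second list should cancel only one occurrence in the first, a multiset-style variant can be sketched with collections.Counter. This is a different semantics from list_sub above, not a replacement for it, and list_sub_once is a hypothetical name. It assumes the elements are hashable.

```python
from collections import Counter

def list_sub_once(lst1, lst2):
    """Remove one occurrence from lst1 per occurrence in lst2, keeping order."""
    remaining = Counter(lst2)   # how many of each value still need removing
    result = []
    for x in lst1:
        if remaining[x] > 0:
            remaining[x] -= 1   # consume one pending removal
        else:
            result.append(x)
    return result

print(list_sub_once([1, 2, 3, 1, 2, 1, 5], [1, 2]))  # [3, 1, 2, 1, 5]
```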

hughdbrown · Answer 7 · 2012-01-29T14:37:20+0000

Order is not preserved, but this gives the desired result:

    >>> def list_diff(a, b):
    ...     return list(set(a) - set(b))
    ...
    >>> print list_diff([1, 2, 3, 1, 2, 1], [1, 2])
    [3]

hashmuke · Answer 8 · 2014-12-11T15:31:31+0000

Here are my preferred options: one uses a for loop, the other converts to a set. For small lists the for loop is acceptable, as you can see with a list of size 10:

    In [65]: d1 = range(10)
    In [66]: d2 = range(1)
    In [67]: %timeit [x for x in d1 if x not in d2]
    1000000 loops, best of 3: 827 ns per loop
    In [68]: %timeit list(set(d1)-set(d2))
    1000000 loops, best of 3: 1.25 µs per loop

However, if the list is large enough, you should probably use a set:

    In [69]: d1 = range(10000)
    In [70]: d2 = range(1000)
    In [71]: %timeit [x for x in d1 if x not in d2]
    10 loops, best of 3: 105 ms per loop
    In [72]: %timeit list(set(d1)-set(d2))
    1000 loops, best of 3: 566 µs per loop
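
The comparison can be reproduced outside IPython with the standard timeit module. The sketch below also times a third variant, a list comprehension over a precomputed set, which keeps the original order. The function names are illustrative, and absolute timings depend on the machine and Python version, so none are shown.

```python
import timeit

d1 = list(range(10000))
d2 = list(range(1000))

def with_list_lookup():
    # Each membership test scans d2, so the cost grows with len(d1) * len(d2).
    return [x for x in d1 if x not in d2]

def with_set_difference():
    # Fast, but the order of the result is not guaranteed.
    return list(set(d1) - set(d2))

def with_set_lookup():
    # Fast membership tests, and the order of d1 is preserved.
    d2_set = set(d2)
    return [x for x in d1 if x not in d2_set]

for fn in (with_list_lookup, with_set_difference, with_set_lookup):
    seconds = timeit.timeit(fn, number=5)
    print("%-20s %.4f s for 5 runs" % (fn.__name__, seconds))
```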
