Removing duplicates from a nested list based on the first 2 items

Question

I am trying to remove duplicates from a nested list, where two sublists count as duplicates only if their first 2 elements are the same, ignoring the third ...

List:

L = [['el1','el2','value1'], ['el3','el4','value2'], ['el1','el2','value2'], ['el1','el5','value3']]

Should return:

 L = [['el3','el4','value2'], ['el1','el2','value2'], ['el1','el5','value3']]

I found a simple way to do something similar here:

 dict((x[0], x) for x in L).values()

This only deduplicates on the first element, not on the first 2, but it is otherwise exactly what I want.

+4

john Oct 15 '12 at 19:50

3 answers

This should do it:

 In [55]: dict((tuple(x[:2]), x) for x in L).values()
 Out[55]: [['el1', 'el2', 'value2'], ['el1', 'el5', 'value3'], ['el3', 'el4', 'value2']]

+2

Ashwini Chaudhary Oct 15 '12 at 19:53

If order matters, use a set that tracks only the first two elements of each nested list:

 seen = set()
 seen_add = seen.add
 # seen_add() returns None, so "not seen_add(...)" is always true and just records the key
 result = [x for x in L if tuple(x[:2]) not in seen and not seen_add(tuple(x[:2]))]
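
A note on which duplicate survives: the seen-set approach keeps the first row for each key, whereas the expected output in the question keeps the last one. The sketch below wraps the idea in a helper that supports both. The name dedupe_by_prefix and its n and keep parameters are made up for illustration and are not part of any library.

```python
def dedupe_by_prefix(rows, n=2, keep="first"):
    """Drop rows whose first n items were already seen, preserving order.

    Hypothetical helper: keep="first" keeps the earliest row per key,
    keep="last" keeps the latest one.
    """
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    # To keep the last occurrence, scan backwards and restore the order at the end.
    source = rows if keep == "first" else reversed(rows)
    seen = set()
    result = []
    for row in source:
        key = tuple(row[:n])  # lists are unhashable, so use a tuple as the key
        if key not in seen:
            seen.add(key)
            result.append(row)
    if keep == "last":
        result.reverse()
    return result


L = [['el1', 'el2', 'value1'], ['el3', 'el4', 'value2'],
     ['el1', 'el2', 'value2'], ['el1', 'el5', 'value3']]

first_kept = dedupe_by_prefix(L)
# [['el1', 'el2', 'value1'], ['el3', 'el4', 'value2'], ['el1', 'el5', 'value3']]

last_kept = dedupe_by_prefix(L, keep="last")
# [['el3', 'el4', 'value2'], ['el1', 'el2', 'value2'], ['el1', 'el5', 'value3']]
```

With keep="last" the result matches the output requested in the question, including the order.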

+2

Martijn Pieters Oct 15 '12 at 19:53

Andrew Clark · Accepted Answer · 2012-10-15T19:52:00+0000

If the order doesn't matter, you can use the same method, but with a tuple of the first and second elements as the key:

 dict(((x[0], x[1]), x) for x in L).values()

Or in Python 2.7 and higher:

 {(x[0], x[1]): x for x in L}.values()

Instead of (x[0], x[1]) you can use tuple(x[:2]); use whichever you find more readable.
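
On Python 3, .values() returns a view rather than a list, so wrap it in list() if you need an actual list. As a rough sketch, assuming Python 3.7 or later (where dicts preserve insertion order): each key stays at the position of its first occurrence, while its value is the last row seen for that key. The variable name unique is only illustrative.

```python
L = [['el1', 'el2', 'value1'], ['el3', 'el4', 'value2'],
     ['el1', 'el2', 'value2'], ['el1', 'el5', 'value3']]

# Later rows overwrite earlier ones that share the same (first, second) key.
unique = list({tuple(x[:2]): x for x in L}.values())

print(unique)
# [['el1', 'el2', 'value2'], ['el3', 'el4', 'value2'], ['el1', 'el5', 'value3']]
```

On older interpreters the same expression still removes the duplicates, but the order of the result is arbitrary.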