Matching two lists without loops

Question

I have two lists of the same length. The first list l1 contains data.

 l1 = [2, 3, 5, 7, 8, 10, ... , 23]

The second list l2 contains the category to which each item in l1 belongs:

 l2 = [1, 1, 2, 1, 3, 4, ... , 3]

How can I split the first list into sections according to the category numbers (such as 1, 2, 3, 4) found at the same positions in the second list, using a list comprehension or a lambda function? For example, 2, 3 and 7 from the first list belong to the same section, because the corresponding values in the second list are all 1.

The number of sections is known at the beginning.

+5

python

Santosh linkha Apr 24 '16 at 11:25

7 answers

If a dict is OK, I suggest using defaultdict:

 >>> from collections import defaultdict
 >>> d = defaultdict(list)
 >>> for number, category in zip(l1, l2):
 ...     d[category].append(number)
 ...
 >>> d
 defaultdict(<type 'list'>, {1: [2, 3, 7], 2: [5], 3: [8, 23], 4: [10]})

If you are using Python 2, consider using itertools.izip instead of zip to save memory.

This is basically the same solution as Kasramvd's, but I think defaultdict makes it a little easier to read.
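For reference, the same grouping can be wrapped in a small function under Python 3, where zip is already lazy and izip is not needed. This is only a minimal sketch; the function name group_by_category is hypothetical and not part of the original answer.

```python
from collections import defaultdict


def group_by_category(values, categories):
    """Group values by the category found at the same position."""
    groups = defaultdict(list)
    # zip is lazy in Python 3, so no intermediate list of pairs is built.
    for value, category in zip(values, categories):
        groups[category].append(value)
    # Return a plain dict so that missing keys raise KeyError again.
    return dict(groups)


l1 = [2, 3, 5, 7, 8, 10, 23]
l2 = [1, 1, 2, 1, 3, 4, 3]
result = group_by_category(l1, l2)
print(result)
```

Both lists are walked only once, regardless of how many categories there are.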

+8

timgeb Apr 24 '16 at 11:31

This will give a list of sections using a list comprehension:

 >>> l1 = [2, 3, 5, 7, 8, 10, 23]
 >>> l2 = [1, 1, 2, 1, 3, 4, 3]
 >>> [[value for i, value in enumerate(l1) if j == l2[i]] for j in set(l2)]
 [[2, 3, 7], [5], [8, 23], [10]]

+2

cromod Apr 24 '16 at 11:56

Nested list comprehension:

[ [ l1[j] for j in range(len(l1)) if l2[j] == i ] for i in range(1, max(l2)+1 )]
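Note that iterating over range(1, max(l2)+1) assumes the categories are positive integers, and any category number that never occurs in l2 yields an empty list. A small sketch of that behaviour, using made-up data in which category 2 is missing:

```python
l1 = [2, 3, 5, 7]
l2 = [1, 1, 3, 1]  # illustrative data: category 2 never occurs

sections = [
    [l1[j] for j in range(len(l1)) if l2[j] == i]
    for i in range(1, max(l2) + 1)
]
print(sections)  # [[2, 3, 7], [], [5]]
```

The upside is that sections[i - 1] always corresponds to category i, which may be convenient when the number of sections is known in advance.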

+1

nino_701 Apr 24 '16 at 13:23

If it is reasonable to store your data in numpy ndarrays, you can use advanced indexing:

 {i:l1[l2==i] for i in set(l2)}

to build a dictionary of ndarrays keyed by category code.

There is overhead associated with l2==i (i.e. building a new boolean array for each category) that grows with the number of categories, so you may want to check which alternative, numpy or defaultdict, is faster with your data.

I tested with n=200000, nc=20 and numpy was faster than defaultdict + izip (124 vs 165 ms), but with nc=10000 numpy was (much) slower (11300 vs 251 ms).
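The expression above relies on l1 and l2 being ndarrays rather than plain lists. A minimal sketch, assuming NumPy is installed and using the sample data from the question; np.unique is swapped in here for set(l2) so that the categories come out sorted, which is a choice made for this example and not part of the original answer.

```python
import numpy as np

l1 = np.array([2, 3, 5, 7, 8, 10, 23])
l2 = np.array([1, 1, 2, 1, 3, 4, 3])

# np.unique returns the sorted distinct category codes.
# Each l2 == i builds a boolean mask over the whole array, which is the
# per-category overhead discussed above.
groups = {int(i): l1[l2 == i] for i in np.unique(l2)}

for category, values in groups.items():
    print(category, values.tolist())
```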

+1

gboffi Apr 24 '16 at 13:36

Using some itertools and operator goodies plus sorting, you can do this in a one-liner:

 >>> import itertools, operator
 >>> l1 = [2, 3, 5, 7, 8, 10, 23]
 >>> l2 = [1, 1, 2, 1, 3, 4, 3]
 >>> itertools.groupby(sorted(zip(l2, l1)), operator.itemgetter(0))

The result is an itertools.groupby object, which can be iterated over:

 >>> for g, li in itertools.groupby(sorted(zip(l2, l1)), operator.itemgetter(0)):
 ...     print(g, list(map(operator.itemgetter(1), li)))
 ...
 1 [2, 3, 7]
 2 [5]
 3 [8, 23]
 4 [10]
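If a dictionary is preferred over printed output, the grouped pairs can be collected with a dict comprehension. This is a sketch for Python 3 using the same sample data; the names pairs and grouped are introduced here for illustration. Sorting the pairs first matters because groupby only merges adjacent items with equal keys.

```python
import itertools
import operator

l1 = [2, 3, 5, 7, 8, 10, 23]
l2 = [1, 1, 2, 1, 3, 4, 3]

# Sort (category, value) pairs so that equal categories are adjacent.
pairs = sorted(zip(l2, l1))
grouped = {
    category: [value for _, value in group]
    for category, group in itertools.groupby(pairs, key=operator.itemgetter(0))
}
print(grouped)
```

Because whole tuples are sorted, the values inside each category also end up in ascending order rather than in their original order in l1.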

+1

user1556435 Apr 24 '16 at 13:44

This is not a list comprehension, but a dictionary comprehension. It is similar to @cromod's solution, but retains the "categories" from l2:

 {k:[val for i, val in enumerate(l1) if k == l2[i]] for k in set(l2)}

Output:

 >>> l1
 [2, 3, 5, 7, 8, 10, 23]
 >>> l2
 [1, 1, 2, 1, 3, 4, 3]
 >>> {k:[val for i, val in enumerate(l1) if k == l2[i]] for k in set(l2)}
 {1: [2, 3, 7], 2: [5], 3: [8, 23], 4: [10]}

+1

jDo Apr 24 '16 at 15:04

Kasramvd · Accepted Answer · 2016-04-24T11:28:40+0000

You can use a dictionary:

 >>> l1 = [2, 3, 5, 7, 8, 10, 23]
 >>> l2 = [1, 1, 2, 1, 3, 4, 3]
 >>> d = {}
 >>> for i, j in zip(l1, l2):
 ...     d.setdefault(j, []).append(i)
 ...
 >>> d
 {1: [2, 3, 7], 2: [5], 3: [8, 23], 4: [10]}
