Iterate through multiple sorted lists in order

Suppose I have several lists of pairs (int, str), not necessarily the same length. The only limitation here is that lists are sorted in ascending order by their integer parts:

a = [(1, 'a'), (4, 'a'), (6, 'b'), (7, 'c'), (12, 'a')] b = [(5, 'd'), (10, 'c'), (11,'e')] c = [(0, 'b'), (3, 'd')] 

What I would like to do is to emit the elements of the string in the order in which their corresponding integer elements meet, i.e. in this case:

 (0, 'b'), (1, 'a'), (3, 'd'), (4, 'a'), ... 

I am wondering if there is an obvious (good + pythonic) way to do this, simply by using iterators a , b and c ? I looked at itertools but cannot immediately see how to use the functionality in this case. Lists a , b , c can be very large, so I would like to do this without reading them in memory, and then sorting ...

+6
source share
3 answers

Since the lists are already sorted, you can use heapq.merge :

 >>> import heapq >>> a = [(1, 'a'), (4, 'a'), (6, 'b'), (7, 'c'), (12, 'a')] >>> b = [(5, 'd'), (10, 'c'), (11,'e')] >>> c = [(0, 'b'), (3, 'd')] >>> for i in heapq.merge(a, b, c): ... i ... (0, 'b') (1, 'a') (3, 'd') (4, 'a') (5, 'd') (6, 'b') (7, 'c') (10, 'c') (11, 'e') (12, 'a') >>> 

It is also very efficient with large lists, as it uses iterators inside. From the document link above:

Similar to sorted(itertools.chain(*iterables)) , but returns iterable, does not retrieve data into memory immediately , and assumes that each of the input streams is already sorted (smallest largest).

+13
source
 my_iterator = iter(sorted(a+b+c)) 

is by far the most pythonic imho (although you can probably just leave it as a list and not wrap an additional iter

could you speed it up if that was a bottleneck (that I doubt it)

+4
source

heapq.merge is probably the best choice. FWIW more_itertools also offers a mergesort tool similar to the accepted answer:

 import operator as op import more_itertools list(more_itertools.collate(a, b, c, key=op.itemgetter(0))) 

Output

 [(0, 'b'), (1, 'a'), (3, 'd'), (4, 'a'), (5, 'd'), (6, 'b'), (7, 'c'), (10, 'c'), (11, 'e'), (12, 'a')] 

See more_itertools docs for more details.

0
source

Source: https://habr.com/ru/post/980382/


All Articles