Two functions, one generator

I have two functions that accept iterators as inputs. Is there a way to write a generator that I can pass to both functions as input, which doesn't require a reset or a second pass? I want to make one pass over the data, but feed the output to two functions. Example:

def my_generator(data):
    for row in data:
        yield row

gen = my_generator(data)
func1(gen)
func2(gen)

I know that I can create two different instances of the generator, or reset it between the function calls, but I was wondering if there is a way to avoid making two passes over the data. Please note that func1 / func2 are NOT generators themselves; if they were, that would be nice, because I could build a pipeline.

The point here is to avoid a second pass over the data.

+4

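You can either cache the generator's output in a list, or reset the generator before passing the data to func2. The problem is that with 2 separate loops the data has to be iterated over twice, so you either produce the data again or keep the whole result in memory.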

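Solutions such as itertools.tee also just create 2 iterators, which does not remove the problem: the values that one consumer has not read yet still have to be stored. It is convenient syntax, but it does not change what happens behind the scenes.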
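To see what itertools.tee does behind the scenes, here is a minimal sketch. The `produced` list is a hypothetical helper added only to record what the source generator actually yields; it is not part of the original question.

```python
import itertools

produced = []  # hypothetical helper: records every value the source yields


def my_generator(data):
    for row in data:
        produced.append(row)
        yield row


gen1, gen2 = itertools.tee(my_generator(range(5)))

# Draining gen1 first pulls every value from the source exactly once;
# tee keeps all of them in an internal buffer for gen2.
first = list(gen1)

# gen2 is served from that buffer, so the source generator does not run again.
second = list(gen2)
```

The source is only traversed once, but when one consumer finishes before the other starts, the buffer ends up holding the whole data set.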

If the data is large, you have to merge func1 and func2 into a single loop:

for a in gen:
    f1(a)
    f2(a)

Whether that is practical depends on whether you can change the code of func1 / func2.
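The single-loop idea can be sketched as follows, assuming both functions can be reworked to accept one item per call and keep their own state. `RunningSum` and `RunningMax` are hypothetical stand-ins for f1 and f2, not the asker's real functions.

```python
def my_generator(data):
    for row in data:
        yield row


class RunningSum:
    """Hypothetical stand-in for f1: accumulates a total one item at a time."""

    def __init__(self):
        self.total = 0

    def __call__(self, item):
        self.total += item


class RunningMax:
    """Hypothetical stand-in for f2: tracks the largest item seen so far."""

    def __init__(self):
        self.largest = None

    def __call__(self, item):
        if self.largest is None or item > self.largest:
            self.largest = item


f1 = RunningSum()
f2 = RunningMax()

# One pass over the generator; each item is handed to both consumers.
for a in my_generator([3, 1, 4, 1, 5]):
    f1(a)
    f2(a)
```

Nothing is buffered here: each item is processed by both consumers and then discarded.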

+3

Python's standard library has a tool for this. You will find it in itertools:

import itertools

def my_generator(data):
    for row in data:
        yield row

gen = my_generator(data)
gen1, gen2 = itertools.tee(gen)
func1(gen1)
func2(gen2)

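However, this only pays off if func1 and func2 consume the data at a similar pace. If func1 runs to completion first, itertools.tee() has to remember every element of gen until gen2 is used.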

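To get around that, process one element at a time and hand it to both func1 and func2. If that is not feasible, turn func1 into a generator that passes each element through, and feed its output to func2.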
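A minimal sketch of the pipeline variant, assuming func1 can be rewritten as a generator. The `seen` list and the doubling step are hypothetical placeholders for whatever work func1 really does.

```python
def my_generator(data):
    for row in data:
        yield row


def func1(iterable, seen):
    """Generator version of func1: handles each item, then passes it on."""
    for item in iterable:
        seen.append(item * 2)  # placeholder for func1's real per-item work
        yield item             # hand the unchanged item to the next stage


def func2(iterable):
    """Ordinary consumer: pulls items through func1 as it iterates."""
    return sum(iterable)


seen = []
total = func2(func1(my_generator([1, 2, 3]), seen))
```

Because func2 drives the iteration, each element flows through func1 and func2 before the next one is generated, so the data is read once and nothing is stored in between.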

+3

If using threads is an option, the generator can be consumed only once without storing a possibly unpredictable number of generated values between calls to the consumers. The following example runs the consumers in lock-step; this implementation requires Python 3.2 or later:

import threading


def generator():
    for x in range(10):
        print('generating {}'.format(x))
        yield x


def tee(iterable, n=2):
    barrier = threading.Barrier(n)
    state = dict(value=None, stop_iteration=False)

    def repeat():
        while True:
            if barrier.wait() == 0:
                try:
                    state.update(value=next(iterable))
                except StopIteration:
                    state.update(stop_iteration=True)
            barrier.wait()
            if state['stop_iteration']:
                break
            yield state['value']

    return tuple(repeat() for i in range(n))


def func1(iterable):
    for x in iterable:
        print('func1 consuming {}'.format(x))


def func2(iterable):
    for x in iterable:
        print('func2 consuming {}'.format(x))


gen1, gen2 = tee(generator(), 2)

thread1 = threading.Thread(target=func1, args=(gen1,))
thread1.start()

thread2 = threading.Thread(target=func2, args=(gen2,))
thread2.start()

thread1.join()
thread2.join()
+3

Source: https://habr.com/ru/post/1628734/

