Inconsistent behavior of Python generators

The following Python code produces [(0, 0), (0, 7), ..., (0, 693)] instead of the expected list of tuples combining every multiple of 3 with every multiple of 7:

    multiples_of_3 = (i*3 for i in range(100))
    multiples_of_7 = (i*7 for i in range(100))
    list((i,j) for i in multiples_of_3 for j in multiples_of_7)

This code fixes the problem:

    list((i,j) for i in (i*3 for i in range(100)) for j in (i*7 for i in range(100)))

Questions:

  • The generator object seems to play the role of an iterator instead of providing a fresh iterator object each time the generated sequence is enumerated. The latter strategy seems to be the one adopted by .NET LINQ query objects. Is there an elegant way around this?
  • How does the second piece of code work? Should I understand that the generator's iterator is not reset after looping through all the multiples of 7?
  • Don't you think this behavior is counterintuitive, if not inconsistent?
+4
4 answers

As you have discovered, the object created by a generator expression is an iterator (more precisely, a generator iterator), designed to be consumed only once. If you need a resettable generator, simply write a real generator function and call it in the loops:

    def multiples_of_3():  # generator function
        for i in range(100):
            yield i * 3

    def multiples_of_7():  # generator function
        for i in range(100):
            yield i * 7

    list((i,j) for i in multiples_of_3() for j in multiples_of_7())

The second snippet works because the expression list of the inner loop ((i*7 ...)) is evaluated on each pass of the outer loop. This creates a new generator iterator each time, which gives you the behavior you want, but at the expense of code clarity.

To understand what is happening, remember that an iterator is not “reset” when a for loop iterates over it. (This is a feature: such a reset would break iterating over a large iterator in pieces, and it would be impossible for generators.) For example:

    multiples_of_2 = iter(xrange(0, 100, 2))  # iterator
    for i in multiples_of_2:
        print i
    # prints nothing because the iterator is spent
    for i in multiples_of_2:
        print i

... by contrast:

    multiples_of_2 = xrange(0, 100, 2)  # iterable sequence, converted to iterator
    for i in multiples_of_2:
        print i
    # prints again because a new iterator gets created
    for i in multiples_of_2:
        print i

A generator expression is equivalent to an already-invoked generator function and can therefore be iterated over only once.
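Applied to the code in the question, the one-shot behavior can be sketched as follows. This is a minimal illustration in Python 3 syntax (the snippets above use Python 2's print statement and xrange), with the ranges shortened to 3 elements so the result is easy to read; the names leftover_3 and leftover_7 are introduced only for this example.

```python
# Python 3 sketch; range(3) instead of range(100) keeps the output short.
multiples_of_3 = (i * 3 for i in range(3))
multiples_of_7 = (i * 7 for i in range(3))

# The inner generator is consumed during the first pass of the outer loop,
# so later values of i find nothing left to pair with.
pairs = [(i, j) for i in multiples_of_3 for j in multiples_of_7]
print(pairs)  # [(0, 0), (0, 7), (0, 14)]

# Both generators are now spent; iterating again yields nothing.
leftover_3 = list(multiples_of_3)
leftover_7 = list(multiples_of_7)
print(leftover_3, leftover_7)  # [] []
```

Only the first value of the outer generator ever gets paired, which matches the [(0, 0), (0, 7), ...] result reported in the question.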

+2

A generator object is an iterator and therefore one-shot. It is not an iterable that can produce any number of independent iterators. This behavior is not something you can change with a switch somewhere, so any workaround amounts to either using a reusable iterable (such as a list) instead of a generator, or constructing the generators repeatedly.
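The first workaround, using a list instead of a generator, might look like the sketch below. It is written in Python 3 with shortened ranges, and the names first, second, advanced and fresh exist only for this illustration. The trade-off is that a list holds all of its elements in memory at once.

```python
# Lists are reusable iterables: every loop over them gets a new iterator.
multiples_of_3 = [i * 3 for i in range(3)]
multiples_of_7 = [i * 7 for i in range(3)]

pairs = [(i, j) for i in multiples_of_3 for j in multiples_of_7]
print(len(pairs))  # 9

# Two iterators over the same list advance independently of each other.
first = iter(multiples_of_7)
second = iter(multiples_of_7)
next(first)            # consume the first element (0) from `first` only
advanced = next(first)
fresh = next(second)
print(advanced, fresh)  # 7 0
```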

The second snippet does the latter. It is by definition equivalent to the loops

    for i in (i*3 for i in range(100)):
        for j in (i*7 for i in range(100)):
            ...

Hopefully it is not surprising that here the latter generator expression is re-evaluated on every iteration of the outer loop.
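One way to observe this re-evaluation is to wrap the inner generator expression in a function that records each call. The sketch below is Python 3; inner_multiples and created are hypothetical names introduced for the demonstration, and the ranges are shortened to 3 elements.

```python
created = []  # hypothetical log: one entry per inner generator constructed

def inner_multiples():
    # Stands in for the inline (i*7 for i in range(100)) expression.
    created.append("new generator")
    return (k * 7 for k in range(3))

pairs = []
for i in (k * 3 for k in range(3)):
    for j in inner_multiples():  # evaluated again on each outer pass
        pairs.append((i, j))

print(len(created))  # 3
print(len(pairs))    # 9
```

Three outer iterations construct three separate inner generators, which is why all nine pairs appear.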

+3

If you want to convert a generator expression into a multi-pass iterable, it can be done in a fairly routine way. For example:

    class MultiPass(object):
        def __init__(self, initfunc):
            self.initfunc = initfunc

        def __iter__(self):
            return self.initfunc()

    multiples_of_3 = MultiPass(lambda: (i*3 for i in range(20)))
    multiples_of_7 = MultiPass(lambda: (i*7 for i in range(20)))
    print list((i,j) for i in multiples_of_3 for j in multiples_of_7)

In terms of defining the thing, this is a similar amount of work to typing:

    def multiples_of_3():
        return (i*3 for i in range(20))

but from the user's point of view they write multiples_of_3, not multiples_of_3(), which means that the multiples_of_3 object is polymorphic with any other iterable, such as a tuple or a list.
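The polymorphism point can be illustrated with a consumer that accepts any iterable. The sketch below restates MultiPass in Python 3 syntax; count_pairs is a hypothetical helper added only for this example.

```python
class MultiPass:
    """Wraps a zero-argument callable that returns a fresh iterator."""

    def __init__(self, initfunc):
        self.initfunc = initfunc

    def __iter__(self):
        # Each call to iter() builds a brand-new generator.
        return self.initfunc()


def count_pairs(xs, ys):
    # Hypothetical consumer: needs ys to be iterable once per element of xs.
    return sum(1 for x in xs for y in ys)


multiples_of_3 = MultiPass(lambda: (i * 3 for i in range(20)))

print(count_pairs(multiples_of_3, [1, 2]))          # 40
print(count_pairs((1, 2), multiples_of_3))          # 40
print(count_pairs(multiples_of_3, multiples_of_3))  # 400
```

The consumer never needs to know whether it was handed a list, a tuple or a MultiPass object, and the same object can even be used in both positions.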

Having to type lambda: is a little inelegant, true. I don't think it would do any harm to introduce “iterable comprehensions” into the language to give you what you want while maintaining backward compatibility. But there are only so many punctuation characters, and I doubt this would be considered worth one.

+1

The real issue, as I found out, is the difference between single-pass and multi-pass iterables, together with the fact that there is currently no standard mechanism for determining whether an iterable is single-pass or multi-pass. See the discussion of single-pass versus multi-pass iterables.
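In the absence of a standard mechanism, one common heuristic is to check whether iter(obj) returns the object itself, since iterators conventionally do exactly that. The Python 3 sketch below uses a hypothetical helper named looks_single_pass. It is only a convention-based guess: a custom class is free to break the convention, so a False result does not guarantee that an object can safely be iterated more than once.

```python
def looks_single_pass(obj):
    """Heuristic only: iterators conventionally return themselves from iter()."""
    return iter(obj) is obj


gen_result = looks_single_pass(i * 3 for i in range(3))
list_result = looks_single_pass([0, 3, 6])
range_result = looks_single_pass(range(3))

print(gen_result)    # True: a generator is its own iterator
print(list_result)   # False: a list hands out a new iterator each time
print(range_result)  # False: range objects are reusable as well
```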

0

Source: https://habr.com/ru/post/1499360/
