What is the temporary difficulty of traversing a tree using a crop?

Example of traversing a tree in depth:

class Node: def __init__(self, value): self._value = value self._children = [] def add_child(self, child): self._children.append(child) def __iter__(self): return iter(self._children) def depth_first(self): yield self for c in self: yield from c.depth_first() 

I understand that yield from does not consume the generator immediately, but instead pass yield up to its caller.

But this transfer process still exists, and thus yield will be passed from every node to its root, and we can describe the execution time by repeating (suppose this is a binary tree for simplicity, but the idea is the same):

T (n) = 2 * T (n / 2) + Θ (n)

Θ(n) exists because this node must pass all the yield passed from its descendants to the parent. And the time complexity obtained by the formula above:

About (NlogN)

However, the time complexity of tree traversal is only O(n) , if I do not use yield or yield from .

I am wondering if I don’t understand how yield works, or is it just impossible to write a recursive generator like this.

+5
source share
1 answer

From the official Python 3.3 release for yield from : https://www.python.org/dev/peps/pep-0380/

Using specialized syntax opens up possibilities for optimization when there is a long chain of generators. Such chains can occur because the instance, during the recursive movement of the tree structure. The overhead of sending the following () calls and getting the values ​​down and up the chain can lead to what should be an O (n) operation in order to become, in the worst case, O (n ** 2).

A possible strategy is to add a slot to the generator for the objects for which the generator is being transmitted. When the next () or send () is executed on the generator, this slot is first checked, and if it is not empty, the generator that it refers to is resumed instead. If it calls StopIteration, the slot is cleared and the main generator resumes.

This will reduce delegation overhead to a chain of calls to C functions not related to Python code execution. a possible reinforcement would be to move the entire chain of generators in a loop and directly resume one at the end, although StopIteration processing is more complicated then.

It appears that yield from still requires a tree walk. But this move is done by the interpreter in C, not Python. So technically this is still O (n) overhead, but it's not as bad as it sounds.

+1
source

Source: https://habr.com/ru/post/1261094/


All Articles