Pickling a graph with cycles

I have a custom node class in Python that is embedded in a graph (which is a dictionary). Since building the graph takes some time, I would like to pickle it so that I don't have to rebuild it every time I run my code.

Unfortunately, since this graph has cycles, cPickle hits its maximum recursion depth:

RuntimeError: maximum recursion depth exceeded while pickling an object

This is my node object:

    class Node:
        def __init__(self, name):
            self.name = name
            self.uid = 0
            self.parents = set()
            self.children = set()

        def __hash__(self):
            return hash(self.name)

        def __eq__(self, that):
            return self.name == that.name

        def __str__(self):
            return "\n".join(["Name: " + self.name,
                              "\tChildren:" + ", ".join([c.name for c in self.children]),
                              "\tParents:" + ", ".join([p.name for p in self.parents])
                              ]
                             )

This is how I build my graph:

    def buildGraph(input):
        graph = {}
        idToNode = {}
        for line in input:
            ## Input from text line by line looks like
            ## source.node -> target.node
            source, arr, target = line.split()
            if source in graph:
                nsource = graph[source]
            else:
                nsource = Node(source)
                nsource.uid = len(graph)
                graph[source] = nsource
                idToNode[nsource.uid] = nsource

            if target in graph:
                ntarget = graph[target]
            else:
                ntarget = Node(target)
                ntarget.uid = len(graph)
                graph[target] = ntarget
                idToNode[ntarget.uid] = ntarget

            nsource.children.add(ntarget)
            ntarget.parents.add(nsource)
        return graph

Then, basically, I have

    graph = buildGraph(input_file)
    bo = cPickle.dumps(graph)

and the second line is where I get the recursion depth error.

Are there any solutions other than modifying the Node structure?

+6
3 answers

You need to prepare the object for pickling: if you have cycles, you need to break them and store that information in a different form.

Pickle uses the __getstate__ method to prepare the object for pickling (it is called beforehand) and __setstate__ to initialize the object when unpickling.

    class SomethingPickled(object):

        ## Compress and uncycle data before pickle.
        def __getstate__(self):
            # shallow copy of the instance dictionary
            state = self.__dict__.copy()
            # break cycles
            state['uncycled'] = self.yourUncycleMethod(state['cycled'])
            del state['cycled']
            # send to pickle
            return state

        ## Expand data after unpickle.
        def __setstate__(self, state):
            # restore cycles
            state['cycled'] = self.yourCycleMethod(state['uncycled'])
            del state['uncycled']
            self.__dict__.update(state)

I am sure you will figure out how to break and rejoin the cycles :)
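One way the uncycle/recycle step could look is sketched below. This is only an illustration under assumptions, not the answerer's code: the `Graph` wrapper and its `add_edge` method are hypothetical names, the `uid` field is left out for brevity, and it is written for Python 3's `pickle` rather than `cPickle`. The pickled state holds only node names and (source, target) name pairs, so it contains no object cycles at all.

```python
import pickle


class Node:
    """Graph node that links directly to other Node objects (cyclic)."""

    def __init__(self, name):
        self.name = name
        self.parents = set()
        self.children = set()

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Node) and self.name == other.name


class Graph:
    """Hypothetical wrapper that owns the name -> Node dictionary."""

    def __init__(self):
        self.nodes = {}

    def add_edge(self, source, target):
        src = self.nodes.setdefault(source, Node(source))
        dst = self.nodes.setdefault(target, Node(target))
        src.children.add(dst)
        dst.parents.add(src)

    def __getstate__(self):
        # Break the cycles: keep only node names and (source, target) pairs.
        names = sorted(self.nodes)
        edges = sorted((n.name, c.name)
                       for n in self.nodes.values() for c in n.children)
        return {"names": names, "edges": edges}

    def __setstate__(self, state):
        # Restore the cycles by relinking freshly created Node objects.
        self.nodes = {name: Node(name) for name in state["names"]}
        for source, target in state["edges"]:
            self.add_edge(source, target)


graph = Graph()
for source, target in [("a", "b"), ("b", "c"), ("c", "a")]:
    graph.add_edge(source, target)

restored = pickle.loads(pickle.dumps(graph))
print(sorted(c.name for c in restored.nodes["c"].children))  # ['a']
```

Because the state is a flat list of strings and tuples, the depth pickle has to recurse no longer grows with the size of the graph.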

+2

I don't think the problem is that your graph is cyclic: pickle (and cPickle) should handle cyclic data structures just fine. I tried the following (with your Node definition) and it worked:

    >>> n1 = Node('a')
    >>> n2 = Node('b')
    >>> n1.parents.add(n2)
    >>> n2.parents.add(n1)
    >>> n2.children.add(n1)
    >>> n1.children.add(n1)

    >>> import cPickle as pickle
    >>> pickle.dumps(n1)

Indeed, even with large cycles, I did not encounter a problem. For example, this works fine for me:

    >>> def node_cycle(n):
    ...     start_node = prev_node = Node('node0')
    ...     for i in range(n):
    ...         node = Node('node%d' % (i+1))
    ...         node.parents.add(prev_node)
    ...         prev_node.children.add(node)
    ...         prev_node = node
    ...     start_node.parents.add(node)
    ...     node.children.add(start_node)

    >>> # note: node_cycle() has no return statement, so cycle is None here
    >>> cycle = node_cycle(100000) # cycle of 100k nodes
    >>> pickle.dumps(cycle)

(All of this was tested in Python 2.7.1.)

There are other reasons why pickling can result in very deep recursion, though, depending on the shape of your data structure. If that is the real problem, you can fix it with something like this:

    >>> import sys
    >>> sys.setrecursionlimit(10000)
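If raising the limit globally feels too blunt, it can be scoped to the pickling call. The sketch below is an assumption on my part rather than something from the answer: `recursion_limit` is a hypothetical helper name, and only `sys.getrecursionlimit` and `sys.setrecursionlimit` are real library calls. Keep in mind that a limit set too high can crash the interpreter with a stack overflow instead of raising an exception.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit, restoring it afterwards."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below what is already configured.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


before = sys.getrecursionlimit()
with recursion_limit(before + 5000):
    inside = sys.getrecursionlimit()
    # Call pickle.dumps(graph) here, while the higher limit is active.
after = sys.getrecursionlimit()
print(before, inside, after)
```

The limit returns to its previous value as soon as the `with` block exits, even if pickling raises.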
+2

Here, this modified Node class stores only the names of related nodes as strings, and gives you a set of full Node objects when you retrieve a node's "children" or "parents" attribute.

There are no cycles inside, so it should avoid the infinite-recursion trap. You can implement additional helper methods to make navigation easier as you see fit.

    class Node(object):
        all_nodes = {}

        def __new__(cls, name):
            self = object.__new__(cls)
            cls.all_nodes[name] = self
            return self

        def __getstate__(self):
            self.all_nodes = self.__class__.all_nodes
            return self.__dict__

        def __setstate__(self, dct):
            self.__class__.all_nodes = dct["all_nodes"]
            del dct["all_nodes"]
            self.__dict__ = dct

        def __init__(self, name):
            #self.all_nodes = self.__class__.all_nodes
            self.name = name
            self.uid = 0
            self._parents = set()
            self._children = set()

        def __hash__(self):
            return hash(self.name)

        def __eq__(self, that):
            return self.name == that.name

        def __repr__(self):
            return "\n" + "\n".join(["Name: " + self.name,
                                     "\tChildren:" + ", ".join([c.name for c in self.children]),
                                     "\tParents:" + ", ".join([p.name for p in self.parents])
                                     ]
                                    )

        def get_relations(self, which):
            names = getattr(self, which)
            return set(self.__class__.all_nodes[name] for name in names)

        @property
        def children(self):
            return self.get_relations("_children")

        @property
        def parents(self):
            return self.get_relations("_parents")

        def __contains__(self, item):
            return item.name in self._children

        def add(self, child):
            self._children.add(child.name)
            child._parents.add(self.name)

        connect_child = add


    #example and testing:

    from cPickle import loads, dumps

    n1 = Node("n1")
    n2 = Node("n2")
    n3 = Node("n3")

    n1.add(n2)
    n2.add(n3)
    n3.add(n1)

    print n1, n2, n3

    p1 = dumps(n1)
    Node.all_nodes.clear()
    p2 = loads(p1)

    print p2
    print p2.children
    print p2.children.pop().children

    print Node.all_nodes

The disadvantage is that it maintains a class-level dictionary named "all_nodes", which holds references to every node that has been created. (Pickle is smart enough to pickle this dictionary only once for a given graph, since all node objects refer to it.) The problem with the shared "all_nodes" reference shows up if you need to pickle and unpickle different sets of graphs: say you create a graph g1 with one set of nodes, in another run you create a graph g2 with a different set of nodes, and then you unpickle g1 and later g2; unpickling g2 will override the node references for g1. If you need this to work, ask in the comments and I could come up with something. The simplest approach I can think of is a "graph" class that holds the dictionary of all nodes (instead of keeping it in the Node class).
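A rough sketch of that "graph" class idea follows. It is an assumption about what such a class might look like, not the answerer's implementation: `Graph`, `node`, `connect`, `children`, `parent_names` and `child_names` are all hypothetical names, and it targets Python 3's `pickle`. Each graph owns its own dictionary, so unpickling one graph cannot overwrite another.

```python
import pickle


class Node:
    """Node that stores only the names of its neighbours."""

    def __init__(self, name):
        self.name = name
        self.parent_names = set()
        self.child_names = set()


class Graph:
    """Hypothetical container that replaces the class-level all_nodes dict."""

    def __init__(self):
        self.nodes = {}

    def node(self, name):
        # Create the node on first use, then reuse it.
        if name not in self.nodes:
            self.nodes[name] = Node(name)
        return self.nodes[name]

    def connect(self, parent, child):
        self.node(parent).child_names.add(child)
        self.node(child).parent_names.add(parent)

    def children(self, name):
        # Resolve names to Node objects within this graph only.
        return [self.nodes[n] for n in sorted(self.nodes[name].child_names)]


g1 = Graph()
g1.connect("n1", "n2")
g1.connect("n2", "n1")

g2 = Graph()
g2.connect("x", "y")

# Unpickling g2 does not disturb g1, because nothing is shared.
g1_copy = pickle.loads(pickle.dumps(g1))
g2_copy = pickle.loads(pickle.dumps(g2))
print([n.name for n in g1_copy.children("n1")])  # ['n2']
print(sorted(g2_copy.nodes))                     # ['x', 'y']
```

The trade-off is that navigation goes through the graph (`graph.children(name)`) instead of through a property on the node itself.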

+1

Source: https://habr.com/ru/post/909116/

