I have a list of lists like this:
matches = [
    [['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', 'Firmicutes'],
     ['class', 'Clostridia'], ['order', 'Clostridiales'],
     ['family', 'Lachnospiraceae'], ['genus', 'Lachnospira']],
    [['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', '"Proteobacteria"'],
     ['class', 'Gammaproteobacteria'], ['order', '"Vibrionales"'],
     ['family', 'Vibrionaceae'], ['genus', 'Catenococcus']],
    [['rootrank', 'Root'], ['domain', 'Archaea'], ['phylum', '"Euryarchaeota"'],
     ['class', '"Methanomicrobia"'], ['order', 'Methanomicrobiales'],
     ['family', 'Methanomicrobiaceae'], ['genus', 'Methanoplanus']],
]
I want to build a phylogenetic tree from them. I wrote a node class similar to this (partially based on this code):
class Node(object):
    """Generic n-ary tree node object

    Children are additive; no provision for deleting them."""

    def __init__(self, parent, category=None, name=None):
        self.parent = parent
        self.category = category
        self.name = name
        self.childList = []
        if parent is None:
            self.birthOrder = 0
        else:
            self.birthOrder = len(parent.childList)
            parent.childList.append(self)

    def fullPath(self):
        """Returns a list of children from root to self"""
        result = []
        parent = self.parent
        kid = self
        while parent:
            result.insert(0, kid)
            parent, kid = parent.parent, parent
        return result

    def ID(self):
        return '{0}|{1}'.format(self.category, self.name)
And then I try to build my tree like this:
node = None
for match in matches:
    for branch in match:
        category, name = branch
        node = Node(node, category, name)
    print [n.ID() for n in node.fullPath()]
This works for the first match, but when I start the second match, its nodes are appended to the end of the previous branch rather than starting again at the top of the tree. How should I do this? I tried several ways of searching for an existing node by its identifier, but I cannot get it to work.
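
For illustration, here is a minimal sketch of the behaviour I am after, not a definitive implementation. It assumes a simplified copy of the Node class above, and it introduces two hypothetical names of my own: a helper `find_child` that looks up an existing child by category and name, and a function `build_tree` that restarts from a parentless placeholder node (`top`) for every match. The sample data is abbreviated to the first three ranks of each match.

```python
class Node(object):
    """Simplified version of the Node class from the question."""

    def __init__(self, parent, category=None, name=None):
        self.parent = parent
        self.category = category
        self.name = name
        self.childList = []
        if parent is not None:
            parent.childList.append(self)

    def ID(self):
        return '{0}|{1}'.format(self.category, self.name)


def find_child(parent, category, name):
    """Hypothetical helper: return the matching child of parent, or None."""
    for child in parent.childList:
        if child.category == category and child.name == name:
            return child
    return None


def build_tree(matches):
    """Hypothetical builder: merge all matches into one shared tree."""
    top = Node(None)  # placeholder above 'rootrank|Root'
    for match in matches:
        node = top  # restart at the top for every match
        for category, name in match:
            child = find_child(node, category, name)
            if child is None:
                # Only create a node when this branch does not exist yet.
                child = Node(node, category, name)
            node = child
    return top


# Abbreviated sample: first three ranks of each match in the question.
matches = [
    [['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', 'Firmicutes']],
    [['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', '"Proteobacteria"']],
    [['rootrank', 'Root'], ['domain', 'Archaea'], ['phylum', '"Euryarchaeota"']],
]

top = build_tree(matches)
root = top.childList[0]
print([c.ID() for c in root.childList])
# → ['domain|Bacteria', 'domain|Archaea']
```

The linear scan in `find_child` is fine for a handful of children per node; for wide trees, a dict keyed on the ID string would presumably be faster, but I have not tried that.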