Implementing a search function in a tree structure with many different types of nodes

Question

Implementing a search function in a tree structure with many different types of nodes

I have a tree structure consisting of dozens of node types (each node type inherits from the NodeBase class).

I would like to search the tree to return a link to a specific node. Suppose, for example, that there is some Company tree that contains Department nodes among other types of nodes. Department nodes are made up of Employee nodes. It is assumed that the employee must be part of a department and may be in the same department.

It is currently designed so that each node has a list of child nodes of type NodeBase . A tree can become quite large, with hundreds of thousands of nodes at times. Insert / delete operations are rarely used, while search operations should not take too much time for these large trees.

Suppose I want to get a link to an employee node whose field employee ID is equal to some string that I provide. I do not know in which department the employee works, so I will need to search through all the nodes, hoping to find a match. Not all nodes have an employee ID field; departments, for example, do not have them.

I am not sure what the best way to implement the search function, given this tree structure design.

There are probably better ways to develop how the data is stored in the first place (for example: using a database?), But I'm currently stuck in a tree.

+4

java design

Mxyl Jun 13 '13 at 20:11

source share

3 answers

Data structures are a way to organize your data, and how you organize your data depends on how you actually use these pieces of information.

A tree is the right data structure for answering questions such as “get all descendants of node X”, but it won’t help you solve the problem of “find me an object with property X set to Y” (at least not your tree: you can Of course, use the tree inside to save the sorted index, as I will explain later).

Therefore, I believe that the best way to solve this problem is to use two separate data structures for organizing data: a tree of NodeBase objects to reflect the hierarchical relationship between NodeBase's and a sorted index to perform searches using decent performance. This will lead to a synchronization problem, though, since you will have to keep two data structures in sync when nodes are added / removed. If this does not happen too often or just search performance is critical, then this may be the right way.

+2

Raffaele Jun 13 '13 at 20:21

source share

There seem to be two parts to this question: decomposition of the class hierarchy and implementation of the search algorithm.

In the Java world, there are two possible solutions to the decomposition problem :

Local-oriented object-oriented decomposition, and
Decomposition of type checking using instanceof and type casting .

Functional languages (including Scala) offer pattern matching, which is actually the best approach to implement validation type decomposition.

Due to the fact that it is necessary to work with a data structure (tree), where elements (nodes) can be of different types, the character of decomposition is definitely not local. Thus, the second approach is really the only option.

The search itself can be implemented using, for example, a binary search tree algorithm. Such a tree will need to be built from your data, where the decision on the location of a particular node should depend on the actual search criteria. Basically, this means that you need to have as many trees as there are different search criteria, which is essentially a way to build indexes. Database engines use more complex structures than a binary search tree. For example, red-black trees , but the idea is very similar.

By the way, the binary search tree will be homogeneous. For example, if the search refers to Employee in the Department , then the search tree will consist only of nodes associated with instances of Employee . This fixes the decomposition problem.

+1

01es Jun 13 '13 at 20:25

source share

darijan · Accepted Answer · 2013-06-13T20:17:05+0000

Assuming your tree is a DAG (directed acyclic tree), use for example DFS or BFS. Here's a simple BFS:

 public NodeBase findEmployee (NodeBase root, Integer employeeId) { Queue<NodeBase> q= new LinkedList<NodeBase>(); q.add(root); while (!q.isEmpty()) { NodeBase node= q.poll(); if (node instanceof Employee) { if (((Employee)node).getId().equals(employeeId)) return node; } for (NodeBase child : node.getChildren()) q.add(child); } } }

EDIT: Visitor Template

Or like Brabster , you can use the visitor template. A NodeBase must implement the accept(IVisitor visitor) method accept(IVisitor visitor) :

 public class NodeBase { //your code public void accept(IVisitor visitor) { visitor.visit(this); for (NodeBase node : getChildren()) { node.accept(visitor); } } }

IVisitor is just intercace:

 public interface IVisitor { public void visit(NodeBase node); }

And you need the correct implementation that will do the search:

 public class SearchVisitor implements IVisitor { private Integer searchId; public SearchVisitor(Integer searchId) { this.searchId= searchId; } @Override public void visit(NodeBase node) { if (node instanceof Employee) { if (((Employee)node).getId().equals(searchId)) { System.out.println("Found the node " + node.toString() + "!"); } } } }

Now just name it:

 NodeBase root= getRoot(); root.accept(new SearchVisitor(getSearchId()));

Implementing a search function in a tree structure with many different types of nodes

More articles: