SQL - How to store and navigate hierarchies?

What methods do you use to model and obtain hierarchical information in a database?

+44
sql oracle sql-server database-design hierarchy
Sep 02 '08 at 4:10
source share
9 answers

The final works on this subject were written by Joe Celco, and he worked on several of them in a book called Joe Selco's Trees and Hierarchies in SQL for Smarties.

He favors a method called directional graphs. An introduction to his work on this subject can be found here.

+13
Sep 07 '08 at 10:33
source share
โ€” -

I like the tree traversal algorithm with a modified pre-order. This method queries the tree very easily.

But here is a list of links about a topic that I copied from the Zend Framework (PHP) developers web page (published there from Laurent Melmoux on 06/05/2007 15:52).

Many of the links are language agnostics:

There are two main representations and algorithms for representing hierarchical structures with databases:

  • a nested set, also known as a modified pre-order tree traversal algorithm
  • adjacency list model

This is well explained here:

Here are some more links that I have compiled:

adjacency list model

nested set

in the form of graphs

Classes:

Adodb wood tree nested sets

ADOdb Visit Model

PEAR :: DB_NestedSet

PEAR :: Tree

nstrees

+29
Sep 19 '08 at 5:46
source share

What is the best way to represent a hierarchy in an SQL database? General, portable equipment?

Suppose a hierarchy is mostly readable, but not completely static. Let's say this is a family tree.

Here's how to do it:

create table person ( person_id integer autoincrement primary key, name varchar(255) not null, dob date, mother integer, father integer ); 

And inserting such data:

 person_id name dob mother father 1 Pops 1900/1/1 null null 2 Grandma 1903/2/4 null null 3 Dad 1925/4/2 2 1 4 Uncle Kev 1927/3/3 2 1 5 Cuz Dave 1953/7/8 null 4 6 Billy 1954/8/1 null 3 

Instead, split the nodes and your relationships into two tables.

 create table person ( person_id integer autoincrement primary key, name varchar(255) not null, dob date ); create table ancestor ( ancestor_id integer, descendant_id integer, distance integer ); 

Data is created as follows:

 person_id name dob 1 Pops 1900/1/1 2 Grandma 1903/2/4 3 Dad 1925/4/2 4 Uncle Kev 1927/3/3 5 Cuz Dave 1953/7/8 6 Billy 1954/8/1 ancestor_id descendant_id distance 1 1 0 2 2 0 3 3 0 4 4 0 5 5 0 6 6 0 1 3 1 2 3 1 1 4 1 2 4 1 1 5 2 2 5 2 4 5 1 1 6 2 2 6 2 3 6 1 

Now you can run arbitrary queries that are not related to joining the table to yourself, which will happen if you have a heirachy relation on the same line as node.

Who has grandparents?

 select * from person where person_id in (select descendant_id from ancestor where distance=2); 

All your descendants:

 select * from person where person_id in (select descendant_id from ancestor where ancestor_id=1 and distance>0); 

Who are the guys?

 select decendant_id uncle from ancestor where distance=1 and ancestor_id in (select ancestor_id from ancestor where distance=2 and not exists (select ancestor_id from ancestor where distance=1 and ancestor_id=uncle) ) 

You avoid all the problems with joining a table to yourself through subqueries, the general limitation is 16 subsuqeries.

The problem is that saving the ancestor table is very difficult - it is best to execute the stored procedure.

+9
Sep 02 '08 at 4:13
source share

I do not agree with Josh. What happens if you use a huge hierarchical structure, such as a company organization. People can join / leave the company, change reporting lines, etc. Maintaining "distance" would be a big problem, and you would have to maintain two data tables.

This query (SQL Server 2005 and higher) will allow you to see the full row of any person AND calculates their place in the hierarchy, and this requires only one table of user information. It can be changed to find any child relationships.

 --Create table of dummy data create table #person ( personID integer IDENTITY(1,1) NOT NULL, name varchar(255) not null, dob date, father integer ); INSERT INTO #person(name,dob,father)Values('Pops','1900/1/1',NULL); INSERT INTO #person(name,dob,father)Values('Grandma','1903/2/4',null); INSERT INTO #person(name,dob,father)Values('Dad','1925/4/2',1); INSERT INTO #person(name,dob,father)Values('Uncle Kev','1927/3/3',1); INSERT INTO #person(name,dob,father)Values('Cuz Dave','1953/7/8',4); INSERT INTO #person(name,dob,father)Values('Billy','1954/8/1',3); DECLARE @OldestPerson INT; SET @OldestPerson = 1; -- Set this value to the ID of the oldest person in the family WITH PersonHierarchy (personID,Name,dob,father, HierarchyLevel) AS ( SELECT personID ,Name ,dob ,father, 1 as HierarchyLevel FROM #person WHERE personID = @OldestPerson UNION ALL SELECT e.personID, e.Name, e.dob, e.father, eh.HierarchyLevel + 1 AS HierarchyLevel FROM #person e INNER JOIN PersonHierarchy eh ON e.father = eh.personID ) SELECT * FROM PersonHierarchy ORDER BY HierarchyLevel, father; DROP TABLE #person; 
+9
Sep 02 '08 at 5:20
source share

FYI: SQL Server 2008 introduces a new HierarchyID data type for this kind of situation. Allows you to control where in the "tree" your row sits, both horizontally and vertically.

+4
Sep 02 '08 at 4:15
source share

Oracle: SELECT ... START WITH ... CONNECT BY

Oracle has a SELECT extension that makes it easy to retrieve tree-based data. Perhaps SQL Server has a similar extension?

This query will move around the table in which the nesting relationships are stored in the parent and child columns.

 select * from my_table start with parent = :TOP connect by prior child = parent; 

http://www.adp-gmbh.ch/ora/sql/connect_by.html

+3
Sep 02 '08 at 4:18
source share

I prefer the combination of techniques that Josh and Mark Harrison used:

Two tables, one with Person data and the other with hierarchical information (person_id, parent_id [, mother_id]), if the PK of this table is person_id, you have a simple tree with one parent node (which makes sense in this case, but not in the other cases such as accounts)

This hiarchy table may be crossed out by recursive procedures, or if your database supports it with statements such as SELECT ... BY PRIOR (Oracle).

Another possibility - if you know that the maximum depth of the hierarchy data that you want to use is to use a single table with a set of columns per hierarchy level

+2
Sep 02 '08 at 6:54
source share

We had the same problem when we implemented the tree component for [fleXive] and used the nested tree model set mentioned by the Tarkun from MySQL .

In addition to speed (sharply) up, we used an advanced approach, which simply means that we used the maximum long value for the upper levels of rights, which allows us to insert and move nodes without recounting all left and right values. The values โ€‹โ€‹for the left and right are calculated by dividing the range for the node by 3 and using the inner third as the boundaries for the new node.

Sample Java code can be seen here .

+1
Nov 24 '08 at 13:16
source share

If you are using SQL Server 2005, this link explains how to retrieve hierarchical data.

Common Table Expressions (CTEs) can be your friends when you are comfortable using them.

0
Sep 02 '08 at 4:18
source share



All Articles