Indexes and Nested Sets

I use a nested set to represent a hierarchy in my application, and I wonder where it is best to place indexes (clustered or others). I am using Microsoft SQL Server 2008.

Operations:

  • About 40 times a day, a new hierarchy will be added directly under the root.
  • Hierarchies will probably never be deleted.
  • Hierarchies are read often throughout the day by parentId to populate combo boxes in stages.
  • Hierarchies are moved very rarely, perhaps not even once a month.
  • The heaviest access is through left and right when joining to other tables. This is by far the most frequent access to the hierarchy.

I have been experimenting with a clustered index on left and right (in most cases the table will be queried using val BETWEEN @left AND @right). But is clustering on left and right the correct approach for this?

Many thanks in advance to anyone who has more experience with SQL indexes than I do!

Schema as it stands

 _id INT IDENTITY NOT NULL
 _idParent INT IDENTITY NULL
 _name NVARCHAR(64)
 _left INT NOT NULL
 _right INT NOT NULL
2 answers

It is best to test the various index configurations and see which one works best. At first glance, though, clustering on lft and rgt would seem to be the better choice. There does not appear to be much DML against the table, so the data should not need to be reordered often, and a clustered index on lft and rgt should turn most of your queries into clustered index seeks or range scans.
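As a starting point for that comparison, here is a minimal sketch of the clustered-on-bounds layout. The table name dbo.Hierarchy and all index names are hypothetical; the columns follow the schema in the question, except that _idParent is declared without IDENTITY, since SQL Server allows only one identity column per table. The unique constraint on the clustered index assumes every node has a distinct _left value.

```sql
-- Hypothetical table name; columns taken from the question's schema.
CREATE TABLE dbo.Hierarchy (
    _id       INT IDENTITY(1,1) NOT NULL,
    _idParent INT NULL,            -- no IDENTITY: only one per table is allowed
    _name     NVARCHAR(64),
    _left     INT NOT NULL,
    _right    INT NOT NULL,
    CONSTRAINT PK_Hierarchy PRIMARY KEY NONCLUSTERED (_id)
);

-- Clustered index on the nested-set bounds, so rows of one subtree
-- are stored next to each other.
CREATE UNIQUE CLUSTERED INDEX CIX_Hierarchy_Left_Right
    ON dbo.Hierarchy (_left, _right);

-- Supports the parentId lookups that fill the combo boxes.
CREATE NONCLUSTERED INDEX IX_Hierarchy_Parent
    ON dbo.Hierarchy (_idParent);

-- Typical access pattern: every node inside a given subtree.
DECLARE @left INT = 2, @right INT = 9;   -- example bounds only
SELECT h._id, h._name
FROM dbo.Hierarchy AS h
WHERE h._left BETWEEN @left AND @right;
```

With this layout the BETWEEN predicate on _left can be answered by a range scan over the clustered index, which is the behaviour worth confirming in the actual execution plan.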

The only drawback I see is that adding hierarchies directly under the root can involve shifting many other hierarchies. Will you always add on the "right" side of the root? That would only require updating the rgt column in the root row, which would be nice. If you insert in the middle or on the left side of the root, you will have to shift every hierarchy to the right of the new one. Also, how big is your table? That will affect things. If it is small enough, shifting these hierarchies may not matter. You definitely want to try inserting on the right side of the root if you can, though.
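To make the difference concrete, here is a rough sketch of appending a single node as the right-most child of the root. It assumes a hypothetical dbo.Hierarchy table with the question's columns, and that the root is the only row whose _idParent is NULL. A new hierarchy of n nodes would widen the root by 2n rather than 2.

```sql
BEGIN TRANSACTION;

DECLARE @rootId INT, @rootRight INT;

-- Read the root's current right bound and hold a lock on the row,
-- so that concurrent appends are serialized.
SELECT @rootId = _id, @rootRight = _right
FROM dbo.Hierarchy WITH (UPDLOCK, HOLDLOCK)
WHERE _idParent IS NULL;

-- Only the root row changes: its right bound widens by 2.
UPDATE dbo.Hierarchy
SET _right = _right + 2
WHERE _id = @rootId;

-- The new node takes the slot the root's right bound just vacated.
INSERT INTO dbo.Hierarchy (_idParent, _name, _left, _right)
VALUES (@rootId, N'New hierarchy', @rootRight, @rootRight + 1);

COMMIT TRANSACTION;

-- By contrast, inserting a node at position @pos in the middle of the
-- tree has to shift every bound at or beyond that position:
-- UPDATE dbo.Hierarchy SET _right = _right + 2 WHERE _right >= @pos;
-- UPDATE dbo.Hierarchy SET _left  = _left  + 2 WHERE _left  >= @pos;
```

The right-side append touches one existing row plus the new one, while the middle insert rewrites a share of the table that grows with how far left the new node lands.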

EDIT: One more thing ... have you looked into SQL Server's built-in hierarchyid data type?
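For comparison, a minimal sketch of the same kind of tree stored with hierarchyid, which is available from SQL Server 2008. The table and column names here are hypothetical and the example is not a drop-in replacement for the schema in the question.

```sql
-- Hypothetical table that stores the tree position in a hierarchyid column.
CREATE TABLE dbo.HierarchyNode (
    Node  HIERARCHYID NOT NULL PRIMARY KEY CLUSTERED,
    _name NVARCHAR(64) NOT NULL
);

-- Insert the root.
INSERT INTO dbo.HierarchyNode (Node, _name)
VALUES (hierarchyid::GetRoot(), N'Root');

DECLARE @root HIERARCHYID, @lastChild HIERARCHYID;
SET @root = hierarchyid::GetRoot();

-- Current right-most child of the root (stays NULL if there is none).
SELECT @lastChild = MAX(Node)
FROM dbo.HierarchyNode
WHERE Node.GetAncestor(1) = @root;

-- GetDescendant(@lastChild, NULL) produces a new child after @lastChild,
-- without renumbering any existing row.
INSERT INTO dbo.HierarchyNode (Node, _name)
VALUES (@root.GetDescendant(@lastChild, NULL), N'New hierarchy');

-- All descendants of a node (the node itself is included).
SELECT Node.ToString() AS NodePath, _name
FROM dbo.HierarchyNode
WHERE Node.IsDescendantOf(@root) = 1;
```

Whether this suits the join-heavy access described in the question would still need to be measured against the left/right approach.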


I would be nervous about using a clustered index in this way. It is very fast for queries, but any insert, update, or delete may require rewriting data on disk, which can lead to serious performance problems (especially for large tables).

I would also suggest that in practice, you are unlikely to notice the difference between a clustered and a non-clustered index, especially with integer indices. If you have gigantic tables (hundreds of millions of records) you can measure the difference, but on modern hardware, I don't think it makes any noticeable difference.

So I would agree with Tom H.: try it and measure. Make sure you measure inserts, updates, and deletes as well as queries. If you do not see a significant performance benefit, I would be inclined to put the clustered index on the primary key, because by definition it does not change.
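One possible way to run that measurement, sketched for the alternative layout in which the primary key on _id is the clustered index and the bounds get a nonclustered index. The table name dbo.Hierarchy and the index name are hypothetical, and the bounds are example values; ideally this runs against a copy of the table loaded with realistic data.

```sql
-- Nonclustered index on the nested-set bounds; _name is included so the
-- sample query below can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_Hierarchy_Left_Right
    ON dbo.Hierarchy (_left, _right)
    INCLUDE (_name);

-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @left INT = 2, @right INT = 9;   -- example bounds only
SELECT h._id, h._name
FROM dbo.Hierarchy AS h
WHERE h._left BETWEEN @left AND @right;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running the same query, and the usual insert and move statements, under each index layout and comparing the logical reads in the Messages output gives a like-for-like basis for the decision.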


Source: https://habr.com/ru/post/892034/

