Clustered index dilemma - ID or sort code?

I have a table with two very important fields:

    id INT IDENTITY(1,1) PRIMARY KEY
    identifiersortcode VARCHAR(900)

My application always sorts and displays search results in the user interface by identifiersortcode, but all table joins (and they are legion) are on the id field. (Also: yes, the sort code really is that long. There is a strong BL reason for it.)

In addition, because an O/RM is used, most SELECT statements pull almost every column.

Currently, the clustered index is on id, but I wonder whether the TOP / ORDER BY part of most queries makes identifiersortcode a more attractive option as the clustered key, even considering all the table joins on id.

Inserts into the table and changes to identifiersortcode are limited enough that changing my clustered index would not be a problem for insert / update operations.
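For reference, the change being considered could be sketched roughly as follows. The table name dbo.Item and the constraint and index names are hypothetical; only id and identifiersortcode come from the table described above. Any foreign keys that reference the primary key would have to be dropped first and recreated afterwards, which is not shown.

```sql
-- Hypothetical names: dbo.Item, PK_Item, CIX_Item_identifiersortcode.
-- Step 1: drop the existing clustered primary key on id.
ALTER TABLE dbo.Item DROP CONSTRAINT PK_Item;

-- Step 2: cluster the table on the sort code instead.
-- Declared non-unique here, so SQL Server adds a hidden uniquifier to duplicate keys.
CREATE CLUSTERED INDEX CIX_Item_identifiersortcode
    ON dbo.Item (identifiersortcode);

-- Step 3: put the primary key back as a non-clustered index so joins on id can still seek.
ALTER TABLE dbo.Item
    ADD CONSTRAINT PK_Item PRIMARY KEY NONCLUSTERED (id);
```

Note that a VARCHAR(900) column sits right at SQL Server's 900-byte limit for a clustered index key.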

Trying to turn the non-clustered index on the sort code into a covering index (using INCLUDE) is not a good option. There are several large columns, and some of them see a lot of update activity.
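To make the rejected option concrete, a covering index would look something like the sketch below. The table name dbo.Item and the included columns (title, body_text, last_modified) are hypothetical stand-ins for the real large columns.

```sql
-- Hypothetical table and column names, for illustration only.
-- Every included column is copied into the index leaf pages,
-- so each update to one of them must also update this index.
CREATE NONCLUSTERED INDEX IX_Item_identifiersortcode_covering
    ON dbo.Item (identifiersortcode)
    INCLUDE (title, body_text, last_modified);
```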

+4
5 answers

Currently, the clustered index is on id, but I'm wondering whether the TOP / ORDER BY part of most queries makes identifiersortcode a more attractive option as the clustered key, even considering all the table joins.

Making identifiersortcode the clustered key will only help if it is used both in the filtering conditions and in the ordering.

That means this table is chosen as the leading table in all your joins and is accessed through a Clustered Index Scan or a range scan on the clustered index.

Otherwise, it will only make things worse: first, all secondary indexes will be larger, because each of them stores the clustered key; second, inserts that do not arrive in increasing key order will cause page splits, which make the inserts run longer and make the table larger.
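One way to observe both effects is to compare index sizes and fragmentation before and after such a change. The query below is only a sketch: dbo.Item is a hypothetical table name, and 'LIMITED' is the cheapest scan mode of sys.dm_db_index_physical_stats.

```sql
-- Hypothetical table name dbo.Item.
-- page_count shows how large each index is; avg_fragmentation_in_percent
-- rises as page splits leave pages out of logical order.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.page_count,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Item'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
ORDER BY ps.page_count DESC;
```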

+1

Kimberly L. Tripp's criteria for a clustered index key are as follows:

  • Unique
  • Narrow
  • Static
  • Ever increasing

Based on this, I would stick with the integer id column, which satisfies all of the above. Your identifiersortcode will fail most, if not all, of these requirements.
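Under that advice, the layout would stay as it is, with the sort code served by a plain non-clustered index. A minimal sketch, assuming a hypothetical table name dbo.Item and index name:

```sql
-- Hypothetical names; id stays the clustered primary key.
CREATE NONCLUSTERED INDEX IX_Item_identifiersortcode
    ON dbo.Item (identifiersortcode);

-- With a small TOP, the optimizer may read the first rows from this index
-- in order and look up the remaining columns by id, avoiding a full sort.
SELECT TOP (50) *
FROM dbo.Item
ORDER BY identifiersortcode;
```

Whether the optimizer actually picks that plan depends on the row counts and filters involved, so it is worth checking the execution plan.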

+4

To determine which field will benefit most from the clustered index, you need to do some homework. The first thing to consider is the selectivity of your joins. If your execution plans filter rows from THIS table first and then join to the other tables, a clustered index on the primary key gains you little, and it makes sense to put it on the sort key.

If, however, your joins are selective on the other tables (those are filtered first, and rows are then fetched from this table by an index seek), then you need to test the change yourself and compare its performance against the status quo.
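A simple way to run that comparison is to capture I/O and timing figures for a representative query under each index layout. The sketch below assumes hypothetical tables dbo.Item and dbo.ItemDetail and a hypothetical item_id join column; substitute a real query from the application.

```sql
-- Hypothetical tables and join column, for illustration only.
SET STATISTICS IO ON;    -- reports logical and physical reads per table
SET STATISTICS TIME ON;  -- reports CPU and elapsed time

SELECT TOP (50) i.*
FROM dbo.Item AS i
JOIN dbo.ItemDetail AS d
  ON d.item_id = i.id
WHERE i.identifiersortcode >= 'A'
ORDER BY i.identifiersortcode;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running it once with the clustered index on id and once with it on identifiersortcode, then comparing the logical reads in the Messages output, gives a first indication of which layout suits the workload.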

+2

Why on earth does your sort code need to be 900 characters long? If you really need 900 characters to sort on, it should probably be split into multiple fields.

+1

Apart from repeating what Chris B. said, I think you really should stick with your current PK since, as you said, all joins are on id.
I assume you have already indexed identifiersortcode. Still, if you have performance problems, you might think twice about this @#"%$£ identifiersortcode! -)

+1

Source: https://habr.com/ru/post/1335121/

