My question is, if a non-clustered index indexes more columns than the query uses, does that lead to a slower query execution than an index that exactly matches the query?
No. Extra columns do not slow down queries that filter on a leading prefix of the index's key columns. That said, a wider index occupies more pages, so if memory is tight, loading it into the buffer cache can force other pages out and slow things down; with plenty of memory, this should not be a problem.
As the number of different queries grows, so does the number of column combinations used in their WHERE clauses. I'm not sure of the tradeoffs between many narrow indexes (one for each query) and fewer indexes with more columns.
Start by putting the most frequently queried, most selective columns first in your indexes. Fewer indexes with many columns may not give you what you want.
For example, suppose you have an index with the following columns:
- ColumnA
- ColumnB
- ColumnC
- ColumnD
- ColumnE
- ColumnF
In that order, queries filtering on ColumnA; on ColumnA and ColumnB; on ColumnA, ColumnB, and ColumnC; and so on will use the index, but a query filtering only on ColumnE or ColumnF will not.
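As a sketch of the two behaviors (the table name `dbo.Orders` and the index name are hypothetical):

```sql
-- Hypothetical wide composite index on a hypothetical dbo.Orders table
CREATE NONCLUSTERED INDEX IX_Orders_Wide
    ON dbo.Orders (ColumnA, ColumnB, ColumnC, ColumnD, ColumnE, ColumnF);

-- Can seek: predicates cover a leading prefix of the index key
SELECT * FROM dbo.Orders WHERE ColumnA = 1;
SELECT * FROM dbo.Orders WHERE ColumnA = 1 AND ColumnB = 2 AND ColumnC = 3;

-- Cannot seek: ColumnE is not a leading prefix of the key,
-- so the optimizer will likely scan the table or index instead
SELECT * FROM dbo.Orders WHERE ColumnE = 5;
```

Comparing the actual execution plans of the last two queries makes the seek-versus-scan difference easy to see.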
Now take a different approach: six indexes on one table, each with a single column
- Index1 - ColumnA
- Index2 - ColumnB
- Index3 - ColumnC
- Index4 - ColumnD
- Index5 - ColumnE
- Index6 - ColumnF
In this case, only one of these six indexes will typically be used for any given query.
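The single-column layout, again as a sketch on the same hypothetical table:

```sql
-- Six single-column indexes on the hypothetical dbo.Orders table
CREATE NONCLUSTERED INDEX IX_Orders_A ON dbo.Orders (ColumnA);
CREATE NONCLUSTERED INDEX IX_Orders_B ON dbo.Orders (ColumnB);
CREATE NONCLUSTERED INDEX IX_Orders_C ON dbo.Orders (ColumnC);
CREATE NONCLUSTERED INDEX IX_Orders_D ON dbo.Orders (ColumnD);
CREATE NONCLUSTERED INDEX IX_Orders_E ON dbo.Orders (ColumnE);
CREATE NONCLUSTERED INDEX IX_Orders_F ON dbo.Orders (ColumnF);

-- A query filtering on several columns will usually seek on just one
-- of these indexes and apply the remaining predicates afterwards
SELECT * FROM dbo.Orders WHERE ColumnA = 1 AND ColumnB = 2;
```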
Also, if the index contains a column that is not very selective, it may not help you. For example, if you have a GENDER column that can hold only the values Male, Female, and Unknown, including it in an index probably will not help. When the query runs, SQL Server may determine that the column is not selective enough and decide that a full table scan will be faster.
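One quick way to gauge a column's selectivity yourself (table and column names here are hypothetical):

```sql
-- Distinct values vs. total rows: the closer the ratio is to 1,
-- the more selective the column is as an index key
SELECT COUNT(DISTINCT Gender)                          AS distinct_values,
       COUNT(*)                                        AS total_rows,
       COUNT(DISTINCT Gender) * 1.0 / COUNT(*)         AS selectivity
FROM   dbo.People;
```

A three-value column in a large table yields a selectivity near zero, which is why the optimizer tends to ignore such an index.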
There are many ways to find out which indexes your queries use; one approach I rely on is looking for indexes that are never used. Run the following query against your database and check whether the indexes you created are actually being accessed.
```sql
SELECT iv.table_name,
       i.name AS index_name,
       iv.seeks + iv.scans + iv.lookups AS total_accesses,
       iv.seeks,
       iv.scans,
       iv.lookups,
       t.indextype,
       t.indexsizemb
FROM   (SELECT i.object_id,
               Object_name(i.object_id) AS table_name,
               i.index_id,
               SUM(i.user_seeks)   AS seeks,
               SUM(i.user_scans)   AS scans,
               SUM(i.user_lookups) AS lookups
        FROM   sys.tables t
               INNER JOIN sys.dm_db_index_usage_stats i
                       ON t.object_id = i.object_id
        GROUP  BY i.object_id, i.index_id) AS iv
       INNER JOIN sys.indexes i
               ON iv.object_id = i.object_id
              AND iv.index_id = i.index_id
       INNER JOIN (SELECT sys_schemas.name AS schemaname,
                          sys_objects.name AS tablename,
                          sys_indexes.name AS indexname,
                          sys_indexes.type_desc AS indextype,
                          CAST(partition_stats.used_page_count * 8 / 1024.00
                               AS DECIMAL(10, 3)) AS indexsizemb
                   FROM   sys.dm_db_partition_stats partition_stats
                          INNER JOIN sys.indexes sys_indexes
                                  ON partition_stats.[object_id] = sys_indexes.[object_id]
                                 AND partition_stats.index_id = sys_indexes.index_id
                                 AND sys_indexes.type_desc <> 'HEAP'
                          INNER JOIN sys.objects sys_objects
                                  ON sys_objects.[object_id] = partition_stats.[object_id]
                          INNER JOIN sys.schemas sys_schemas
                                  ON sys_objects.[schema_id] = sys_schemas.[schema_id]
                                 AND sys_schemas.name <> 'SYS') AS t
               ON t.indexname = i.name
              AND t.tablename = iv.table_name
```
I generally track indexes that have never been used, or have not been used for a few months after an SQL Server restart (the usage statistics in this DMV reset on restart), and decide whether to drop them. Too many indexes can sometimes keep SQL Server from choosing the best plan for a query, and dropping unused indexes can speed things up.
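If you decide an index is dead weight, dropping it is straightforward (names here are hypothetical; since the usage stats reset on restart, confirm the server has been up long enough first):

```sql
-- Check how long the usage statistics have been accumulating
SELECT sqlserver_start_time FROM sys.dm_os_sys_info;

-- Then drop the unused index
DROP INDEX IX_Orders_B ON dbo.Orders;
```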
Hope this helps you understand your indexes.