Wednesday 4 November 2009

A relevant choice of Index options?

I have recently had reason to review the indexing of a table as a query was taking a lot longer to run than I would have expected.

My first move was to check out the estimated execution plan for the SQL that the view was based on. It turns out that a large chunk of effort in the query was running an index scan on a table that had 3 indexes. . The indexes details were: a primary key (unique, clustered), a single column index (based on one of the most common client references in our system) and a third index on some key columns that describe our clients. The Primary Key column was an INT IDENTITY column - basically just a unique ID in the table for absolute reference to asingle row if needed. It plays no part in our business.

Have you spotted what's wrong here?

Exactly, having the ID column as a CLUSTERED index means that table is sorted on disk in the order of that column, this is like having a library sort it's books by the height of the respective authors - useless when you need to find anything. You would have to run up and down the shelves to find anything you need.

I changed the PRIMARY KEY to be non-clustered and then changed the single column index to be the CLUSTERED index for the table and the performance of the query went from 7s down to 1s (YMMV). WIN!!

This got me thinking whether this may be a scenario that is repeated in other tables or other databases on our servers so I have written the query below to track down indexes that are clustered primary keys and returns the table/column name and the data type of that column. Simply reviewing the index name (columns called ID are suspicious!) and whether its an INT IDENTITY column mean I can check out anything that looks awry and improve our systems performance.

SELECT
OBJECT_NAME([i].[object_id]) AS [Table] ,
[i].[name] AS [Index name] ,
[c].name AS [Column name] ,
[t].[name] AS [Column type] ,
[c].[is_identity]
FROM
[sys].[indexes] AS i
INNER JOIN [sys].[index_columns] AS ic
ON i.index_id = ic.index_id
AND i.object_id = ic.object_id
INNER JOIN sys.columns AS c
INNER JOIN [sys].[types] AS t
ON [c].[system_type_id] = [t].[system_type_id]
ON c.column_id = ic.column_id
AND ( ic.[object_id] = i.[object_id]
AND ic.[object_id] = c.[object_id]
)
WHERE
[i].[is_primary_key] = 1
AND [i].[type] = 1

Personally where the column type is a INT and is_identity is true then I am off to look into making similar changes across our servers...

No comments:

AddThis

Bookmark and Share
 
Google Analytics Alternative