Saturday, August 1, 2009

Using SQL Server 2005+ index included columns to improve performance of an ASP.net resource

This post shows a real scenario on using included columns on SQL Server 2005/2008 to improve the response time of an ASP.net resource.

When developing DotNetBurner, i wanted to make sure that the site performance was the best it could be. Targeting this goal, i have used indexes with included columns to improve the response time of the "burn!" count image. The burn count image shows how many "burns" (votes) a particularly link has. So, to retrieve this vote count, the database must be queried, like this:

SELECT VoteCount
FROM Story
WHERE UrlHash = @urlHash

The query result is filtered by the UrlHash column, which is a MD5 hash of the URL. Assuming that we have thousands of stories, it's recommended to create an index on the UrlHash column.

The following queries will create a Story table and populate it with some items, so we can query it and get the execution plan:

IF OBJECT_ID('Story') IS NOT NULL
 DROP TABLE Story
GO

CREATE TABLE Story
(
 StoryId INT PRIMARY KEY NOT NULL,
 Description NVARCHAR(2000),
 Url VARCHAR(255),
 UrlHash UNIQUEIDENTIFIER, /* This is not MD5, but can be used for this example */
 VoteCount SMALLINT
)
GO

DECLARE @count INT

-- Inserts the first story, so we know the UrlHash
INSERT INTO Story (StoryId, Description, Url, UrlHash, VoteCount)
VALUES(1, 'This is a story description where the quick brown fox jumped over the lazy dog which had a tatoo written lorem ipsum dolor.',
    'http://www.someurl.com', '5946D715-38FA-42C5-8056-F8F0C86B8AE8', 2)

-- Inserts more 9998 stories
SET @count = 2
WHILE @count < 10000
BEGIN

 INSERT INTO Story (StoryId, Description, Url, UrlHash, VoteCount)
 VALUES(@count, 'This is a story description where the quick brown fox jumped over the lazy dog which had a tatoo written lorem ipsum dolor.',
     'http://www.someurl.com', NEWID(), @count + 10)

 SET @count = @count + 1
END


CREATE UNIQUE INDEX IDX_Story_1 ON Story(UrlHash)

SELECT VoteCount
FROM Story
WHERE UrlHash = '5946D715-38FA-42C5-8056-F8F0C86B8AE8'

This is the resulting execution plan (with a index on the UrlHash column):

Sql server execution plan

Looking at the execution plan, you will see a key lookup. I bet with you that i can reduce it to one index seek :)

DROP INDEX IDX_Story_1 ON Story

CREATE UNIQUE INDEX IDX_Story_1 ON Story(UrlHash) INCLUDE (VoteCount)

SELECT VoteCount
FROM Story
WHERE UrlHash = '5946D715-38FA-42C5-8056-F8F0C86B8AE8'

Now we have replaced the index which one that includes the VoteCount column. Let’s look at the execution plan:

Sql server execution plan

Now this executes faster and uses less resources. This happens because when we added the included column all the values necessary for the query (UrlHash and VoteCount) can be found on the Index, so SQL Server doesn't need to look for the value of VoteCount at the clustered index.

You could, of course, just add the VoteCount directly on the index, like this:

CREATE UNIQUE INDEX IDX_Story_1 ON Story(UrlHash, VoteCount)

This would work, but would also use more disk space, since the index key would be bigger. This happens because when you use the INCLUDE option SQL Server just stores the value at the leaf node of the index, because this value is just returned, not filtered.

No comments:

Post a Comment