Welcome to Aussie SQL Server Bloggers Sign in | Join | Help

Debunking myths about clustered indexes - part 3 (example script)

So far in this series of posts, I have discussed how clustered indexes (CIXs) can be less efficient for both bookmark lookups & range scans than non-clustered indexes (NCIXs). In this post, I'm providing a repro script that demonstrates these two points.

The script creates a table named people & populates it with randomly generated data to represent somewhat realistic data, in that rows contain variable width values for the lname & fname columns. Random algorithms used are based on those provided by Vyas at http://vyaskn.tripod.com

The script has been tested against SQL 2005, but should also run on SQL 2000. On my notebook, this script takes ~ 20 seconds to complete. Progress messages are printed during construction of the test data, so you can fiddle with the parameters if you'd like to test the effects of using larger or smaller amounts of data.

After running the script, compare the number of reads reported by the set statistics io report & you should see that the reads performed by the queries are greater when CIXs are present on the table.

Note also that this script installs the CIX on (lname, fname), representing a common scenario where the CIX is placed on a combination of columns which the database designer thinks a physical ordering might benefit overall performance of the database. Another reason this type of scenario is so common is that, by default, SQL Server creates a CIX on the primary key of each table (unless the PK is declaratively created as non-clustered). When PKs are created on multiple columns (as often happens with natural keys) using the default clustered index, it is easy to have a similar scenario - where CIXs on multiple columns are created, affecting the performance of other NCIXs.

set nocount on

go

use tempdb

go

/*

* BEGIN SETUP people table

*/

if object_id('people') is not null drop table people

create table people (

personid int not null identity (1, 1) primary key nonclustered

, lname varchar (25) not null

, fname varchar (25) not null

, age tinyint not null check (age between 0 and 100)

, sex character (1) not null check (sex in ('m', 'f'))

, notes varchar (512) not null)

declare @l int, @fn varchar(25), @ln varchar(25), @bm char(6)

declare @i int, @report_every int, @rows_to_insert int, @p varchar (50)

select @i = 1, @rows_to_insert = 100000, @report_every = 10000

while @i < @rows_to_insert

begin

if @i % @report_every = 0

begin

select @p = convert(varchar(25), @i) + ' from '

+ convert(varchar(25), @rows_to_insert) + ' inserted'

print @p

end

/* generate a fname of random length (between 1 to 25 chrs), of random chars */

select @l = round(1 + (rand() * (25-1)),0)

select @fn='', @bm = 'uaeioy'

while @l > 0

begin

if (@l%2) = 0

set @fn = @fn + substring(@bm,convert(int,round(1 + (rand() * (5)),0)),1)

else

set @fn = @fn + char(round(97 + (rand() * (25)),0))

set @l = @l - 1

end

/* generate a lname of random length (between 1 to 25 chrs), of random chars */

select @l = round(1 + (rand() * (25-1)),0)

select @ln='', @bm = 'goeuai'

while @l > 0

begin

if (@l%2) = 0

set @ln = @ln + substring(@bm,convert(int,round(1 + (rand() * (5)),0)),1)

else

set @ln = @ln + char(round(97 + (rand() * (25)),0))

set @l = @l - 1

end

--insert row

insert into people (lname, fname, age, sex, notes)

values (

@ln

, @fn

, round(1 + (rand() * (100-1)),0)

, case when round(1 + (rand() * (2-1)),0) = 1 then 'm' else 'f' end

, replicate ('*', round(1 + (rand() * (512-1)),0)))

set @i = @i + 1

end

/*

* END SETUP people table

*/

print ' '

print 'TEST 1: Demonstrates how CIX can affect NCIX range scan performance (with covering index)'

create clustered index cix_people on people (lname, fname)

create index ncix_people on people (age, sex)

print 'QUERY: select age, sex into catchresults from people where age = 25 and sex = ''f'''

print 'with CIX'

set statistics io on

select age, sex into catchresults from people where age = 25 and sex = 'f'

set statistics io off

if object_id('catchresults') is not null drop table catchresults

drop index people.cix_people

print 'without CIX'

set statistics io on

select age, sex into catchresults from people where age = 25 and sex = 'f'

set statistics io off

if object_id('catchresults') is not null drop table catchresults

create clustered index cix_people on people (lname, fname)

print ' '

print 'TEST 2: Demonstrates how CIX can affect bookmark performance (without covering index)'

print 'QUERY: select age, sex, personid into catchresults from people where age = 25 and sex = ''f'''

print 'with CIX'

set statistics io on

select age, sex, personid into catchresults from people where age = 25 and sex = 'f'

set statistics io off

if object_id('catchresults') is not null drop table catchresults

drop index people.cix_people

print 'without CIX'

set statistics io on

select age, sex, personid into catchresults from people where age = 25 and sex = 'f'

set statistics io off

if object_id('catchresults') is not null drop table catchresults

if object_id('people') is not null drop table people

Published Saturday, 16 September 2006 11:50 AM by Greg_Linwood
Filed Under:

Comments

# re: Debunking myths about clustered indexes - part 3 (example script)

Hi Greg,

When playing aroound with this script, I found that it doesn't actually test what it should test. For this query:

select age, sex, personid into catchresults from people where age = 25 and sex = 'f'

the execution plan showed an index seek on ncix_people, an index scan on the index generated for the primary key on personid, and a hash match (inner join) operation to combine the results from those index operations.

To get the plan you intended (using an index seek on ncix_people and a bookmark lookup or clustered index lookup), you'll have to change the query to

select age, sex, notes into catchresults from people where age = 25 and sex = 'f'

Best, Hugo
Saturday, 4 November 2006 8:33 AM by Hugo Kornelis

# Debunking myths about clustered indexes - part 5 - a real world example


This post discusses a problem experienced last year on one of the highest profile&amp;nbsp;websites in...
Friday, 8 June 2007 12:52 AM by Transaction blog
Anonymous comments are disabled