SQL Server Clustered Index Design For Performance

Clustered indexes in SQL Server are a critical consideration in the overall architecture of the database. They are often overlooked, misunderstood or, if the database is small, considered unimportant.

This article points out the importance of clustered indexes for overall system performance and maintenance as your database grows. I will briefly cover how SQL Server clustered indexes are stored on disk, why they should always increase over time and why it is best that clustered indexes be static. I'll also touch on many-to-many tables, why they are used and how clustered indexes make these tables more efficient. Finally, it is absolutely critical that we touch on the new SQL Server 2005 partitioned table concept and examine how partitioned tables affect clustered indexes. This will help you make the right decisions from the very start.

Clustered indexes are created by default to match the primary key, which is defined on tables in SQL Server. However, you can create a clustered index on any column and then define a primary key on a separate column or columns. At this point, the primary key would be created as a unique non-clustered index. Typically, a clustered index will match the primary key, but not necessarily, so be careful. Given the variety of situations that can arise, I'll be discussing the clustered indexes themselves, and for now ignore whether you choose to make them primary keys.

Clustered indexes actually hold the row data for SQL Server, so wherever your clustered indexes are stored is also where your data is stored. The clustered indexes are organized into ranges of data. For example, values 1 to 10 may be stored in one range and 90 to 110 in another range. Since clustered indexes are stored as ranges, if you need to do a search on a range for an audit log, it would be more efficient for the clustered index to be based on the date column that would be used to return the date ranges. Non-clustered indexes work better for specific value searches, e.g. "date = DateValue1" rather than range searches, e.g. "date between date1 and date2."

Ever-increasing values for clustered indexes

Clustered indexes should be based on columns whose values constantly increase over time. In my prior example on using the date column from an audit log, the date values for an audit log would be constantly increasing and older dates would not be inserted into the table. This would be an "ever-increasing" column. Another good example of an ever-increasing value is an identity column, since, by design, it constantly increases.

Why am I spending so much time discussing ever-increasing values for clustered indexes? The most important attributes of clustered indexes are that they are ever-increasing and static in nature. The reason ever-increasing is so important has to do with the range architecture outlined earlier. If the values are not ever-increasing, then SQL Server has to allocate space within existing ranges for those records rather than placing them in new ranges at the end of the index.

If the values are not ever-increasing, then once the ranges fill up and a value comes in that fits within a filled-up index range, SQL Server will make room in the index by doing a page split. Internally, SQL Server takes the filled-up page and splits it into two separate pages that have substantially more room at that point but take significantly more resources to process. You can prepare for this eventuality by setting a fill factor of 70% or so, which gives you 30% free space for incoming values.

The problem with this approach is that you continually have to "reindex" the clustered index so it maintains a free space percentage of 30%. Reindexing the clustered index will also cause heavy I/O load since it has to move the actual data itself and any non-clustered indexes have to be rebuilt, adding greatly to maintenance time.

If the clustered index is ever-increasing, you will not have to rebuild the clustered index; you can set a 100% fill factor on the clustered index, and at that point you will only need to reindex the less-intensive, non-clustered indexes as time progresses, resulting in more up time.

Ever-increasing values will only add entries to the end of the index and build new ranges when necessary. Logical fragmentation will not exist since the new values are continually added to the end of the index and the fill factor will be 100%. The higher the fill factor, the more rows are stored on each page. Higher fill factors require less I/O, RAM and CPU for queries. The smaller the data types you pick for the clustered index, the faster the joins/queries will be.
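
As a minimal sketch of the audit-log case discussed above, the following T-SQL clusters a table on an ever-increasing date column with a 100% fill factor and then runs a date-range search. The table, column and index names (dbo.AuditLog, AuditDate and so on) are hypothetical and not taken from the article.

```sql
-- Hypothetical audit-log table; every object name here is illustrative.
CREATE TABLE dbo.AuditLog
(
    AuditDate datetime     NOT NULL,  -- ever-increasing: rows arrive in date order
    UserName  varchar(50)  NOT NULL,
    Detail    varchar(500) NULL
);

-- Cluster on the ever-increasing date column. New rows are appended to the
-- end of the index, so the pages can be packed completely full.
CREATE CLUSTERED INDEX CIX_AuditLog_AuditDate
    ON dbo.AuditLog (AuditDate)
    WITH (FILLFACTOR = 100);

-- A range search of the kind the clustered index serves well:
-- seek to the first date, then read the rows in stored order.
SELECT AuditDate, UserName, Detail
FROM dbo.AuditLog
WHERE AuditDate BETWEEN '20061201' AND '20061231';
```

Because two audit rows can share the same datetime value, this index is not declared unique; where the values are in fact unique, declaring the index UNIQUE keeps the key narrower.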
Also, since each non-clustered index must contain the clustered index key, the smaller the clustered index key, the smaller the non-clustered indexes will be.

The best data types for clustered indexes are generally pretty narrow. Referring to data type size, it's typically a smallint, int, bigint or datetime. When datetime values are used as the clustering index, they are the only column and are normally ever-increasing date values that are often queried as range data.

Generally, you should avoid compound (multiple columns) clustered indexes except in the following situations: many-to-many tables and SQL Server 2005 partitioned tables that have the partitioning column included as part of the clustered index to allow for index alignment.

Many-to-many tables and clustered indexes

Many-to-many tables are used for their extremely fast join capabilities and their ability to allow for quick re-association of records, from one owning record to another. Consider the following structure:

Customer
    Customer ID (bigint identity)
    Name
    Fieldn+

CustomerOrder
    Customer ID
    Order ID

Orders
    Order ID (bigint identity)
    Date
    Fieldn+

The clustered indexes in this structure would be Customer ID on Customer and Order ID on Orders. The compound clustered index on CustomerOrder would be Customer ID/Order ID. Here are the benefits with this structure:

- The joins are all based on clustered indexes (much faster than joins to non-clustered indexes).
- Moving an order to another customer only involves an update to the CustomerOrder table, which is very narrow, with only one clustered index. Therefore, it reduces the blocking that would occur if you had to update a wider table such as Orders.
- Use of a many-to-many table eliminates the need for some non-clustered indexes on the wider tables such as Customer/Orders. Hence, it reduces the maintenance time on the large tables.
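
The Customer / CustomerOrder / Orders structure described above could be declared roughly as follows. This is only a sketch: the schema name, constraint names, column widths and the OrderDate column name are assumptions, and the "Fieldn+" placeholder columns are omitted.

```sql
-- Owning tables: narrow, ever-increasing identity keys as clustered indexes.
CREATE TABLE dbo.Customer
(
    CustomerID bigint IDENTITY(1,1) NOT NULL,
    Name       varchar(100)         NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID)
);

CREATE TABLE dbo.Orders
(
    OrderID   bigint IDENTITY(1,1) NOT NULL,
    OrderDate datetime             NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
);

-- Many-to-many table: two narrow columns and one compound clustered index.
CREATE TABLE dbo.CustomerOrder
(
    CustomerID bigint NOT NULL REFERENCES dbo.Customer (CustomerID),
    OrderID    bigint NOT NULL REFERENCES dbo.Orders (OrderID),
    CONSTRAINT PK_CustomerOrder PRIMARY KEY CLUSTERED (CustomerID, OrderID)
);

-- Moving order 42 from customer 1 to customer 2 touches only the narrow
-- CustomerOrder table (the ID values are illustrative).
UPDATE dbo.CustomerOrder
SET CustomerID = 2
WHERE CustomerID = 1 AND OrderID = 42;
```

Joins from Customer to Orders then go through CustomerOrder and resolve against clustered indexes on every table involved.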
One negative result of this approach is the fragmentation that occurs on the CustomerOrder table. However, that should not be a big issue, since the table is relatively narrow, has only two columns with narrow data types and only one clustered index. The elimination of the non-clustered indexes, which would be needed on the Orders table if it contained Customer ID, more than makes up for this cost.

Clustered indexes and partitioned tables in SQL Server 2005

Partitioned tables in SQL Server 2005 are tables that appear to be a single table on the surface, but behind the scenes -- at the storage subsystem level -- they are actually multiple partitions that can be spread across many filegroups. The table partitions are spread across various filegroups based on the values in a single column. Partitioning tables in this manner causes several side effects. I will just cover the basics here, to give you some understanding of the factors involved. I recommend that you study partitioned tables before attempting to implement them.

You can create a clustered index in this environment based on only one column. But, if that one column is not the column the table is partitioned on, then the clustered index is said to be non-aligned. If a clustered index is non-aligned, then any snapping in/out (or merging) of partitions will require you to drop the clustered index along with the non-clustered indexes and rebuild them from scratch. This is necessary because SQL Server cannot tell what portions of the clustered/non-clustered indexes belong to which table partitions. Needless to say, this will certainly cause system downtime.

The clustered index on a partitioned table should always contain the regular clustering column, which is ever-increasing and static, as well as the column that is used for partitioning the table. If the clustered index includes the column used for partitioning the table, then SQL Server knows what portion of the clustered/non-clustered indexes belong to which partition. Once a clustered index contains the column that the table is partitioned on, then the clustered index is "aligned." Partitions can then be snapped in/out (and merged) without rebuilding the clustered/non-clustered indexes, causing no downtime for the system. Inserts/updates/deletes will also work faster, because those operations only have to consider the indexes that reside on their particular partition.

Summary

SQL Server clustered indexes are an important part of database architecture and I hope you've learned enough from this article to know why you need to carefully plan for them from the very start. It is vital for the future health of your database that clustered indexes be narrow, static and ever-increasing. Clustered indexes can help you achieve faster join times and faster IUD operations and minimize blocking as the system becomes busy.

Finally, we covered how partitioned tables in SQL Server 2005 affect your choices for the clustered index, what it means to "align" the clustered index with the partitions, and why clustered indexes have to be aligned in order for the partitioned table concept to work as intended. Keep watching for tips on non-clustered indexes (part two) coming in February and optimal index maintenance (part three) in March.
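
To make the idea of an aligned clustered index from the partitioned-tables section more concrete, here is a rough sketch. The partition function, scheme, table and column names are all hypothetical, and the yearly boundaries and single-filegroup mapping are chosen only to keep the example short.

```sql
-- Hypothetical yearly partitioning on an order date.
CREATE PARTITION FUNCTION pfOrderYear (datetime)
    AS RANGE RIGHT FOR VALUES ('20070101', '20080101');

-- All partitions are mapped to PRIMARY for brevity; a real design
-- would normally spread them across several filegroups.
CREATE PARTITION SCHEME psOrderYear
    AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.OrdersPartitioned
(
    OrderID   bigint IDENTITY(1,1) NOT NULL,  -- ever-increasing, static
    OrderDate datetime             NOT NULL,  -- partitioning column
    Amt       money                NULL
) ON psOrderYear (OrderDate);

-- Aligned clustered index: the regular clustering column plus the
-- partitioning column, built on the same partition scheme as the table.
CREATE UNIQUE CLUSTERED INDEX CIX_OrdersPartitioned
    ON dbo.OrdersPartitioned (OrderID, OrderDate)
    ON psOrderYear (OrderDate);
```

Non-clustered indexes created on this table without an ON clause are placed on the same partition scheme by default, which keeps them aligned as well.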
Designing SQL Server non-clustered indexes for query optimization

Non-clustered indexes are bookmarks that allow SQL Server to find shortcuts to the data you're searching for. Non-clustered indexes are important because they allow you to focus queries on a specific subset of the data instead of scanning the entire table. We'll address this critical topic by first hitting the basics, such as how clustered indexes interact with non-clustered indexes, how to pick fields, when to use compound indexes and how statistics influence non-clustered indexes.

The basics of non-clustered indexes in SQL Server

A non-clustered index consists of the chosen fields and the clustered index value. If the clustered index is not defined as unique, then SQL Server will use a clustered index value plus a uniqueness value. Always define your clustered indexes as unique -- if they are in fact unique -- because it will result in a smaller clustered/non-clustered index size.

If your unique clustered index consists of an int and you create a non-clustered index on a year column (defined as smallint), then your non-clustered index will contain an int and smallint for every row in the table. The size would increase according to the data types chosen. So the smaller the clustered/non-clustered index data types are, the smaller the resulting index size will be, and the easier the indexes will be to maintain.

Choosing fields for non-clustered indexes

The first rule is to never include the clustered index key fields in the non-clustered index. The field is already carried in every non-clustered index as part of the clustered index key, so it is always available to queries. The only time it makes sense to include any clustered index key in a non-clustered index is when the clustered index is a compound index and the query is referencing the second, third or higher field in the compound index.

Assume you have the following table:

    ID (identity, clustered unique)
    DateFrom
    DateTo
    Amt
    DateInserted
    Description

Now assume you always run queries such as:

Example 1:
    Select * From tbl [t]
    where t.DateFrom = '12/12/2006'
      and t.DateTo = '12/31/2006'
      and t.DateInserted = '12/01/2006'
At this point it makes sense to have a non-clustered index defined on DateFrom, DateTo and DateInserted, since that will always give the most unique (most selective) results. Now assume you run multiple queries such as:

Example 2:
    Select * From tbl [t] where t.DateFrom = '12/12/2006' and t.DateInserted = '12/01/2006'
    Select * From tbl [t] where t.DateFrom = '12/12/2006'
    Select * From tbl [t] where t.DateTo = '12/31/2006'
    Select * From tbl [t] where t.DateInserted = '12/01/2006'
    Select * From tbl [t] where t.DateTo = '12/31/2006' and t.DateInserted = '12/01/2006'
    Select * From tbl [t] where t.ID = 5 and t.DateTo = '12/31/2006' and t.DateInserted = '12/01/2006'   -- ID literal is a stand-in for any single ID

Many people, at this point, would be tempted to create the following non-clustered indexes:

1. DateFrom
2. DateTo
3. DateInserted
4. DateTo and DateInserted
5. DateFrom and DateInserted
6. ID, DateTo and DateInserted
You probably expect the index size to increase dramatically at this point, since you are storing DateFrom in two separate locations, DateTo in three locations and DateInserted in four locations. On top of this, you've stored the clustered index key in seven locations. This approach increases I/O for insert, update and delete operations (also known as IUD operations). Updates to the records must be written first to the clustered index data row. Then every affected non-clustered index has to be updated as well.

You should routinely ask yourself these questions:

- Is the cost of additional I/O for IUD operations and maintenance worth the improved query time?
- Will the additional I/O and increased maintenance time outweigh any performance boost I get on the queries?
- What will give me the most unique results with the least overhead possible?

In this case, the best solution would be three non-clustered indexes as follows:

1. DateFrom
2. DateTo
3. DateInserted
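
The three single-column indexes recommended above might be created as in the sketch below. The column data types, the dbo schema and all index names are assumptions; the article gives only the column names and the table name tbl.

```sql
-- Table from the example; data types are assumed for illustration.
CREATE TABLE dbo.tbl
(
    ID            int IDENTITY(1,1) NOT NULL,
    DateFrom      datetime          NOT NULL,
    DateTo        datetime          NOT NULL,
    Amt           money             NULL,
    DateInserted  datetime          NOT NULL,
    [Description] varchar(200)      NULL
);

-- Narrow, unique, ever-increasing clustered index.
CREATE UNIQUE CLUSTERED INDEX CIX_tbl_ID ON dbo.tbl (ID);

-- One non-clustered index per date column. Each one implicitly carries
-- the clustered key (ID), so ID is not listed as an index column.
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom     ON dbo.tbl (DateFrom);
CREATE NONCLUSTERED INDEX IX_tbl_DateTo       ON dbo.tbl (DateTo);
CREATE NONCLUSTERED INDEX IX_tbl_DateInserted ON dbo.tbl (DateInserted);
```

If the workload consisted only of the Example 1 query, a single compound index on (DateFrom, DateTo, DateInserted) would be the alternative the article describes.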
Each field in this scenario is only stored once, except for the primary key which is stored on all three non-clustered indexes. As a result, the index size is much smaller and will require less I/O and less maintenance. SQL Server will query each of the non-clustered indexes, depending on the criteria chosen, and then hash the results together. While this is not as efficient as Example 1, it is much more efficient than defining the six separate non-clustered indexes. Real world queries will more often match Example 2 rather than being structured as Example 1.

SQL Server statistics

Statistics tell SQL Server how many rows most likely match a given value. They give SQL Server an idea of how "unique" a value is, information it then uses to determine whether to use an index. By default, SQL Server automatically updates statistics whenever it thinks approximately 20% of the records have changed. In SQL Server 2000, this is done synchronously with the IUD operation, delaying the completion of the IUD operation while the rows are sampled. In SQL Server 2005, you can have it sample either synchronously with the IUD operation or asynchronously after the IUD operation is done. The latter approach is better and will cause less blocking because locks will be released sooner.

I recommend turning off the database setting "Auto Update Statistics." This setting will increase your server loads at the worst times. Instead of letting SQL Server automatically keep statistics up to date, create a job that calls the command "update statistics" and runs during your slowest time. You can pick your own sampling ratio depending on how accurate you want the statistics to be.

Detailed statistics (the value histogram) are only kept on the first column in any non-clustered index. What does this mean in compound non-clustered indexes? It means SQL Server will use the first field to determine whether an index should be used.
Even if the second field in the compound index will match 50% of the rows, the field still needs to be used to return the results (see Example 3). Now, if the non-clustered index were split into two non-clustered indexes, SQL Server might choose to use index 1, but not index 2. This is because the statistics on index 2 may show that it will not benefit the query (see Example 4).

Example 3

Assume you have a compound, non-clustered index defined on DateFrom and Amt. Statistics would only be kept on the DateFrom field within the index, and SQL Server would have to seek (or scan) across both DateFrom and Amt. Since SQL Server has to traverse more data, the query will be slower.

Example 4

Assume you have two non-clustered indexes: The first is defined on DateFrom and the second is defined on Amt. Statistics would be kept on both fields because they are separate indexes. SQL Server will examine the statistics on DateFrom and decide to use that index. It will then examine the Amt column and may decide -- based on the statistics -- that the index is not unique enough and should be ignored. At this point, SQL Server would only need to traverse the DateFrom field, rather than both DateFrom and Amt, resulting in a faster query.

By using non-clustered indexes in SQL Server, you'll be able to focus queries on a data subset. Use the guidelines described in this tip to determine if it's best to create multiple non-clustered indexes or a compound non-clustered index. Also keep in mind the role of statistics and how they impact non-clustered indexes: Statistics affect the choice between using multiple non-clustered indexes and a compound non-clustered index in SQL Server.
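
The manual statistics maintenance recommended in the statistics section could look roughly like the sketch below. The Northwind database and its dbo.Orders table are used only as stand-ins, and the 50 percent sampling ratio is an arbitrary illustration rather than a recommendation.

```sql
-- Turn off automatic statistics updates for the database.
ALTER DATABASE Northwind SET AUTO_UPDATE_STATISTICS OFF;
GO

USE Northwind;
GO

-- Body of a job scheduled for the quietest time of day.
-- Refresh one table with an explicit sampling ratio ...
UPDATE STATISTICS dbo.Orders WITH SAMPLE 50 PERCENT;

-- ... or refresh every table in the database with default sampling.
EXEC sp_updatestats;
```

For those who prefer to leave automatic updates on, SQL Server 2005 also offers ALTER DATABASE ... SET AUTO_UPDATE_STATISTICS_ASYNC ON, which corresponds to the asynchronous behavior described earlier.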
How to maintain SQL Server indexes for query optimization

Maintaining SQL Server indexes is an uncommon practice. If a query stops using indexes, oftentimes a new non-clustered index is created that simply holds a different combination of columns or the same columns. A detailed analysis of why SQL Server is ignoring those indexes is not explored.

Let's take a look at how clustered and non-clustered indexes are selected and why the query optimizer might choose a table scan instead of a non-clustered index. In this tip, you'll learn how page splits, fragmented indexes, table partitions and statistics updates affect the use of indexes. Ultimately, you'll find out how to maintain SQL Server indexes so that the query optimizer uses these indexes, and so these indexes are searched quickly.

Index selection

Clustered indexes are by far the easiest to understand in the area of index selection. Clustered indexes are basically keys that reference each row uniquely. Even if you define a clustered index and do not declare it as unique, SQL Server still makes the clustered index unique behind the scenes by adding a 4-byte "uniqueifier" to it. The additional "uniqueifier" increases the width of the clustered index, which causes increased maintenance time and slower searches. Since clustered indexes are the key that identifies each row, they are used in every query.

When we start talking about non-clustered indexes, things get confusing. Queries can ignore non-clustered indexes for the following reasons:

1. High fragmentation - If an index is fragmented over 40%, the optimizer will probably ignore the index because it's more costly to search a fragmented index than to perform a table scan.

2. Uniqueness - If the optimizer determines that a non-clustered index is not very unique, it may decide that a table scan is faster than trying to use the non-clustered index. For example: If a query references a bit column (where bit = 1) and the statistics on the column say that 75% of the rows are 1, then the optimizer will probably decide a table scan will get the results faster versus trying to scan over a non-clustered index.

3. Outdated statistics - If the statistics on a column are out of date, then SQL Server can misjudge the benefit of a non-clustered index. Automatically updating statistics doesn't just slow down your data modification scripts; over time the statistics also drift out of sync with the real distribution of the rows. Occasionally it's a good idea to run sp_updatestats or UPDATE STATISTICS.

4. Function usage - SQL Server is unable to seek on an index if a function is applied to the indexed column in the criteria. If you're referencing a non-clustered index column, but you're using a function such as convert(varchar, Col1_Year) = 2004, then SQL Server cannot seek on the index on Col1_Year.

5. Wrong columns - If a non-clustered index is defined on (col1, col2, col3) and your query has a where clause, such as "where col2 = 'somevalue'", that index won't be used. A non-clustered index can only be used for a seek if the first column in the index is referenced within the where clause. A where clause, such as "where col3 = 'someval'", would not use the index, but a where clause, like "where col1 = 'someval'" or "where col1 = 'someval' and col3 = 'someval2'" would pick up the index. The index would not use col3 for its seek, since that column does not come directly after col1 in the index definition. If you wanted col3 to have a seek occur in situations such as this, then it is best if you define two separate non-clustered indexes, one on col1 and the other on col3.

Page splits

To store data, SQL Server uses pages that are 8 KB data blocks.
The amount of data filling the pages is called the fill factor, and the higher the fill factor, the more full the 8 KB page is. A higher fill factor means fewer pages will be required, resulting in less IO/CPU/RAM usage. At this point, you might want to set all your indexes to 100% fill factor; however, here is the gotcha: Once the pages fill up and a value comes in that fits within a filled-up index range, then SQL Server will make room in an index by doing a "page split."

In essence, SQL Server takes the full page and splits it into two separate pages, which have substantially more room at that point. You can account for this issue by setting a fill factor of 70% or so. This allows 30% free space for incoming values. The problem with this approach is that you continually have to "re-index" the index so that it maintains a free space percentage of 30%.

Clustered index maintenance

Clustered indexes that are static and "ever-increasing" should have a fill factor of 100%. Since the values are always increasing, pages will just be added to the end of the index and virtually no fragmentation will occur. For a more detailed explanation, see part 1 of this series, SQL Server clustered index design for performance. This index category does not need to be re-indexed because it doesn't fragment.

Clustered indexes that are either not static or not "ever-increasing" will experience fragmentation and page splits as the data rows move around within the data pages. The indexes in this category have to be re-indexed in order to keep fragmentation low and allow queries to efficiently use the index. When you re-index these clustered indexes, you have to decide what the fill factor should be. Normally this is 70% to 80%, giving you 20% to 30% empty space for new records coming into the page. The optimal settings for your environment will depend on how often records shift around, how many records are inserted and how often re-indexing occurs. The goal is to set a fill factor low enough so that by the time you reach your next maintenance cycle, the pages are around 95% full, but not yet splitting, which happens when they hit the 100% limit.

Non-clustered index maintenance

Non-clustered indexes will always have data shifting around the pages. It's not quite as big an issue as it is with clustered indexes -- the actual row data shifts with clustered indexes, whereas only row pointers shift with non-clustered indexes. That said, the same rules apply to non-clustered indexes as far as fill factors go. Again, the goal is to set a fill factor low enough so that by the time you reach your next maintenance cycle, the pages are only around 95% full. Non-clustered indexes will always fragment, and to keep this in check you must constantly monitor and maintain them.

Partitioned table index considerations

Partitioned tables allow data to be segregated into different partitions, depending on the data in a column. Many tables are partitioned based on date ranges. Let's say your order table is partitioned into years. Assuming the clustered index is aligned (see part 1 of this series), then you could re-index the non-clustered indexes for, say, year 2000 at 100% fill factor, since that data, technically, won't be shifting around. In this scenario, the year 2008 partition may have a fill factor of 70% on non-clustered indexes to allow for data shifts, but the year 2000 will not have any shifts and can be re-indexed at 100% fill factor so you optimize index seeks.

The same concept would apply to clustered indexes that are either not static or not ever-increasing. Clustered indexes with shifting data might be set to 70% fill factor for the year 2008 partition and 100% fill factor for the year 2000.

SQL Server statistics
Statistics are maintained on columns and indexes and they help SQL Server determine how "unique" some value may be -- i.e., if statistics say a value will match approximately 80% of the rows, SQL Server will do a table scan instead. If statistics say a value will probably match around 10% of the rows, then the query optimizer will opt for a seek to minimize database impact.

SQL Server statistics can be maintained automatically or you can run them manually. Since re-indexing changes the statistics results, I recommend that after re-indexing, you manually run sp_updatestats or the T-SQL UPDATE STATISTICS command. Statistics are only maintained on the first column of any compound index, so the "uniqueness" of other columns in the index cannot be determined.

Summary

Index maintenance is critical to ensure that queries continue to benefit from index use and to reduce IO/RAM/CPU, which reduces blocking as well. Run your queries with the option "show execution plan" turned on. If the query is not using your index, then check the following:

1. Run dbcc showcontig ('tablename') to see if the table is fragmented.
2. Check your "where clause" to see if it references the first column in the index.
3. Ensure that your "where clause" does not have a function for the criteria for the first column of the index.
4. Update the statistics just in case they are out of date. If the table is fragmented, then run this step after re-indexing.
5. Make sure the criteria you are using is unique enough and that SQL Server will see a benefit in using it to search the data.
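
Several steps of the checklist above can be sketched in T-SQL as follows. The demo table dbo.MaintDemo and its indexes are hypothetical and exist only to make the script self-contained. The fragmentation check uses sys.dm_db_index_physical_stats (available from SQL Server 2005) in place of dbcc showcontig, and the 70% fill factor simply mirrors the figure used in the article.

```sql
-- Hypothetical demo objects so the script can run on its own.
CREATE TABLE dbo.MaintDemo
(
    ID       int IDENTITY(1,1) NOT NULL,
    DateFrom datetime          NOT NULL,
    Amt      money             NULL
);
CREATE UNIQUE CLUSTERED INDEX CIX_MaintDemo_ID ON dbo.MaintDemo (ID);
CREATE NONCLUSTERED INDEX IX_MaintDemo_DateFrom ON dbo.MaintDemo (DateFrom);
GO

-- Step 1: fragmentation and page fullness for each index on the table.
DECLARE @dbid int, @objid int;
SELECT @dbid = DB_ID(), @objid = OBJECT_ID('dbo.MaintDemo');

SELECT i.name AS indexname,
       ps.index_id,
       ps.avg_fragmentation_in_percent,
       ps.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(@dbid, @objid, NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id;

-- Re-index a fragmented non-clustered index, leaving 30% free space.
ALTER INDEX IX_MaintDemo_DateFrom ON dbo.MaintDemo
    REBUILD WITH (FILLFACTOR = 70);

-- Step 3: keep functions off the indexed column so a seek is possible.
-- Avoid:  WHERE CONVERT(varchar(10), DateFrom, 101) = '12/12/2006'
SELECT ID, DateFrom
FROM dbo.MaintDemo
WHERE DateFrom >= '20061212' AND DateFrom < '20061213';

-- Step 4: refresh statistics after re-indexing.
UPDATE STATISTICS dbo.MaintDemo;
```

Steps 2 and 5 are judgments about the query and the data rather than commands, so they are best checked against the execution plan.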
(1) How can I find out whether my indexes are useful? How are they used?
First, we will determine whether indexes are "useful". DDL is used to create objects (such as indexes) and update the catalog. Creating the index does not constitute "use" of the index, and thus the index will not be reflected in the index DMVs until the index is actually used. When an index is used by a Select, Insert, Update, or Delete, its use is captured by sys.dm_db_index_usage_stats. If you have run a representative workload, all useful indexes will have been recorded in sys.dm_db_index_usage_stats. Thus, any index not found in sys.dm_db_index_usage_stats is unused by the workload (since the last re-cycle of SQL Server). Unused indexes can be found as follows:

(2) Do I have any tables or indexes that are not used (or rarely used)?

    ------ unused tables & indexes. Tables have index_id's of either 0 = Heap table or 1 = Clustered Index
    Declare @dbid int
    Select @dbid = db_id('Northwind')
    Select objectname=object_name(i.object_id)
        , indexname=i.name, i.index_id
    from sys.indexes i, sys.objects o
    where objectproperty(o.object_id,'IsUserTable') = 1
    and i.index_id NOT IN (select s.index_id
        from sys.dm_db_index_usage_stats s
        where s.object_id=i.object_id and
        i.index_id=s.index_id and
        database_id = @dbid )
    and o.object_id = i.object_id
    order by objectname,i.index_id,indexname asc

Rarely used indexes will appear in sys.dm_db_index_usage_stats just like heavily used indexes. To find rarely used indexes, you look at columns such as user_seeks, user_scans, user_lookups, and user_updates.

    --- rarely used indexes appear first
    declare @dbid int
    select @dbid = db_id()
    select objectname=object_name(s.object_id), s.object_id, indexname=i.name, i.index_id
        , user_seeks, user_scans, user_lookups, user_updates
    from sys.dm_db_index_usage_stats s, sys.indexes i
    where database_id = @dbid and objectproperty(s.object_id,'IsUserTable') = 1
    and i.object_id = s.object_id
    and i.index_id = s.index_id
    order by (user_seeks + user_scans + user_lookups + user_updates) asc

(3) What is the cost of index maintenance vs. its benefit?

If a table is heavily updated and also has indexes that are rarely used, the cost of maintaining the indexes could exceed the benefits. To compare the cost and benefit, you can use the table valued function sys.dm_db_index_operational_stats as follows:

    --- sys.dm_db_index_operational_stats
    declare @dbid int
    select @dbid = db_id()
    select objectname=object_name(s.object_id), indexname=i.name, i.index_id
        , reads=range_scan_count + singleton_lookup_count
        , 'leaf_writes'=leaf_insert_count+leaf_update_count+ leaf_delete_count
        , 'leaf_page_splits' = leaf_allocation_count
        , 'nonleaf_writes'=nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count
        , 'nonleaf_page_splits' = nonleaf_allocation_count
    from sys.dm_db_index_operational_stats (@dbid,NULL,NULL,NULL) s, sys.indexes i
    where objectproperty(s.object_id,'IsUserTable') = 1
    and i.object_id = s.object_id
    and i.index_id = s.index_id
    order by reads desc, leaf_writes, nonleaf_writes

    --- sys.dm_db_index_usage_stats
    select objectname=object_name(s.object_id), indexname=i.name, i.index_id
        ,reads=user_seeks + user_scans + user_lookups
        ,writes = user_updates
    from sys.dm_db_index_usage_stats s, sys.indexes i
    where objectproperty(s.object_id,'IsUserTable') = 1
    and s.object_id = i.object_id
    and i.index_id = s.index_id
    and s.database_id = @dbid
    order by reads desc
    go

The difference between sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats is as follows. Sys.dm_db_index_usage_stats counts each access as 1, whereas sys.dm_db_index_operational_stats counts depending on the operation, pages or rows.

(4) Do I have hot spots & index contention?

Index contention (e.g. waits for locks) can be seen in sys.dm_db_index_operational_stats. Columns such as row_lock_count, row_lock_wait_count, row_lock_wait_in_ms, page_lock_count, page_lock_wait_count, page_lock_wait_in_ms, page_latch_wait_count, page_latch_wait_in_ms, pageio_latch_wait_count, pageio_latch_wait_in_ms detail lock and latch contention in terms of waits. You can determine the average blocking and lock waits by comparing waits to counts as follows:

    declare @dbid int
    select @dbid = db_id()
    Select dbid=database_id, objectname=object_name(s.object_id)
        , indexname=i.name, i.index_id --, partition_number
        , row_lock_count, row_lock_wait_count
        , [block %]=cast (100.0 * row_lock_wait_count / (1 + row_lock_count) as numeric(15,2))
        , row_lock_wait_in_ms
        , [avg row lock waits in ms]=cast (1.0 * row_lock_wait_in_ms / (1 + row_lock_wait_count) as numeric(15,2))
    from sys.dm_db_index_operational_stats (@dbid, NULL, NULL, NULL) s, sys.indexes i
    where objectproperty(s.object_id,'IsUserTable') = 1
    and i.object_id = s.object_id
    and i.index_id = s.index_id
    order by row_lock_wait_count desc

The following report shows blocks in the [Order Details] table, index OrdersOrder_Details. While blocks occur less than 2 percent of the time, when they do occur, the average block time is 15.7 seconds.

It would be important to track this down using the SQL Profiler Blocked Process Report. You can set the Blocked Process Threshold to 15 using sp_configure 'Blocked Process Threshold',15. Afterwards, you can run a trace to capture blocks over 15 seconds.

The Profiler trace will include the blocked and blocking process. The advantage of tracing for long blocks is the blocked and blocking details can be saved in the trace file and can be analyzed long after the block disappears. Historically, you can see the common causes of blocks. In this case the blocked process is the stored procedure NewCustOrder. The blocking process is the stored procedure UpdCustOrderShippedDate.

The caveat with Profiler Trace of Blocked Process Report is that in the case of stored procedures, you cannot see the actual statement within the stored procedure that is blocked. You do, however, get the stmtstart and stmtend offset that does identify the statement blocked inside the stored procedure NewCustOrder.
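
Setting the blocked process threshold mentioned above might look like the following sketch. It assumes the option is treated as an advanced option on your instance, which means 'show advanced options' has to be enabled first; the value 15 is the number of seconds used in the text.

```sql
-- Make advanced options visible to sp_configure.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Report any block that lasts longer than 15 seconds.
EXEC sp_configure 'blocked process threshold', 15;
RECONFIGURE;
```

With the threshold in place, a trace that includes the Blocked Process Report event will record blocks that exceed it.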
Using the above blocked process report, you could extract the blocked statement out of the NewCustOrder stored procedure by providing the sqlhandle, stmtstart and stmtend as follows:

    declare @sql_handle varbinary(64), @stmtstart int, @stmtend int
    select @sql_handle = 0x3000050005d9f67ea8425301059700000100000000000000
    select @stmtstart = 920, @stmtend = 1064

    -- The offsets are zero-based byte offsets into Unicode text,
    -- hence the division by 2 and the + 1 for SUBSTRING's 1-based positions.
    select substring(qt.text, (s.statement_start_offset/2) + 1,
             ((case when s.statement_end_offset = -1
                    then len(convert(nvarchar(max), qt.text)) * 2
                    else s.statement_end_offset end
               - s.statement_start_offset)/2) + 1) as "blocked statement"
         , s.statement_start_offset
         , s.statement_end_offset
         , batch = qt.text
         , qt.dbid
         , qt.objectid
         , s.execution_count
         , s.total_worker_time
         , s.total_elapsed_time
         , s.total_logical_reads
         , s.total_physical_reads
         , s.total_logical_writes
    from sys.dm_exec_query_stats s
    cross apply sys.dm_exec_sql_text(s.sql_handle) as qt
    where s.sql_handle = @sql_handle
      and s.statement_start_offset = @stmtstart
      and s.statement_end_offset = @stmtend

You can capture the actual blocked statement of a stored procedure in real time (as it is occurring) using the following:

    create proc sp_block_info
    as
    select t1.resource_type as [lock type]
         , db_name(resource_database_id) as [database]
         , t1.resource_associated_entity_id as [blk object]
         , t1.request_mode as [lock req]        -- lock requested
         , t1.request_session_id as [waiter sid] -- spid of waiter
         , t2.wait_duration_ms as [wait time]
         , (select text from sys.dm_exec_requests as r -- get sql for waiter
              cross apply sys.dm_exec_sql_text(r.sql_handle)
             where r.session_id = t1.request_session_id) as waiter_batch
         -- zero-based byte offsets: divide by 2 and add 1 for SUBSTRING
         , (select substring(qt.text, (r.statement_start_offset/2) + 1,
                     ((case when r.statement_end_offset = -1
                            then len(convert(nvarchar(max), qt.text)) * 2
                            else r.statement_end_offset end
                       - r.statement_start_offset)/2) + 1)
              from sys.dm_exec_requests as r
              cross apply sys.dm_exec_sql_text(r.sql_handle) as qt
             where r.session_id = t1.request_session_id) as waiter_stmt -- statement blocked
         , t2.blocking_session_id as [blocker sid] -- spid of blocker
         , (select text from sys.sysprocesses as p -- get sql for blocker
              cross apply sys.dm_exec_sql_text(p.sql_handle)
             where p.spid = t2.blocking_session_id) as blocker_stmt
    from sys.dm_tran_locks as t1,
         sys.dm_os_waiting_tasks as t2
    where t1.lock_owner_address = t2.resource_address
    go

    exec sp_block_info

(5) Could I benefit from more (or less) indexes?

Remembering that indexes involve both a maintenance cost and a read benefit, the overall index cost benefit can be determined by comparing reads and writes. Reading an index allows us to avoid table scans; however, indexes do require maintenance to be kept up-to-date. While it is easy to identify the fringe cases where indexes are not used, and the rarely used cases, in the final analysis index cost benefit is somewhat subjective. The reason is that the number of reads and writes is highly dependent on the workload and frequency. In addition, qualitative factors beyond the number of reads and writes can include a highly important monthly management report or quarterly VP report in which the maintenance cost is of secondary concern.

Writes of all indexes are performed for inserts, but there are no associated reads (unless there are referential constraints). Besides select statements, reads are performed for updates and deletes, and writes are performed if rows qualify. OLTP workloads have lots of small transactions, frequently combining select, insert, update and delete operations. Data warehouse activity is typically separated into batch windows having a high concentration of write activity, followed by an on-line window of read activity.

    SQL Statement   Read   Write
    Select          Yes    No
    Insert          No     Yes, all indexes
    Update          Yes    Yes, if row qualifies
    Delete          Yes    Yes, if row qualifies
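One rough way to put the read/write comparison above into practice is to list the nonclustered indexes that are written at least as often as they are read. This is a sketch only: the "writes >= reads" cutoff is an arbitrary illustration rather than a rule from this article, and the counters cover only the period since they were last reset.

```sql
-- Sketch: nonclustered indexes in the current database whose maintenance
-- (writes) is not outweighed by their use (reads). Indexes that have never
-- been touched have no row in the DMV and show up with 0 reads and 0 writes.
declare @dbid int
select @dbid = db_id()

select objectname = object_name(i.object_id)
     , indexname  = i.name
     , i.index_id
     , reads  = isnull(s.user_seeks + s.user_scans + s.user_lookups, 0)
     , writes = isnull(s.user_updates, 0)
from sys.indexes i
left join sys.dm_db_index_usage_stats s
       on s.object_id   = i.object_id
      and s.index_id    = i.index_id
      and s.database_id = @dbid
where objectproperty(i.object_id, 'IsUserTable') = 1
  and i.index_id > 1   -- skip heaps (0) and clustered indexes (1)
  and isnull(s.user_updates, 0) >= isnull(s.user_seeks + s.user_scans + s.user_lookups, 0)
order by writes desc
```

An index that appears here is a candidate for review, not for automatic removal. It may still back a constraint, or serve the kind of infrequent but important report mentioned above.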
In general, you want to keep indexes to a functional minimum in a high transaction OLTP environment due to high transaction throughput combined with the cost of index maintenance and potential for blocking. In contrast, you pay for index maintenance once during the batch window when updates occur for a data warehouse. Thus, data warehouses tend to have more indexes to benefit their read-intensive on-line users.

In conclusion, Dynamic Management Views (DMVs) are an important new feature of SQL Server 2005. DMVs provide a level of transparency that was not available in SQL Server 2000 and can be used for diagnostics, memory and process tuning, and monitoring. DMVs can be useful in answering practical questions such as index usage, cost benefit of indexes, and index hot spots. Finally, DMVs are queryable with SELECT statements but are not persisted to disk. Thus they reflect changing server state information since the last SQL Server recycle.
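Because the counters start over at each recycle, it can help to know how long they have been accumulating and to keep a copy if history matters. The following is a minimal sketch. It relies on tempdb being recreated at every startup, and dbo.index_usage_history is a hypothetical table name chosen for illustration.

```sql
-- tempdb is recreated each time the instance starts, so its create_date
-- approximates the moment the DMV counters began accumulating.
select counters_since = create_date
from sys.databases
where name = 'tempdb'

-- Assumption: dbo.index_usage_history does not exist yet; SELECT ... INTO creates it.
-- Later captures would use INSERT ... SELECT into the same table instead.
select capture_time = getdate(), s.*
into dbo.index_usage_history
from sys.dm_db_index_usage_stats s
where s.database_id = db_id()
```

Comparing two captures taken some time apart gives the activity for that interval, which is often more useful than the totals since startup.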


