Microsoft Technical Champs for Siebel www.siebelonmicrosoft.com

Technical Note #16: Index Tuning


Last Modified: July 12, 2004
Version 2: For latest version see www.siebelonmicrosoft.com
Area: SQL Server
Siebel Releases: All
Windows Releases: All
SQL Server Releases: 7.0 and 2000

Index Tuning
This TechNote discusses general methods for tuning indexes for better
performance in Siebel applications using SQL Server.

A Quick Review about SQL Server Indexes


There are two kinds of indexes used in SQL Server, clustered and non-clustered.
A clustered index defines the physical order of the data. As such, there can only be
one clustered index per table. With a clustered index, rows in the table are
stored in the sequence of the clustered index key. A table is not required to have a
clustered index. An example of a clustered index is the yellow pages of the phone
book. Entries in the yellow pages are listed by category. All “accountants” will be
listed physically together. So the clustered index would contain the occupation
name column. If you wanted to find all accountants, the search would begin at
“ACCOUNTANT” and end just before “ACCOUNTANU”.
A non-clustered index is an independent index structure, a balanced tree (B-tree) in which
the leaf node has a pointer to the row in the table (data page). An example of a
non-clustered index is the white pages of the phone book. You know the person's
last name and first name. In the white pages, people are listed by name. The
non-clustered index would contain last name and first name and be collated in
that order.
When a table has a clustered index, its non-clustered indexes point to the clustered index key (the cluster key).
Therefore, non-clustered indexes “inherit” the clustered index. You can see this
with DBCC SHOW_STATISTICS. The cluster key is appended to the non-clustered
index.
Hence, by designing a very good clustered index, your non-clustered index can
be more efficient. Specifically with Siebel, the clustered index on base tables is on ROW_ID.
Since every Siebel query contains ROW_ID, the non-clustered index will “inherit”
the ROW_ID column from the clustered index and perform better.
By definition, a table without a clustered index is known as a heap. Heaps can
have non-clustered indexes.
In general, clustered indexes tend to be the most efficient type of index for a
number of reasons. For example, if the query causes data to be scanned, then
there are no extra reads from the index page to the data page. See SQL Server
Books Online for a more detailed explanation. Non-clustered indexes tend to be
more efficient for seek type queries and index covering techniques.
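The difference between the two index types can be sketched with a small demo table. Everything here is hypothetical (the table DEMO_PHONE_BOOK, its columns, and the index names are not part of the Siebel schema), and the sketch assumes a scratch database on SQL Server 7.0 or 2000:

```sql
-- Hypothetical demo table; not part of the Siebel schema
CREATE TABLE DEMO_PHONE_BOOK (
    ROW_ID     varchar(15) NOT NULL,
    LAST_NAME  varchar(50) NOT NULL,
    FST_NAME   varchar(50) NOT NULL,
    OCCUPATION varchar(50) NULL
)
GO

-- A few rows so that statistics are built when the indexes are created
INSERT INTO DEMO_PHONE_BOOK VALUES ('1-A1', 'SMITH', 'ANN', 'ACCOUNTANT')
INSERT INTO DEMO_PHONE_BOOK VALUES ('1-A2', 'SMITH', 'BOB', 'PLUMBER')
INSERT INTO DEMO_PHONE_BOOK VALUES ('1-A3', 'JONES', 'CAROL', 'ACCOUNTANT')
GO

-- Clustered index: the rows themselves are stored in ROW_ID order
CREATE UNIQUE CLUSTERED INDEX DEMO_PHONE_BOOK_P1
    ON DEMO_PHONE_BOOK (ROW_ID)
GO

-- Non-clustered index: a separate structure ordered by name;
-- each leaf entry carries the cluster key (ROW_ID) to locate the row
CREATE NONCLUSTERED INDEX DEMO_PHONE_BOOK_M1
    ON DEMO_PHONE_BOOK (LAST_NAME, FST_NAME)
GO

-- Inspect the statistics of the non-clustered index
DBCC SHOW_STATISTICS (DEMO_PHONE_BOOK, DEMO_PHONE_BOOK_M1)
GO
```

In the DBCC SHOW_STATISTICS output, the density rows for the non-clustered index should end with a column list that includes ROW_ID, which is the appended cluster key described above.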

Siebel's Approach to Indexes


Siebel's approach is to ship indexes for every way the application could
access the data. To validate this, Siebel tests
the product with a dataset that it feels represents an average customer. The
test is an approximation, and performance will vary with each customer's
individual business requirements.
Each Siebel customer receives the same schema of approximately 2300 Tables,
2200 clustered indexes and 10,500 non-clustered indexes. In an actual
implementation at the customer site, there will be Siebel tables with many
unused columns and Siebel tables where none of the columns appear to be used
for the company's implementation. For Siebel Vertical implementations the
number of columns increases greatly and customers may have a very large
number of unused columns.
Working with Siebel applications, you will find that there are many more indexes
than your company needs, and these indexes can impact performance in a
positive as well as a negative way.
It's critical for performance tuning that you determine if the indexes are
appropriate for the Siebel implementation by your business.
Determining which indexes you do and do not need requires an iterative
and methodical approach. A problem is defined. A base line is established. A series of
tests is conducted. Tuning is performed. The test is re-run, and the results are
measured against the base line to see if progress is made or lost.
In the micro view, the problem could be a single query that has poor response
time or that, because of its resource usage, slows down the whole system.
In the macro view of your Siebel system, you need a method for proactively
exercising all the dependencies together. For example, you make one change in
the application or database configuration and the results need to be measured
for implications on other queries or configurations.
The primary vehicle for overall system diagnosis is the stress test. This will be a
suite of scripts that are automated to repeat the same business requirements in
a reproducible manner. This type of stress testing will help you develop a base
line to determine what is acceptable performance, and help you meet your SLA
requirements.
As you tune, you'll find variations in the reasons that the index does or does not
add value for your business. There may be indexes where every column in the
index contains null values. There may be indexes with missing columns. There
may be indexes where there is a poor distribution of the columns in the index
and re-sequencing an existing index can help more.
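For the micro view of a single query, one possible way to record a base line is to capture the I/O and timing figures before any index change and again afterwards. This is a sketch only: the query shown is a placeholder, and the S_CONTACT table and LAST_NAME column are used purely for illustration.

```sql
-- Report logical/physical reads and CPU/elapsed time for each statement
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-- Placeholder query; substitute the statement being tuned
SELECT ROW_ID, LAST_NAME
FROM S_CONTACT
WHERE LAST_NAME LIKE 'SMITH%'
GO

SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO
```

Saving the messages from the run before tuning gives the figures against which the re-run is compared.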
Strategies for Tuning Indexes


The rest of this section discusses different situations that can occur with SQL
Server indexes in the Siebel application and how to detect and correct them.
With any index tuning there is a basic approach to take.
Begin by reviewing the index strategies. Decide what improvements to make, keeping
in mind that a combination of changes may be needed
to improve performance. Add, drop or modify the index in the development environment
and test the change there. When testing is
complete, migrate the change through the testing systems and finally into the production
environment.
Note: a key element to any successful system is rigorous change control. This
will include documentation, a code review, and a source code control
mechanism. Nothing can be worse than slamming a fix into production without
testing it thoroughly. Systems get out of sync and unpredictable performance
results can happen. Two years down the line, during your next upgrade, failure
to document why the change was made will also cause hours of grief.

CAUTION Your business is allowed to remove indexes on the EIM_* tables. Adding or changing a base
table index requires Siebel approval. Consult with Siebel Expert Services when you need to add or change
a base table index. Siebel can detect if base table indexes have changed by running the utility dbchck which
compares the Siebel data dictionary to the SQL Server data dictionary. If changes have been made without
Siebel approval, your warranty may be voided.

Tools
There are a number of tools that help with performance tuning and optimizing.
Profiler: SQL Profiler gives a SQL Server kernel view of the query. It is a
powerful tool, and should be used carefully because it can use a lot of CPU. The
results should be saved to a file. This file can be created as a table and analyzed.
Filter on the SPID or CLIENT PROCESS.
Trace: Level 8 EIM Trace shows network times. It does not display plans, and
does not always show the hints used.
Index Tuning Wizard: The Index Wizard will not suggest better indexes when
it is used with Siebel databases. Siebel builds indexes for nearly every possible
configuration. It does show what indexes are being used. With this information
you can deduce which indexes are not being used.
Query Analyzer: SQL Query Analyzer gives a SQL Server kernel view of the
query and can be run from the command prompt with isqlw.exe.
There are several tools within the SQL Query Analyzer that are useful:
• SET STATISTICS PROFILE shows the plans with their costs.
• SET STATISTICS IO shows only the I/O.
• DBCC DBREINDEX rebuilds indexes. See the section on fragmentation
below for a query that generates DBCC commands.
• DBCC SHOW_STATISTICS shows which indexes contain only one value,
which indicates that the index is probably not being used.
• The Transact-SQL command UPDATE STATISTICS updates information
about the distribution of key values in a table. Use the FULLSCAN option
to use all rows in the table and the SAMPLE option for a sample of rows in
the table.
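As a rough illustration of the commands listed above, the following sketch shows how they might be invoked from Query Analyzer. The table name S_CONTACT and the index name S_CONTACT_U1 are placeholders chosen for the example; substitute names from your own database, and see Books Online for the full syntax.

```sql
-- Table and index names are placeholders; substitute your own

-- Rebuild statistics from every row (most accurate, most expensive)
UPDATE STATISTICS S_CONTACT WITH FULLSCAN
GO

-- Rebuild statistics from a sample of the rows (cheaper, less precise)
UPDATE STATISTICS S_CONTACT WITH SAMPLE 20 PERCENT
GO

-- Show the distribution statistics for one index
DBCC SHOW_STATISTICS (S_CONTACT, S_CONTACT_U1)
GO

-- Rebuild all indexes on the table; a fill factor of 0 keeps
-- the fill factor each index was created with
DBCC DBREINDEX ('S_CONTACT', '', 0)
GO
```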

Tables without Clustered Indexes


Poor query performance can sometimes be traced back to tables without a
clustered index. The following query example returns the tables without
clustered indexes. After running this script, compare the result to data mapping
for your specific application and the ER diagram. If a base table does not have a
clustered index, you will need to consult with Siebel Expert Services.
-- drop the temp table if it is left over from an earlier run
if object_id('tempdb..#all_idx') is not null
drop table #all_idx
go

-- make a temp table with all Siebel tables and indexes
-- ([_] matches a literal underscore; a bare _ is a LIKE wildcard)
select so.name, si.indid
into #all_idx
from sysobjects so, sysindexes si
where
so.name like 'S[_]%' and
so.id = si.id and
so.type = 'U'
go

-- count the Siebel tables that have a clustered index (indid = 1)
select count(*)
from #all_idx
where indid = 1
-- 867 tables with clustered index in this example

-- count all Siebel tables
select count(*)
from sysobjects
where
name like 'S[_]%' and
type = 'U'
-- 1172 tables in this example

select 1172 - 867
-- 305 tables DO NOT have clustered indexes

-- remove everything from the temp table that is not a clustered index
delete from #all_idx where indid <> 1
go

-- this query finds all tables that do NOT have a clustered index
select so.name
from
-- outer join so that tables with and without a match land in one result set
#all_idx as a right outer join sysobjects as so
on a.name = so.name
where
-- just give me tables
so.type = 'U' and
-- just pull out the siebel tables
so.name like 'S[_]%' and
-- just show me the ones without a clustered index. It "is null"
-- because it does not exist in the temp table.
a.name is NULL
order by
-- sort by name
so.name
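A shorter way to get the same list, offered here as an alternative sketch rather than as the method used above, is to ask SQL Server directly through the OBJECTPROPERTY metadata function and its TableHasClustIndex property:

```sql
-- List Siebel user tables that are heaps (no clustered index)
SELECT so.name
FROM sysobjects so
WHERE so.type = 'U'
  AND so.name LIKE 'S[_]%'   -- [_] matches a literal underscore
  AND OBJECTPROPERTY(so.id, 'TableHasClustIndex') = 0
ORDER BY so.name
GO
```

This avoids the temp table, at the cost of not showing the intermediate counts.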

Unique and Non Unique Indexes


When you find a non-clustered index that is not unique and the index is heavily
used, add either ROW_ID, MS_IDENT or GUID column at the end of the index to
make it unique.
SQL Server makes a non-unique clustered index unique internally by adding a hidden
4-byte uniqueifier to duplicate key values, if needed.
Every Siebel query has ROW_ID which is the reason this column is in the
clustered index on all base tables. EIM often uses MIN(ROW_ID) in a subquery
(or subselect). Adding ROW_ID to a non-clustered index will save i/o on the
plan. The following example shows a query with this problem, and how to create
an index that improves performance:
/*
SQL User Name   CPU     Reads    Writes   Duration   Connection ID   SPID   Start Time
SADMIN          91578   137615   54       91983      9472            39     12:01:22.827

UPDATE dbo.S_INSITM1_FN_IF
SET T_INS_ITEM__UNQ = 'Y'
FROM dbo.S_INSITM1_FN_IF T1
WHERE (T_INS_ITEM__EXS = 'Y' AND
ROW_ID =
(SELECT MIN(ROW_ID)
FROM dbo.S_INSITM1_FN_IF T2
WHERE (T2.T_INS_ITEM__EXS = 'Y' AND
T2.T_INS_ITEM__RID = T1.T_INS_ITEM__RID AND
T2.IF_ROW_BATCH_NUM = 12110004 AND
T2.IF_ROW_STAT_NUM = 0 AND
T2.T_INS_ITEM__STA = 0)) AND
IF_ROW_BATCH_NUM = 12110004 AND
IF_ROW_STAT_NUM = 0 AND
T_INS_ITEM__STA = 0)
go

*/

SET STATISTICS IO OFF
GO
SET STATISTICS PROFILE OFF
GO

select T_INS_ITEM__UNQ = 'Y'
FROM dbo.S_INSITM1_FN_IF T1
WHERE (T_INS_ITEM__EXS = 'Y' AND
ROW_ID =
(SELECT MIN(ROW_ID)
FROM dbo.S_INSITM1_FN_IF T2
WHERE (T2.T_INS_ITEM__EXS = 'Y' AND
T2.T_INS_ITEM__RID = T1.T_INS_ITEM__RID AND
T2.IF_ROW_BATCH_NUM = 12110004 AND
T2.IF_ROW_STAT_NUM = 0 AND
T2.T_INS_ITEM__STA = 0)) AND
IF_ROW_BATCH_NUM = 12110004 AND
IF_ROW_STAT_NUM = 0 AND
T_INS_ITEM__STA = 0)
go

sp_help S_INSITM1_FN_IF

/*

S_INSITM1_FN_IF_U1
clustered, unique located on SIEB_DATA
T_INSITEMCO_INSITE, T_INSITEMCO_CONTAC, IF_ROW_STAT_NUM, T_INSITEMCO__STA, MS_IDENT

S_INSITM1_FN_IF_original_U1
non-clustered, unique located on SIEB_IND
IF_ROW_BATCH_NUM, ROW_ID

*/

SELECT COUNT(DISTINCT(T_INS_ITEM__EXS)) FROM S_INSITM1_FN_IF
SELECT COUNT(DISTINCT(T_INS_ITEM__RID)) FROM S_INSITM1_FN_IF
SELECT COUNT(DISTINCT(IF_ROW_BATCH_NUM)) FROM S_INSITM1_FN_IF
SELECT COUNT(DISTINCT(IF_ROW_STAT_NUM)) FROM S_INSITM1_FN_IF
SELECT COUNT(DISTINCT(T_INS_ITEM__STA)) FROM S_INSITM1_FN_IF

DROP INDEX S_INSITM1_FN_IF.S_INSITM1_FN_IF_FEM000

CREATE INDEX S_INSITM1_FN_IF_FEM000 ON
S_INSITM1_FN_IF
(T_INS_ITEM__RID, ROW_ID, T_INS_ITEM__EXS,
IF_ROW_BATCH_NUM, IF_ROW_STAT_NUM,
T_INS_ITEM__STA )
ON SIEB_IND
go

sp_help S_INSITM1_FN_IF

/*

S_INSITM1_FN_IF_U1
clustered, unique located on SIEB_DATA
T_INSITEMCO_INSITE, T_INSITEMCO_CONTAC, IF_ROW_STAT_NUM, T_INSITEMCO__STA, MS_IDENT

S_INSITM1_FN_IF_FEM000
non-clustered, unique located on SIEB_IND
T_INS_ITEM__RID, ROW_ID, T_INS_ITEM__EXS, IF_ROW_BATCH_NUM, IF_ROW_STAT_NUM,
T_INS_ITEM__STA

S_INSITM1_FN_IF_original_U1
non-clustered, unique located on SIEB_IND
IF_ROW_BATCH_NUM, ROW_ID

*/

drop index S_INSITM1_FN_IF.S_INSITM1_FN_IF_U1
go

CREATE unique clustered INDEX S_INSITM1_FN_IF_U1 ON
S_INSITM1_FN_IF
(T_INSITEMCO_INSITE, T_INSITEMCO_CONTAC,
IF_ROW_STAT_NUM, T_INSITEMCO__STA, MS_IDENT )
ON SIEB_DATA
go

drop index S_INSITM1_FN_IF.S_INSITM1_FN_IF_FEM000
go

CREATE unique INDEX S_INSITM1_FN_IF_FEM000 ON
S_INSITM1_FN_IF
(T_INS_ITEM__RID, ROW_ID, T_INS_ITEM__EXS, IF_ROW_BATCH_NUM,
IF_ROW_STAT_NUM, T_INS_ITEM__STA )
ON SIEB_IND
go
Index Covering
Index covering is a term used when an index contains all the columns that a
query references, including those in the SELECT list as well as the
ORDER BY and WHERE clauses. With index covering, the query can
find all columns needed to satisfy the query in the index page and it does not
have to access the data page.
Here is an example of a query that can be improved with index covering:
SET STATISTICS IO ON
GO
SET STATISTICS PROFILE ON
GO

SELECT MIN(A)
FROM TAB_0
WHERE
X='12' AND
Y='M' AND
Z > 58

Select count(1) from TAB_0
-----------
582959
Select count(distinct(X)) "no rows" from TAB_0
no rows
-----------
582938
Select count(distinct(Y)) "no rows" from TAB_0
no rows
-----------
2
Select count(distinct(Z)) "no rows" from TAB_0
no rows
-----------
3413

The current index is:
Create index TAB_0_NC0 on TAB_0(Y,Z)

These queries show that the current index is not selective. You will also see this in DBCC
SHOW_STATISTICS.
Column X has the most distinct values in the query (582,938 in a table with
582,959 rows), but it doesn't appear in the index.
To create an index with index covering:
create index TAB_0_NC0 on TAB_0(X,Z,Y,A)

Column X is first in the index so that SQL Server can seek the information rather
than scanning the index for a range of values and then going to the data page
and retrieving the remainder of the result set.
Column Z is second in the index because it is more selective than Y. Adding
column A to the index is what causes index covering. If the query is frequently
executed, then the query finds all the information needed in the index page and
will not need to access the data page.
Index covering is not always needed. If the query was not frequently executed,
this could be an acceptable index:
create index TAB_0_NC0 on TAB_0(X,Z,Y)
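Because the replacement index reuses the name TAB_0_NC0, the old definition has to be dropped or replaced before the new one can be created. One way to do this, sketched here under the assumption that TAB_0_NC0 already exists on TAB_0, is the DROP_EXISTING option of CREATE INDEX, followed by a re-run of the query to compare logical reads:

```sql
-- Replace the existing TAB_0_NC0 with the covering column list.
-- DROP_EXISTING requires that an index named TAB_0_NC0 already exists.
CREATE INDEX TAB_0_NC0 ON TAB_0 (X, Z, Y, A)
WITH DROP_EXISTING
GO

-- Re-run the query and compare logical reads with the earlier run
SET STATISTICS IO ON
GO
SELECT MIN(A)
FROM TAB_0
WHERE X = '12' AND Y = 'M' AND Z > 58
GO
SET STATISTICS IO OFF
GO
```

If the index covers the query, the plan should read only index pages, which normally shows up as a lower logical read count for TAB_0.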

The High Cost of Null Indexes


With a Siebel application, it is highly likely that you will find indexes with column
values that are all nulls. As stated earlier, this is because Siebel does not know
how the customer will implement all the features. As with any index, these
structures occupy physical space on your disk, and more importantly, can impact
performance.
Indexes help on SELECT statements, but cause penalties on INSERT, UPDATE and
DELETE statements. This is especially important when initially loading data (100%
INSERTs), and has less of a performance impact during normal daily transactions,
where usually 90% or more of the SQL Server workload is SELECTs.
In a mature production environment, the concern for 100% NULL indexes is an
operational concern. Specifically, it’s dead space in the database that has to be
backed up. This means you need to buy more tapes and the backups run longer.
Backups are heavy IO to the disk subsystem, so the longer they run, the more
intrusive they will be to your online operations. As such, it’s important to prune
unused 100% NULL indexes.
The case for keeping 100% NULL indexes can also be made in rare circumstances.
For example, if the following query is being run:
select count(*) from table_x

It could very easily be “cheaper” for the database to scan the 100% NULL non-
clustered index than to table scan all the data pages of the table.
Generally speaking, 100% NULL indexes will probably never be used, but there
are always exceptions to the rule. Set a benchmark before removing the 100%
NULL index, then use SQL Server Profiler to help diagnose long running queries
and see whether the 100% NULL index needs to be put back.
Since many indexes could be 100% NULL, you need to focus on just the large
tables that may yield the best results.
To find the tables with the most rows, run the following query:
-- indid 0 (heap) or 1 (clustered index) gives one row per table
Select top 100 object_name(id), rows
From sysindexes
Where object_name(id) like 'S[_]%'
and indid in (0, 1)
Order by rows desc
Go

This query will show you the top 100 Siebel tables with the most rows.
To find which indexes are 100% NULL:
Run UPDATE STATISTICS on the table with the FULLSCAN option.
Run SP_HELPINDEX <table_name> to list the indexes on the table.
Look at the DBCC SHOW_STATISTICS results for each index.
If the density that DBCC SHOW_STATISTICS reports for a column is “1.0”, every row has
the same value in that column. For an unused Siebel column, this typically means that
100% of the values in that column are NULL.

Note: See Books Online for specific syntax on the DBCC and SP_ commands

To confirm this, run:
Select count(*) from table_x
Go
Select count(*) from table_x where column_y IS NULL
Go

If both results are the same, then you know that every value in that column is
NULL.
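When a table has many indexes, writing these checks by hand is tedious. The following sketch generates one COUNT statement per index key column from the system tables. It is an illustration under assumptions, not a tested Siebel utility: S_CONTACT is a placeholder table name, and the query relies on the sysindexes, sysindexkeys and syscolumns system tables of SQL Server 7.0 and 2000.

```sql
-- Generate one COUNT statement per key column of every index on a table.
-- COUNT(column) ignores NULLs, so a result of 0 means the column is all NULL.
-- 'S_CONTACT' is a placeholder table name.
SELECT 'select ''' + si.name + '.' + sc.name + ''' as index_column, count('
       + sc.name + ') as non_null_rows from ' + OBJECT_NAME(si.id)
FROM sysindexes si, sysindexkeys sik, syscolumns sc
WHERE si.id = OBJECT_ID('S_CONTACT')
  AND si.indid BETWEEN 1 AND 254                         -- skip heap and text/image rows
  AND INDEXPROPERTY(si.id, si.name, 'IsStatistics') = 0  -- skip statistics-only entries
  AND sik.id = si.id
  AND sik.indid = si.indid
  AND sc.id = sik.id
  AND sc.colid = sik.colid
ORDER BY si.name, sik.keyno
GO
```

Run the generated statements in a second step. An index whose key columns all report 0 non-null rows is a candidate 100% NULL index.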

Find Unnecessary Indexes


The Index Wizard can help you determine which indexes are not being used. The
Index Wizard does not tell you this information directly. It can show you which
indexes are being used (and may take a long time to complete), and you can
then deduce which indexes are not being used.
The Index Wizard will not suggest better indexes when it is used with Siebel
databases because Siebel builds indexes for nearly every possible configuration.

Remove Unnecessary Indexes


Working with Siebel applications, you will find that there are many more indexes
than your company needs, and that these indexes are impacting performance.
Siebel sells to a wide variety of businesses, and creates many indexes to
compensate for all business scenarios.
To identify these indexes, run the trace file through the Index Wizard and
review the Siebel hints in the Siebel EIM trace file.
What you are looking for are indexes where all the columns are nulls. These
indexes can be dropped.
The following query generates a list of all the statistics on Siebel indexes. Look
for indexes with poor distribution, 100% NULL for example.
-- DBCC SHOW_STATISTICS (authors, UPKCL_auidind)
-- GO

dbcc show_statistics(s_fn_need, 2)

-- make the statement to show distribution...
-- ([_] matches a literal underscore; a bare _ is a LIKE wildcard)

select 'dbcc show_statistics ('+object_name(id)+','+name+')'+char(13)+'go'+char(13)
from sysindexes si
where
object_name(si.id) like 'S[_]%' and
rows > 2000
-- order by object_name(si.id)
order by rows
go

Remove Indexes that are never used


It is likely that there are also indexes with columns of data, but the index itself is
never used, and can be dropped.
These indexes will be harder to find. SQL Profiler will not find these indexes. You
may need additional information about the Siebel application and how your
company uses the Siebel application. You may need advice from the business
experts on the project and Siebel experts.

Add or Change Indexes to Eliminate Table Scans


When tuning queries in detail you should always identify queries that are doing
table scans. The performance solution for this problem is to either improve the
index used by the query or add a new index. When a query performs a table
scan, it is often because no index is selective enough for the query, for example
because the indexed columns contain many duplicates. Duplicates are bad for
performance. Queries will perform better against a table that has at least one
column that provides uniqueness. You can add uniqueness by adding Identity,
GUID, or ROW_ID to a table.
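One simple way to check whether a particular statement is doing a table scan is to look at its estimated plan with SET SHOWPLAN_TEXT. The query below is a placeholder, with S_CONTACT and LAST_NAME used only for illustration:

```sql
-- Show the estimated plan instead of executing the statements that follow
SET SHOWPLAN_TEXT ON
GO

-- Placeholder query; substitute the statement being tuned
SELECT ROW_ID
FROM S_CONTACT
WHERE LAST_NAME = 'SMITH'
GO

SET SHOWPLAN_TEXT OFF
GO
```

A "Table Scan" operator in the output (or a "Clustered Index Scan" over the whole table) marks the statement as a candidate for a new or improved index, whereas an "Index Seek" indicates that an index is already being used to locate the rows.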

Supplemental Reading
TechNote 6 - Index Creation Performance During Data Loading
SQL Server Books Online