
Top 10 Strategies for Oracle Database Performance (Part 1)

by Guy Harrison

Everybody loves Top 10 lists, but for a database as complex as Oracle it's hard to distill performance tuning best practices into a single book, let alone a list of 10 tips. Nevertheless, in this series of ToadWorld articles I'll try my best to present the ten performance tuning concepts that I consider most important when optimizing Oracle database performance. At the risk of giving away the ending, here are the top 10 ideas that I'll cover in this series:

1. Adopt a methodical and empirical approach
2. Design performance into your application architecture and code
3. Index wisely
4. Know your tools
5. Optimize the optimizer
6. Tune SQL and PL/SQL
7. Relieve database contention points
8. Optimize memory to reduce IO
9. Tune IO last but TUNE it!
10. Exploit and optimize RAC

The structure of this series will roughly follow the structure of my new book Oracle Performance Survival Guide, which includes more detail on each of the concepts. In this first installment we'll focus on the first three ideas.

Strategy 1: Adopt a methodical and empirical approach

The amazing technical progress achieved by our civilization has arisen primarily from the adoption of the scientific method and its focus on empirical, data-driven research. Database performance tuning doesn't have to be as complex as a physics experiment, but it gains immensely from a methodical and empirical approach. Having a method helps you approach tuning systematically instead of randomly trying whatever you think of first. A couple of database tuning methods have proven effective. The most notable of these are:

- Yet Another Performance Profiling (YAPP) methodology. This methodology was popularized by Anjo Kolk, who was instrumental (no pun intended) in introducing the wait event interface to the Oracle database. YAPP uses the wait event interface to highlight the areas of the database in most need of tuning.
- Method-R has been promoted by Cary Millsap, another pioneer of Oracle performance. Method-R uses YAPP techniques but emphasizes focus on the specific database transactions that are of most interest.
- Tuning by Layers was proposed by Steve Adams. It's compatible with both YAPP and Method-R, but tries to separate wait time cause from wait time effect by considering the architecture of the Oracle software stack.

I particularly like the Tuning by Layers methodology because it leverages our understanding of Oracle architecture (it has a strong theoretical underpinning) and it helps focus on the causes rather than the symptoms of poor performance.
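As a rough sketch of the YAPP idea, the wait event interface can be queried directly. Assuming you have SELECT access to the dynamic performance views, something like the following ranks where the database has spent its time since startup (this is a starting point only, not a substitute for the methodologies above):

```sql
-- YAPP-style starting point: which wait events, plus CPU time,
-- account for the most database time since instance startup?
SELECT event AS time_consumer,
       time_waited_micro / 1000000 AS seconds
  FROM v$system_event
 WHERE wait_class <> 'Idle'
UNION ALL
SELECT stat_name,
       value / 1000000
  FROM v$sys_time_model
 WHERE stat_name = 'DB CPU'
 ORDER BY seconds DESC;
```

The categories at the top of this list suggest which layer of the stack deserves attention first.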

The stages of tuning by layers are dictated by the reality of how applications, databases and operating systems interact. At a very high level, database processing occurs in layers as follows:

1. Applications send requests to the database in the form of SQL statements (including PL/SQL requests). The database responds to these requests with return codes and/or result sets.
2. To deal with an application request, the database must parse the SQL and perform various overhead operations (security, scheduling, transaction management) before finally executing the SQL. These operations use operating system resources (CPU and memory) and may be subject to contention between concurrently executing database sessions.
3. Eventually, the database request will need to process (create, read or change) some of the data in the database. The exact amount of data that will need to be processed can vary depending on the database design (indexing, for instance) and the application (wording of the SQL, for instance).
4. Some of the required data will be in memory. The chance that a block will be in memory is determined mainly by the frequency with which the data is requested and the amount of memory available to cache the data. When we access database data in memory, it's called a logical IO.
5. If the block is not in memory, it must be accessed from disk, resulting in real physical IO. Physical IO is by far the most expensive of all operations, and consequently the database goes to a lot of effort to avoid performing unnecessary IO operations. However, some disk activity is inevitable.




Activity in each of these layers influences the demand placed on the subsequent layer. For instance, if a SQL statement is submitted that somehow fails to exploit an index, it will require an excessive number of logical reads, which in turn will increase contention and eventually generate a lot of physical IO. It's tempting, when you see a lot of IO or contention, to deal with the symptom directly by tuning the disk layout. However, if you sequence your tuning efforts so as to work through the layers in order, you have a much better chance of fixing root causes and relieving pressure on the lower layers.

Figure 1: The layers of the Oracle software stack.

Here's the tuning by layers approach in a nutshell: problems in one database layer can be caused or cured by configuration in a higher layer. The logical steps in Oracle tuning are therefore:

1. Reduce application demand to its logical minimum by tuning SQL and PL/SQL, and optimizing physical design (partitioning, indexing, etc.).
2. Maximize concurrency by minimizing contention for locks, latches, buffers and other resources in the Oracle code layer.
3. Having normalized logical IO demand by the preceding steps, minimize the resulting physical IO by optimizing Oracle memory.
4. Now that the physical IO demand is realistic, configure the IO subsystem to meet that demand by providing adequate IO bandwidth and evenly distributing the resulting load.
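To see how demand actually flows down the layers, you can compare instance-wide logical and physical IO. A hedged sketch against v$sysstat (statistic names as they appear in recent Oracle versions):

```sql
-- Compare logical reads (buffer gets) with physical reads to see
-- how much of the application's IO demand is absorbed by memory
-- before it reaches the disk layer.
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('session logical reads',
                'physical reads',
                'physical writes');
```

A large gap between logical and physical reads indicates the buffer cache is absorbing most of the demand; a narrow gap suggests either insufficient memory or an application generating more logical IO than it should.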

Strategy 2: Design performance into your application architecture and code

It's a sad fact that we spend more time dealing with performance after development than during design. Yet it's usually the architecture and design of an application that limits its ultimate performance potential. High performance database application design is a big topic, so I'll only list some of the major considerations here. For more information, refer to chapters 4-6 of Oracle High Performance Tuning, or check out my Oracle Performance by Design article.

The Data Model

The performance of any database application is fundamentally constrained by its data model. The data model, more than any other factor, determines how much work must be undertaken to satisfy a database request. Furthermore, the data model is the hardest aspect of an application to change once deployed. Even small changes to the data model typically have ramifications through all layers of the application, and there can often be significant downtime when migrating data from the old model to the new model. Data modeling is an enormous topic, but here are some general principles:

- Start with a normalized data model. Normalization involves eliminating redundancy and ensuring that all data is correctly, completely and unambiguously represented.
- Use varying length character strings (VARCHAR2) in preference to fixed length (CHAR) strings. Varying length strings use less storage, resulting in less table scan IO, unless the data is truly fixed-length.
- For character data longer than 4000 bytes, choose one of the modern LOB types, not the legacy LONG data type. If you want the data to be stored outside of the database (in the original files, for instance), then use the BFILE type. If the data contains only text, use a CLOB; if it is binary, use the BLOB type.
- Allowing columns to be NULL can have significant performance implications. NULLs aren't usually included in indexes, so don't make a column NULL if you might want to perform an indexed search to find the NULL values. On the other hand, NULLs take up less storage than a default value, and this can result in smaller tables which are faster to scan.
- Consider the mapping of super-types and subtypes; an entity with two subtypes can be implemented in one, two or three tables: the correct decision will depend on the SQL you anticipate.
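These data type guidelines might translate into DDL along the following lines. The table and column names here are invented purely for illustration:

```sql
-- Illustrative only: prefer VARCHAR2 over CHAR, and modern LOB types
-- over the legacy LONG type. NOT NULL is declared where indexed
-- searches on the column are anticipated.
CREATE TABLE customer_notes (
   note_id      NUMBER        NOT NULL,
   author_name  VARCHAR2(100) NOT NULL,  -- varying length, not CHAR(100)
   note_text    CLOB,                    -- text longer than 4000 bytes
   scanned_doc  BLOB,                    -- binary data
   source_file  BFILE                    -- data kept outside the database
);
```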

Figure 2: An entity with subtypes can be implemented as one, two or three tables.

Denormalization

The normalized data model is a good starting point, but we often want to introduce redundant, repeating or otherwise non-normalized structures into the physical model to get the best performance. For instance, we might:

- Replicate columns from one table in another to avoid joins.
- Create summary tables to avoid expensive aggregate queries, possibly using materialized views.
- Vertically partition a table so that long, infrequently accessed columns are stored in a separate table.
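A summary table maintained as a materialized view might look like this; the example uses the SH sample schema, and the view name and refresh settings are illustrative:

```sql
-- Hypothetical summary table that avoids repeated aggregate queries.
-- Oracle maintains it on refresh, rather than application code
-- maintaining the redundant data by hand.
CREATE MATERIALIZED VIEW sales_summary
   BUILD IMMEDIATE
   REFRESH COMPLETE ON DEMAND
AS SELECT cust_id,
          SUM(amount_sold) AS total_sold
     FROM sh.sales
    GROUP BY cust_id;
```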

Remember, denormalization introduces the risk of data inconsistency, and the overhead of maintaining denormalized data can slow down transaction processing. Using triggers to maintain denormalized data is a good idea since it centralizes the logic.

Partitioning

Oracle's partitioning option requires separate licensing, but offers many advantages:

- Some queries may be able to work on a subset of partitions. This partition elimination can reduce IO overhead.
- Some parallel operations, especially parallel DML operations, can be significantly faster when partitioning is available.
- Purging old data can sometimes be achieved by quickly dropping a partition instead of laboriously deleting thousands or millions of rows.
- Some forms of contention (hot blocks and latches) can be reduced by splitting up the table across the multiple segments of a partitioned object.
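The purge-by-drop pattern, for instance, relies on range partitioning by date. A sketch, with hypothetical table and partition names (and remember that partitioning is a separately licensed option):

```sql
-- Hypothetical range-partitioned table: old data can be purged by
-- dropping a partition rather than deleting rows one by one.
CREATE TABLE sales_history (
   sale_date  DATE,
   cust_id    NUMBER,
   amount     NUMBER
)
PARTITION BY RANGE (sale_date) (
   PARTITION p2008 VALUES LESS THAN (DATE '2009-01-01'),
   PARTITION p2009 VALUES LESS THAN (DATE '2010-01-01')
);

-- Purging the oldest year becomes a quick dictionary operation:
ALTER TABLE sales_history DROP PARTITION p2008;
```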

Application Design

The way you structure your application code can have a big impact as well:

- The most optimized SQL statement is the one you never send. Reduce the amount of SQL you send to the database by caching frequently used data items in memory.
- Reduce parsing by using bind variables.
- Reduce network round trips by using array fetch, array insert and stored procedures where appropriate.
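The bind variable point can be illustrated in PL/SQL. A statement with a hard-coded literal must be re-parsed for every distinct value, whereas a PL/SQL variable is automatically passed as a bind variable, so the statement is parsed once and reused (the query below against the SH sample schema is illustrative):

```sql
-- Avoid: a new literal value means a new statement to parse, e.g.
--   SELECT COUNT(*) FROM sh.customers WHERE cust_last_name = 'Bishop';

-- Prefer: the PL/SQL variable becomes a bind variable, so the same
-- parsed statement is reused regardless of the value supplied.
DECLARE
   l_last_name sh.customers.cust_last_name%TYPE := 'Bishop';
   l_count     NUMBER;
BEGIN
   SELECT COUNT(*)
     INTO l_count
     FROM sh.customers
    WHERE cust_last_name = l_last_name;
END;
/
```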

Strategy 3: Index wisely

An index is an object with its own unique storage that provides a fast access path into a table. A cluster is a means of organizing table data so as to optimize certain access paths. Indexes and clusters exist primarily to improve performance, so getting the best performance from your database requires that you make sensible indexing and clustering decisions. Oracle provides a wide variety of indexing and clustering mechanisms, and each option has merit in specific circumstances, but the three most widely applicable options are the default B*-Tree index, the bitmap index and the hash cluster. In general, the B*-Tree index is the most flexible type and provides good performance across a wide range of application types. However, you might wish to consider alternatives in the following circumstances:

- Hash clusters can improve access for exact key lookups, though they cannot enhance range queries and require careful sizing to prevent degradation as they grow. Hash clusters are also resistant to the latch contention that can be common on busy B*-Tree indexes.
- Bitmap indexes are useful to optimize queries in which multiple columns of low cardinality are queried in combination. Unlike B*-Tree indexes, multiple bitmap indexes can be efficiently merged, but they can also increase lock contention.
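In DDL, the two alternatives look roughly like this; the names, SIZE and HASHKEYS values are illustrative, and a hash cluster must be sized from your own key counts and row sizes:

```sql
-- Hash cluster: an exact-key lookup goes straight to the hashed
-- block, but SIZE and HASHKEYS must be estimated up front and
-- cannot easily be changed later. Tables are then created IN the
-- cluster.
CREATE CLUSTER cust_cluster (cust_id NUMBER)
   SIZE 512 HASHKEYS 100000;

-- Bitmap index: suited to a low-cardinality column queried in
-- combination with other such columns; multiple bitmap indexes
-- on the same table can be merged efficiently.
CREATE BITMAP INDEX cust_gender_bix
   ON customers (cust_gender);
```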

Index Design

Index design often involves constructing the best set of concatenated indexes. A concatenated index is simply an index comprising more than one column. Concatenated indexes are more effective than merging multiple single-column indexes, and a concatenated index that contains all of the columns in the WHERE clause will typically be the best way to optimize that WHERE clause. If a concatenated index could only be used when all of its keys appeared in the WHERE clause, then concatenated indexes would probably be of pretty limited use. Luckily, a concatenated index can be used effectively providing any of the initial or leading columns are used. The more columns in the concatenated index, the more effective it is likely to be. You can even add columns from the SELECT list so that the query can be resolved by the index alone. Figure 3 shows how the IO is reduced as columns are added to the concatenated index for a query like this:

SELECT cust_id
  FROM sh.customers c
 WHERE cust_first_name = 'Connor'
   AND cust_last_name = 'Bishop'
   AND cust_year_of_birth = 1976;

Figure 3: The effect of adding relevant columns to a concatenated index.

Remember, every index adds overhead to DML operations, so only add an index if you are sure that it will have a positive effect on performance. Figure 4 shows how DML performance degrades as you add more indexes.
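For the query above, an index covering all three WHERE-clause columns plus the selected column lets Oracle resolve the query from the index alone, without visiting the table (the index name here is invented):

```sql
-- Covers both the WHERE clause and the SELECT list, so the query in
-- the text can be satisfied entirely from the index.
CREATE INDEX cust_name_dob_ix
   ON sh.customers (cust_last_name, cust_first_name,
                    cust_year_of_birth, cust_id);
```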

Figure 4: The effect of adding indexes on the performance of a 1,000 row delete.