Document Control
Amendment Record
Version 0.1 Date 09-Mar-2011 Status Draft Description of Revision Shailendra Hirlekar
Abbreviations
Term: Definition
AWR: Automatic Workload Repository
FTS: Full Table Scan
PGA: Program Global Area
SGA: System Global Area
AMM: Automatic Memory Management
0.2
Page 2 of 18
1.2 Scope
This document describes how to use AWR reports to:
1. Tune the physical database instance
2. Identify SQLs which are performing poorly
This section provides information about the database:
1. Name of the database
2. Database ID
3. Instance Name and Instance Number
4. Oracle Database Server release version
5. Whether the database is part of RAC
6. Name of the machine hosting the database instance
Snap Time: Begin Snap 31-Jan-11 17:00:07, End Snap 31-Jan-11 18:00:04
Elapsed: 59.96 (mins)
DB Time: 820.16 (mins)
Before we go through this table, let us see how AWR reports are generated.
2.1.1 How AWR reports are generated?
Starting from Oracle 10g, the database server takes a snapshot of the database every 60 minutes by default. A snapshot gathers database health information at that point in time. This information is stored in the database and survives database restarts. Each snapshot is identified by a snapshot ID. Generating an AWR report requires 2 snapshot IDs: the later snapshot is compared with the earlier one, and the report is generated from that comparison.
The above table provides the following information:
1. Begin Snapshot ID and End Snapshot ID, along with date and time details for each snapshot.
2. The Sessions column indicates the total no. of database sessions at the time the snapshot was taken.
3. The Elapsed Time column indicates the duration in minutes between the 2 snapshots. In this example, the duration is approx. 60 mins.
4. The DB Time column represents the total time spent by foreground processes. It includes CPU time, IO time and wait time, and excludes idle wait time and time taken by background processes.
(DB Time) / (Elapsed Time) gives the average no. of active database sessions during that period. In the above scenario this is 820.16 (DB Time) / 59.96 (Elapsed Time), which is approx. 13.68 sessions.
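The DB Time / Elapsed Time calculation can be sketched in a few lines of Python. This is only an illustration using the two figures from the snapshot table; the function name average_active_sessions is a hypothetical helper and not part of any Oracle tool.

```python
def average_active_sessions(db_time_mins: float, elapsed_mins: float) -> float:
    """Average number of active sessions = DB Time / Elapsed Time."""
    if elapsed_mins <= 0:
        raise ValueError("elapsed time must be positive")
    return db_time_mins / elapsed_mins


# Figures taken from the snapshot table in this example.
aas = average_active_sessions(db_time_mins=820.16, elapsed_mins=59.96)
print(f"Average active sessions: {aas:.2f}")
```

Both inputs must use the same unit (minutes here); the result is a plain ratio, so it can be compared directly with the number of CPUs on the server.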
This section gives size details for the following memory components:
1. Buffer Cache
2. Shared Pool
3. Standard Block Size
4. Log Buffer
From Oracle 10g onwards, the database server can manage memory automatically: Automatic Shared Memory Management resizes the SGA components, and from Oracle 11g Automatic Memory Management (AMM) covers both PGA and SGA. Based on load, the database server keeps allocating or deallocating memory assigned to the different components. For this reason, we can observe different sizes for Buffer Cache and Shared Pool at the beginning and at the end of the AWR snapshot period.
This section provides information on a Per Second and Per Transaction basis. For example, it shows:
1. How many Hard Parses are happening
2. How much Redo is generated
3. How many Physical Reads are happening
4. How many Sorts are happening
This information is proportionate to the amount of load experienced by the database server during that period.
2.3.1 How load profile information is helpful during load testing?
Load testing for an application release is done to assess the performance of the release. The load test is first run on the baseline version and then on the release version. It is recommended to generate AWR reports for both the baseline and the release load test. To assess the performance of the release test against the baseline test, we can compare the load profile section of the AWR report for both tests. This comparison gives a rough idea of the release performance. If there are significant differences between the baseline and release load profile figures, they can give pointers for investigation in case the release load test performs badly.
2.3.2 Is it ok to keep the snapshot frequency to default i.e. every 60 minutes?
To analyse performance issues in detail, it is recommended to generate AWR reports with an interval of less than 60 minutes. For example, one can take snapshots every 10 minutes only during the load test window, and later change the frequency back to the default. The interval can be changed with the DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS procedure.
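The baseline-versus-release comparison of load profile figures described in 2.3.1 could be automated along the following lines. This is a rough sketch: the metric values are hypothetical and not taken from a real report, the 20% threshold is an arbitrary assumption, and compare_load_profiles is an illustrative helper name.

```python
def compare_load_profiles(baseline: dict, release: dict,
                          threshold_pct: float = 20.0) -> list:
    """Return (metric, baseline, release, change %) for every metric whose
    per-second value changed by more than threshold_pct."""
    flagged = []
    for metric, base_value in baseline.items():
        if metric not in release or base_value == 0:
            continue  # a relative change cannot be computed
        change_pct = (release[metric] - base_value) / base_value * 100.0
        if abs(change_pct) > threshold_pct:
            flagged.append((metric, base_value, release[metric],
                            round(change_pct, 1)))
    return flagged


# Hypothetical per-second load profile figures (assumed for illustration).
baseline = {"Hard parses": 2.0, "Redo size": 150000.0,
            "Physical reads": 900.0, "Sorts": 40.0}
release = {"Hard parses": 9.0, "Redo size": 155000.0,
           "Physical reads": 2100.0, "Sorts": 42.0}

for row in compare_load_profiles(baseline, release):
    print(row)
```

Metrics that move together with the load itself (for example Redo size) are expected to change when the test volume changes, so a flagged metric is a pointer for investigation rather than proof of a regression.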
This section provides information about hit ratios for different memory components. These ratios tell how often a particular piece of data is found in the respective memory structure. Details about the hit ratios are given below.
Buffer Hit %: Shows the % of times a particular block was found in the buffer cache instead of being read from disk.
Buffer Nowait %: Indicates the % of times data buffers were accessed directly without any wait time.
Library Hit %: Shows the % of times SQL and PL/SQL were found in the shared pool.
In-Memory Sort %: Shows the % of sorting operations that happened in memory rather than on disk (temporary tablespace).
Soft Parse %: Shows the % of parses for which the SQL was already in the shared pool and could be reused.
Latch Hit %: Shows the % of times latches were acquired without having to wait.
As per the Oracle AWR report the target for these ratios is 100%. In reality this is not always possible, so a ratio above 80% is generally considered healthy.
% SQL with executions>1: Shows the % of SQLs executed more than once. This should be very close to 100.
% Memory for SQL w/exec>1: Of the memory space allocated to cursors, shows the % used by cursors executed more than once. A ratio above 80% is generally considered healthy.
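The 80% rule of thumb above can be applied mechanically once the ratios are copied out of the report. The sketch below assumes the ratios have been typed into a dictionary by hand; the sample values are hypothetical and ratios_below_target is an illustrative helper name.

```python
def ratios_below_target(ratios: dict, healthy_pct: float = 80.0) -> dict:
    """Return only those efficiency ratios that fall below healthy_pct."""
    return {name: pct for name, pct in ratios.items() if pct < healthy_pct}


# Hypothetical values, assumed for illustration only.
sample_ratios = {
    "Buffer Hit %": 99.2,
    "Buffer Nowait %": 99.9,
    "Library Hit %": 91.4,
    "In-Memory Sort %": 100.0,
    "Soft Parse %": 72.5,
    "Latch Hit %": 99.8,
}

for name, pct in ratios_below_target(sample_ratios).items():
    print(f"{name}: {pct} is below the 80% guideline")
```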
This section shows the TOP 5 wait events the processes were waiting on during the snapshot time period. These events are helpful during analysis of any database related performance bottlenecks. The Wait Class column helps in classifying whether the issue is related to the application or to the infrastructure.
Analysing Oracle AWR reports
The Waits column provides the no. of times the wait happened. The Time(s) column provides the total time in seconds spent waiting on the event.
Important wait events and their causes are explained in detail below.
2.6.1 db file scattered read:
This event indicates waits due to full table scans or index fast full scans. To reduce this event, identify all the tables on which FTS is happening and create proper indexes so that Oracle does index scans instead of FTS. The index scan helps in reducing the no. of IO operations. To get an idea about the tables on which FTS is happening, refer to the Segment Statistics -> Segments By Physical Reads section of the AWR report. This section lists both the tables and the indexes on which physical reads are happening. Please note that physical reads do not necessarily mean FTS, only a possibility of FTS.
2.6.2 db file sequential read:
This event indicates that an index scan is happening while reading data from a table. A high no. of such events may be caused by unselective indexes, i.e. the Oracle optimizer is not selecting the proper index from the set of available indexes. This results in extra IO activity and contributes to delay in SQL execution. A high no. is, however, also possible for a properly tuned application with high transaction activity.
2.6.3 buffer busy waits:
This event indicates that a particular block is being used by more than one process at the same time. While the first process is reading the block, the other processes go into a wait because the block is in unshared mode. A typical scenario for this event is a batch process which continuously polls the database by executing a particular SQL repeatedly, with more than one parallel instance of the process running. All the instances of the process try to access the same memory blocks because the SQL they execute is the same. This is one of the situations in which we experience this event.
2.6.4 enq: TX - row lock contention:
Oracle maintains data consistency with the help of a locking mechanism. When a particular row is being modified by a process, through an Update, Delete or Insert operation, Oracle tries to acquire a lock on that row. The process can modify the row only once it has acquired the lock; otherwise it waits for the lock. This wait situation triggers this event. The lock is released when the process holding it issues a COMMIT or ROLLBACK. Once the lock is released, the processes waiting on this event can acquire the lock on the row and perform their DML operation.
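For quick triage, the four wait events described above can be kept in a small lookup table. The sketch below only restates the causes given in this document; the dictionary and the explain_wait_event helper are illustrative names, and the list is not a complete catalogue of wait events.

```python
# Event name (lower case) -> likely cause, as summarised in this document.
WAIT_EVENT_HINTS = {
    "db file scattered read":
        "full table scans or index fast full scans; check Segments by "
        "Physical Reads and consider proper indexes",
    "db file sequential read":
        "index scans; a high count may point to unselective indexes",
    "buffer busy waits":
        "several processes accessing the same block at the same time, "
        "for example parallel pollers running the same SQL",
    "enq: tx - row lock contention":
        "a process waiting for a row lock held by another uncommitted "
        "transaction",
}


def explain_wait_event(event_name: str) -> str:
    """Look up a hint for a wait event; names are matched case-insensitively."""
    key = event_name.strip().lower()
    return WAIT_EVENT_HINTS.get(key, "no hint recorded for this event")


print(explain_wait_event("enq: TX - row lock contention"))
```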
The important statistic here is DB Time. It represents the total time spent in database calls and is calculated by aggregating the CPU time and wait time of all sessions not waiting on an idle event (non-idle user sessions). Since this is a cumulative time for all non-idle sessions, it can exceed the actual wall clock time. The objective of tuning an Oracle database is to reduce the amount of time users spend performing some action on the database. This time covers foreground sessions only, not background sessions.
In the above example, DB Time is 49,209.52 seconds. 91% of this time is spent on sql execute elapsed time, i.e. on SQL execution.
DB CPU represents the time spent on CPU by foreground user processes. It does not include time spent waiting for CPU. DB CPU contributes 22% of total DB Time. An important thing to note is that the actual wall clock time is around 3600 seconds (the difference between the 2 snapshots) but the DB CPU shown here is 10,932 seconds. If the server machine (on which the database server is running) has more than 1 CPU, DB CPU can be greater than the wall clock time. In this example the database server is hosted on a machine with 8 CPUs, so up to 8 seconds of CPU time are available in every wall clock second. Hence a DB CPU of 10,932 seconds means 10,932 (DB CPU) / 8 (CPUs) = approx. 1366 wall clock seconds per CPU.
Parse time elapsed and hard parse elapsed time account for around 17% and 15% of total DB Time respectively. Parse time elapsed represents the time spent on syntax and semantic checks, whereas hard parse elapsed time represents the time spent on syntax and semantic checks PLUS the time spent optimizing the SQL and generating the optimizer plan.
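The DB CPU arithmetic above can be checked with a short Python sketch. The figures are the ones quoted in this example (the 3600 second elapsed time is the approximate snapshot interval); the utilisation figure is a derived value added here for illustration, and db_cpu_summary is a hypothetical helper name.

```python
def db_cpu_summary(db_time_s: float, db_cpu_s: float,
                   cpu_count: int, elapsed_s: float) -> dict:
    """Relate DB CPU to DB Time and to the CPU capacity of the server."""
    return {
        # Share of DB Time that was spent on CPU.
        "cpu_pct_of_db_time": db_cpu_s / db_time_s * 100.0,
        # DB CPU spread evenly over all CPUs, in wall clock seconds.
        "cpu_seconds_per_cpu": db_cpu_s / cpu_count,
        # Share of the total available CPU time used by foreground sessions.
        "cpu_utilisation_pct": db_cpu_s / (cpu_count * elapsed_s) * 100.0,
    }


summary = db_cpu_summary(db_time_s=49209.52, db_cpu_s=10932.0,
                         cpu_count=8, elapsed_s=3600.0)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Dividing by the CPU count assumes the load was spread evenly across the CPUs, which is a simplification.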
The wait classes are sorted in descending order of the Total Wait Time (s) column. There are over 800 distinct wait events, which Oracle has grouped into 12 wait classes. These wait classes are further divided into 2 categories, Administrative Wait Class and Application Wait Class. The wait classes give overall information about whether the waits are happening for application or for system events. In the example above, the first 2 rows show that total wait time is highest for the Concurrency and User I/O wait classes. Though we do not have much control over reducing concurrency, we can aim at reducing User I/O. High User I/O means:
1. From the pool of available indexes, proper indexes are not being used
2. FTS is happening on big tables with millions of rows
The wait events are sorted on the Total Wait Time (s) column in descending order. The idle events are listed at the end. The first 10 to 15 events should be looked into, because the rest of the events are idle events and can be ignored. These events are related to foreground processes.
The wait events are sorted on the Total Wait Time (s) column in descending order. The idle events are listed at the end. The first 10 to 15 events should be looked into, because the rest of the events are idle events and can be ignored. These events are related to background processes.
Elapsed Time (s)  CPU Time (s)  % Total DB Time  SQL Id         SQL Text
7,678             503           15.60            93pczysczy6ut
1,202             97            2.44             7pusny3989p7b  declare XXX_XXXX XXX_XXX...
905               89            1.84             ftz3ja1zr82s5  select ID , TO_CHAR(XXXX,...
88                11            0.18             6qz82dptj0qr7  select l.col#, l.intcol#, l....
59                11            0.12             2k3bu9k1p7kyz  select /*+ INDEX(ol$ ol$signat...
55                2             0.11             9pwx7hnjcgwyh  select xxxx.xxxxxxx...
54                16            0.11             cqgv56fmuj63x  select owner#, name, namespace...
43                5             0.09             9g485acn2n30m  select col#, intcol#, reftyp, ...
37                10            0.08             39m4sx9k63ba2  select /*+ index(idl_ub2$ i_id...
35                2             0.07             77zqh2uy696by  SELECT DISTINCT XX.XXXXXX...
Remaining column values (row alignment lost):
Executions: 20,391 1 2 57,158 1,004 1 70 41,246
Elap per Exec (s): 0.00 35.01 27.37 0.00 7.65 1202.04 12.93 0.00
SQL Module (where shown): XXXXXXXX
This section provides the TOP SQLs sorted in descending order by Elapsed Time (s).
Elap per Exec (s): Elapsed time in seconds per execution of the SQL.
Executions: Total no. of executions of the SQL during the period between the two snapshots.
Elapsed Time (s): Total elapsed time for the SQL. It equals Elap per Exec (s) multiplied by Executions.
% Total DB Time: % of DB Time utilized by the SQL.
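The relationship between these columns can be checked with a short calculation. The sketch below uses figures that appear in the sample table (1202.04 seconds per execution and a DB Time of 49,209.52 seconds); pairing that value with exactly 1 execution is an assumption, and the function names are illustrative.

```python
def elapsed_time_s(elap_per_exec_s: float, executions: int) -> float:
    """Elapsed Time (s) = Elap per Exec (s) x Executions."""
    return elap_per_exec_s * executions


def pct_of_db_time(elapsed_s: float, db_time_s: float) -> float:
    """% Total DB Time = Elapsed Time / DB Time x 100."""
    return elapsed_s / db_time_s * 100.0


# Assumed pairing: 1 execution taking 1202.04 seconds.
elapsed = elapsed_time_s(elap_per_exec_s=1202.04, executions=1)
share = pct_of_db_time(elapsed, db_time_s=49209.52)
print(f"Elapsed Time: {elapsed:.2f} s, % Total DB Time: {share:.2f}")
```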
SQL Module: Provides details of the module which is executing the SQL. The process name at the OS level is displayed as the SQL Module name. If the module name starts with any of the names given below, do not consider the SQL for tuning, as these are Oracle internal or tool-generated SQLs:
DBMS sqlplusw TOAD rman SQL Enterprise Manager ORACLE MMON_SLAVE emagent etc
This section is very important from a tuning perspective, as it provides information about the SQLs which need tuning. During load testing one should compare this section for the release load test with the baseline load test. This comparison shows which SQLs performed differently between the baseline and the release load test. Looking at the Elap per Exec (s) column, one can understand which SQLs need tuning. Normally, SQLs taking more than 1 second per execution are candidates for tuning. The SQLs appearing under SQL ordered by Elapsed Time are also listed most of the time in the other sections, such as SQL ordered by Gets and SQL ordered by CPU Time. Hence this section is worth investigating first. Below is a brief description of each of the other sections.
SQL ordered by CPU Time: SQLs are listed based on CPU time taken and are ordered by the CPU Time (s) column.
SQL ordered by Gets: SQLs are listed based on buffer reads (buffer gets) for the SQL and are ordered by the Buffer Gets column.
SQL ordered by Reads: SQLs are listed based on physical reads (disk reads) for the SQL and are ordered by the Physical Reads column.
SQL ordered by Executions: SQLs are listed based on the total no. of executions of the SQL and are ordered by the Executions column.
For each of these sections, to get the complete SQL Text details, click on the SQL ID listed under SQL ID column.
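The selection rule described in this section (skip SQLs from internal modules, then look at SQLs taking more than 1 second per execution) could be sketched as follows. The rows are hypothetical, the prefix tuple is a subset of the module names listed above, matching is case-sensitive by assumption, and SqlStat and tuning_candidates are illustrative names.

```python
from dataclasses import dataclass

# Subset of the module name prefixes listed in this document.
INTERNAL_MODULE_PREFIXES = ("DBMS", "sqlplusw", "TOAD", "rman",
                            "ORACLE", "MMON_SLAVE", "emagent")


@dataclass
class SqlStat:
    sql_id: str
    module: str
    executions: int
    elapsed_s: float

    @property
    def elap_per_exec_s(self) -> float:
        # Avoid dividing by zero for SQLs that have not completed an execution.
        return self.elapsed_s / self.executions if self.executions else 0.0


def tuning_candidates(stats, min_elap_per_exec_s: float = 1.0):
    """Drop internal modules, keep slow SQLs, slowest per execution first."""
    picked = [s for s in stats
              if not s.module.startswith(INTERNAL_MODULE_PREFIXES)
              and s.elap_per_exec_s > min_elap_per_exec_s]
    return sorted(picked, key=lambda s: s.elap_per_exec_s, reverse=True)


# Hypothetical rows, assumed for illustration only.
rows = [
    SqlStat("sql_a", "order_batch", executions=1000, elapsed_s=7600.0),
    SqlStat("sql_b", "DBMS_SCHEDULER", executions=1, elapsed_s=1200.0),
    SqlStat("sql_c", "order_batch", executions=50000, elapsed_s=60.0),
    SqlStat("sql_d", "report_app", executions=70, elapsed_s=900.0),
]

for stat in tuning_candidates(rows):
    print(stat.sql_id, round(stat.elap_per_exec_s, 2))
```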
This section provides information about how many redo log switches are happening. The Total column gives the number of log switches during the snapshot period, and the per Hour column gives the same figure on a per hour basis. 2 log switches per hour is the optimal value as per Oracle standards.
This section provides estimates of how an increase or decrease in buffer cache size will cause a decrease or increase in physical reads. This information is estimated data, not actual data. The starting point is Size Factor = 1.0, which gives the current memory allocation for the Buffer Cache. In this example, 728 MB is allocated to the buffer cache. With this setting the estimated no. of physical reads is 676,024,586. If we increase the memory allocation for the buffer cache to, say, 1440 MB (Size Factor = 1.98), the estimated physical reads will be 347,627,804. This means that by allocating an additional 712 MB to the Buffer Cache, total estimated physical reads will come down by 328,396,782. On the other hand, reducing the Buffer Cache to, say, 216 MB (Size Factor = 0.30) increases the estimated physical reads to 1,038,346,123. These statistics act as an input for the DBA when tuning the Buffer Cache memory.
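The comparison against the Size Factor = 1.0 row can be reproduced with the three advisory rows quoted above. This is a minimal sketch; the tuple layout and the savings_vs_current helper are assumptions made for illustration.

```python
# (size in MB, size factor, estimated physical reads) from this example.
ADVISORY_ROWS = [
    (216, 0.30, 1_038_346_123),
    (728, 1.00, 676_024_586),
    (1440, 1.98, 347_627_804),
]


def savings_vs_current(rows):
    """For each row, report extra MB and estimated reads saved relative
    to the row whose size factor is 1.0 (the current allocation)."""
    current_mb, _, current_reads = next(
        r for r in rows if abs(r[1] - 1.0) < 1e-9)
    report = []
    for size_mb, _factor, est_reads in rows:
        saved = current_reads - est_reads
        report.append({
            "size_mb": size_mb,
            "extra_mb": size_mb - current_mb,
            "reads_saved": saved,
            "saved_pct": saved / current_reads * 100.0,
        })
    return report


for line in savings_vs_current(ADVISORY_ROWS):
    print(line)
```

A negative reads_saved value means the smaller cache is estimated to cause additional physical reads.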
Similar to the Buffer Pool Advisory, this statistic provides information on how an increase or decrease in PGA memory will cause an increase or decrease in Estd PGA Cache Hit %. The starting point is Size Factor = 1.0, which gives the current memory allocation for PGA. In this example 3072 MB is allocated to PGA. With this allocation the Estd PGA Cache Hit % is 97, which is good. Even if we increase PGA to 3686 MB, we will get only a 2% increase in Estd PGA Cache Hit %. Hence it is not advisable to increase PGA further.
Similar to the Buffer Pool Advisory and the PGA advisory, this statistic provides information on how an increase or decrease in shared pool memory will cause a decrease or increase in Estd LC Load Time (s). The starting point is SP Size Factor = 1.0, which gives the current memory allocation for the shared pool. In this example 280 MB is allocated to the shared pool. With this allocation the Estd LC Load Time (s) is 57,751. If we increase the shared pool size to 308 MB, the Estd LC Load Time (s) will come down to 1. Hence the shared pool should be set to 308 MB.
This statistic provides information on how an increase or decrease in SGA memory will cause a decrease or increase in Estd Physical Reads. The starting point is SGA Size Factor = 1.0, which gives the current memory allocation for SGA. In this example 1024 MB is allocated to SGA. With this allocation the Estd Physical Reads figure is 844,082,787. If we increase the SGA size by 1024 MB, i.e. to 2048 MB, Estd Physical Reads will come down by 328,412,156. Since this is a reduction of roughly 39% in Estd Physical Reads, increasing the SGA to 2048 MB is worth considering.
45 15/15.25
This statistic provides information about UNDO segments.
Min/Max TR (mins): Represents the minimum and maximum Tuned Retention in minutes for undo data. This data helps in setting the UNDO_RETENTION database parameter. In this example the parameter can be set to 15.25 minutes, i.e. 915 seconds.
Max Qry Len (s): Represents the maximum query length in seconds. In this example the max query length is 135 seconds.
STO/OOS: Represents the count of Snapshot Too Old and Out Of Space errors which occurred during the snapshot period. In this example, 0 errors occurred during this period.
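UNDO_RETENTION is specified in seconds, so the tuned retention in minutes has to be converted. The sketch below uses the two figures from this example; checking the result against the maximum query length is an extra sanity check added here, and undo_retention_seconds is a hypothetical helper name.

```python
def undo_retention_seconds(max_tuned_retention_mins: float) -> int:
    """Convert Max Tuned Retention (minutes) to seconds for UNDO_RETENTION."""
    return round(max_tuned_retention_mins * 60)


retention_s = undo_retention_seconds(15.25)   # Max TR from this example
max_query_len_s = 135                         # Max Qry Len from this example

# Undo should be retained at least as long as the longest running query.
covers_longest_query = retention_s >= max_query_len_s
print(retention_s, covers_longest_query)
```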
XXX_INDEX XXX_XXXX_IX_05
This statistic displays segment details based on the logical reads that happened. The data is sorted on the Logical Reads column in descending order. It shows the segments on which the most logical reads are happening. Most of the SQLs using these segments can be found under the section SQL Statistics -> SQL ordered by Gets.
This statistic displays segment details based on the physical reads that happened. The data is sorted on the Physical Reads column in descending order. It shows the segments on which the most physical reads are happening. Queries using these segments should be analysed to check whether any FTS is happening on them. If FTS is happening, proper indexes should be created to eliminate it. Most of the SQLs using these segments can be found under the section SQL Statistics -> SQL ordered by Reads.
This statistic displays segment details based on the total row lock waits which happened during the snapshot period. The data is sorted on the Row Lock Waits column in descending order. It shows the segments on which the most database locking is happening. DML statements using these segments should be analysed further to check the possibility of reducing contention due to row locking.
Whenever a transaction modifies a segment block, it first adds its transaction id to the Interested Transaction List (ITL) table of the block. The size of this table is a configurable parameter (INITRANS), and that many ITL slots are created in each block. An ITL wait happens when the total no. of transactions trying to update the same block at the same time is greater than the ITL parameter value. The total waits in the example are very low, 23 being the maximum. Hence there is no need to increase the ITL parameter value.
Buffer busy waits happen when more than one transaction tries to access the same block at the same time. In this scenario, the first transaction which acquires the lock on the block is able to proceed, whereas the other transactions wait for the first one to finish. If more than one instance of a process is continuously polling the database by executing the same SQL (to check whether any records are available for processing), the same block is read concurrently by all the instances of the process, and this results in the buffer busy waits event.
26.05 3,143,888
Pct Misses should be very low. In the example, Pct Misses are above 10 for some of the library cache components. This indicates that the shared pool is not sufficiently sized. If AMM (Automatic Memory Management) is used, the DBA can increase the SGA component. If AMM is not used, increase the SHARED_POOL memory component.
About Author: Shailendra S. Hirlekar is currently working with Infosys Technologies Ltd. as a Senior Technology Architect. He has expertise in Oracle Database Administration and 12+ years of IT experience.