A Tour of the AWR Tables

Dave Abercrombie, Principal Database Architect, Convio © 2008 Northern California Oracle Users' Group (NoCOUG) Summer Conference 2008, August 21 2008

Introduced in version 10g, Oracle's Automatic Workload Repository (AWR) provides diagnostic information for performance and scalability studies, automatically recording a rich variety of database performance statistics. What's the best way to leverage this wealth of data? While you can run Oracle-supplied AWR reports, or use Oracle features such as the Automatic Database Diagnostic Monitor (ADDM), each Oracle database presents its own unique tuning challenges. In this paper, you'll learn how to work directly with AWR tables, using customized queries to improve insight into your own particular scalability issues. Topics include:

- Important AWR tables, their contents, how to join them, and their quirks and limitations.
- Sample queries that can be easily adapted to focus on your own unique set of problems.
- Estimating the "Average Active Session" metric.
- Simple statistical techniques to find spikes and other types of anomalous behavior.
- A comparison of techniques used for historical scalability studies with those used for real-time performance crisis resolution.
- Use of DBMS_APPLICATION_INFO and JDBC end-to-end metrics.
- Useful tips on configuring AWR.

This paper also applies some industrial and quality engineering approaches recently described by Robyn Sands to the use of AWR tables. These ideas are also combined with use of the DB time metric championed by Kyle Hailey. I show below the outline for this paper, and a Microsoft PowerPoint version is also available.

- AWR Overview
  - What is AWR?
  - Why use AWR?
  - Ways of using AWR
  - AWR settings
- Find time ranges of load spikes: Average Active Sessions (AAS)
  - AAS definition and usefulness
  - Elapsed time denominator and non-uniformity (aka "skew")
  - AAS exact calculation (not very useful)
  - AAS estimation methods (very useful, based on Active Session History - ASH)
  - Example scenario
- Find specific problem SQLs: Sort by aggregated statistics
  - Example scenario
- Find specific problem SQLs: Non-uniform statistics
  - Example scenario
- Characterize a problem SQL's behavior over time
  - Example 1 scenario
  - Example 2 scenario
  - Example 3 scenario
  - Example 4 scenario
- Selected AWR tables
  - DBA_HIST_SNAPSHOT
  - DBA_HIST_SQLSTAT
  - DBA_HIST_SYSSTAT
  - DBA_HIST_SEG_STAT
  - DBA_HIST_SEG_STAT_OBJ
  - DBA_HIST_SQLTEXT
  - DBA_HIST_ACTIVE_SESS_HISTORY
- Scripts
  - Find time ranges of load spikes: Average Active Sessions (AAS)
    - aas-per-hour.sql (AWR)
    - aas-per-min.sql (ASH)
    - aas-per-min-awr.sql (AWR)
    - aas-exact.sql (AWR)
  - Find specific problem SQLs: Sort by aggregated statistics
    - find-expensive.sql (AWR)
  - Find specific problem SQLs: Non-uniform statistics
    - high-var-sql.sql (AWR)
  - Characterize a problem SQL's behavior over time
    - sql-stat-hist.sql (AWR)
- Conclusion
- References

AWR Overview
What is AWR?

The Automatic Workload Repository (AWR) takes "snapshots" of database performance statistics periodically, and records these data for later review. It is covered by Oracle's Diagnostic Pack license. These AWR data are available in about 80 "tables" whose names begin with "DBA_HIST". Oracle uses the AWR data internally for its self-tuning feature called the Automatic Database Diagnostic Monitor (ADDM), which includes a variety of reports and other tools. ADDM will not be discussed further in this paper. Instead, this paper focuses on interactive use of custom queries against the AWR DBA_HIST tables. The AWR records performance data from several perspectives, some of which are shown below. This paper primarily focuses on the SQL perspective; however, the techniques presented here can be easily adapted to other perspectives.

- SQL statement
- System
- Segment
- Wait events
- SGA
- Enqueue locks
- etc.
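To see exactly which AWR views your own database exposes, a quick catalog query such as the following (a minimal sketch; the exact list and count vary by release and patch level) lists them:

-- List the AWR views available in this database.
select view_name
  from dba_views
 where view_name like 'DBA\_HIST\_%' escape '\'
 order by view_name;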

Why use AWR?

AWR is not used for real-time performance monitoring like the V$ tables. Instead, it is used for historical analysis of performance. AWR complements, but does not replace, real-time monitoring. Example uses include the following:

- Databases that you rarely, if ever, review: DBAs are often responsible for more databases than they can watch closely at all times. Consultants are often called in to diagnose and fix a database they have never seen. A company might acquire another company's databases, suddenly becoming responsible for them. A DBA might not have instant access to a suffering database. In all these cases, AWR can provide vital data that is otherwise not available.

- Find and drill down into recent load spikes: A real-time monitor might detect a database problem in time to alert DBAs, but diagnoses might take a while. AWR automatically preserves vital performance data to support thorough diagnosis over a period of days.
- Real-time monitors might be inadequate: It is not uncommon for real-time database monitors to be poor, missing, or unreliable. Or perhaps monitoring takes place only at the application level. AWR can make up for these deficiencies by providing a complete record of the full range of Oracle internal diagnostics. The detailed performance data gathered automatically by AWR provides vital insight that is simply not visible when using only application-level metrics.
- Capacity planning: The long-term view supported by AWR can help assess overall database capacity. Some applications can use multiple databases, and AWR can show which of these databases have the most remaining capacity, thereby informing decisions about where to put new data or customers.
- Prioritize developer and DBA resources: Many database applications have a backlog of performance problems, all requiring limited developer and DBA resources to fix. AWR can provide vital objective and quantitative data to help prioritize such fixes, ensuring the most effective use of these limited human resources.
- Load testing: It is often possible to organize load testing projects so that each test run fits within a set of AWR snapshots, with each snapshot containing only one test run. When this approach is used, AWR can also be used to assess the comparability of the various test runs: it can tell you, for example, if a given SQL statement executed the same number of times in each test.

Ways of using AWR

An investigation into a database performance problem might involve a series of goals similar to the ones outlined below. Each of these goals calls for a slightly different set of AWR tables or query design. This paper presents example AWR queries for each of these different goals. Although this outline focuses on an SQL perspective, these goals can be adapted to other perspectives, such as an alternative focus on segments.

- Find time ranges of load spikes: Often, the first step in an investigation is to determine the specific time ranges of the problem. This is usually the case even if external, application-level monitors have alerted you to the problem. It is especially true if you are on a "fishing expedition", with no knowledge of any particular problem or time range. One of the best tools for this goal is to examine the history of the Average Active Session (AAS) metric.
- Find specific problem SQLs: Once the time range of the performance problem has been determined, the next step is to determine the specific SQL statements associated with the problem. Usually, this is done with some consideration of time range, such as "since the deployment of the last code version" or "since addition of a new dataset". These techniques can also be used as the starting point of an investigation. Several techniques are available, as shown below.
  - Sort by aggregated statistics: Since AWR stores performance statistics aggregated per snapshot for each SQL statement, a very natural and obvious technique is to simply sum these up, sort them, then look for statements with the highest sums.
  - Non-uniform statistics: Many of the most interesting and severe problems with database performance involve queries that normally execute well, but occasionally create problems. For example, a query might usually consume only a second or two of DB time per hour, then suddenly take over the CPUs and cause loss of application functionality. If these spikes are brief, they might not be visible through a review of gross aggregate data. So a method of finding non-uniformities is needed. One such method, the use of statistical "variance," is described below.
- Characterize a problem SQL's behavior over time: It is often extraordinarily useful to examine a SQL statement's behavior over time. Factors that can be easily seen include sudden spikes in execution rate, changes in execution plan, changes in performance characteristics such as consistent gets, and changes in time spent in wait states. One reason that these time-related factors are so very useful is that they can tie internal database diagnostics to the application itself. For example, a statement might execute only at night, which can provide clues to its source within the application.
- Active Session History (ASH) and its AWR version: ASH provides a wealth of detail about active sessions and their problems. It records session-level details that are not always preserved by AWR. However, ASH does not include many of the performance statistics that are preserved by AWR, and so these two Oracle features are complementary. This paper does not go into detail about ASH.

AWR settings

By default, AWR will take snapshots once per hour, at the "top" of each hour. Also by default, AWR will retain only a week's worth of snapshots. The default AWR settings are modified using the Oracle-supplied package method DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(). Both the retention and interval arguments expect units of minutes. Example syntax is shown below. See the Oracle documentation for more details.

DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
   retention IN NUMBER DEFAULT NULL,
   interval  IN NUMBER DEFAULT NULL,
   topnsql   IN NUMBER DEFAULT NULL,
   dbid      IN NUMBER DEFAULT NULL);

In my experience, the hourly interval is appropriate. However, I much prefer to retain a full month of data. Most of us have workweeks that are very busy and filled with crises, so being able to save AWR data for more than a week is very important. Also, some trends, or other changes, are much easier to spot when a full month of data are available.

Of course, storage needs increase along with snapshot frequency or length of retention. Oracle claims that about 300 megabytes are needed for a one-week retention of hourly snapshots. In turn, more space will be needed for longer retention periods. Storage needs will also probably vary with level of activity and application behavior.
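For instance, a call along these lines (a minimal sketch, not from the original paper; adjust the numbers to your own retention policy) would keep roughly a month of hourly snapshots:

begin
   -- Both arguments are in minutes: retain about 31 days, snapshot every 60 minutes.
   dbms_workload_repository.modify_snapshot_settings(
      retention => 31 * 24 * 60,
      interval  => 60);
end;
/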

Find time ranges of load spikes: Average Active Sessions (AAS)

AAS definition and usefulness

The "Average Active Session" (AAS) metric is an extraordinarily simple and useful measure of the overall health of an Oracle database. The AAS metric can be defined as "DB time" divided by "elapsed time". DB time is defined as the sum of time spent by all sessions both on the CPU and stuck in non-idle wait states. In other words, DB time can be thought of as the sum of time spent by all active sessions. This metric has been championed by Kyle Hailey, and some of his material can be found at the links below in the reference section. For example, reducing the DB time metric is the main goal of Oracle's built-in ADDM self-tuning tool.

To see why the metric is called "Average Active Sessions", let's say a database had four sessions that were active for the duration of a one-minute observation. Each session contributes one minute to the sum of DB time in the numerator, giving a total DB time value of four minutes. In this example, the "elapsed time" denominator is one minute. Doing the division gives an AAS of four. In this trivial example, it is easy to see how an AAS metric of four relates to having four active sessions: there was an average of four active sessions. If we generalize the example to include more sessions that have various periods of active status, the calculation of the AAS metric still gives a sense of the average number of active sessions during the observation period, even though it is not explicitly defined based on counts or averages of sessions. This explains the name of the metric.

The AAS metric is most useful when compared to the number of CPUs available. If the AAS metric far exceeds the number of CPUs, then database performance will suffer. On the other hand, if the AAS metric is significantly less than the number of CPUs, then database performance should be fine. The threshold value of the AAS metric above which database performance suffers depends upon the application behavior and the expectations of the end users. Often, this threshold AAS value should be determined empirically within the context of the application. Once this threshold AAS value has been determined, the metric can serve as a very reliable, and readily available, indicator of overall database performance problems.

The AAS metric can be calculated exactly, or it can be estimated, as described below. All methods, both exact calculations and estimates, depend upon the choice of the "elapsed time" denominator. The choice of the "elapsed time" denominator relates to issues of non-uniformity, as explained next.

Elapsed time denominator and non-uniformity (aka "skew")

The AAS is inherently an aggregate, or averaged, metric. Average-based metrics always hide any non-uniformity of the underlying data. In fact, it is exactly this hidden non-uniformity that is most important for understanding an Oracle performance problem. Therefore, effective use of the metric requires that we understand and detect non-uniformity. (Note: several Oracle authors use the term "skew" to refer to any type of non-uniformity. The statistical definition of skew actually is limited to one particular type of non-uniformity. However, due to these existing precedents within the Oracle literature, this paper also uses the term "skew" to refer to any type of non-uniformity.)

Robyn Sands recently presented a paper that outlines effective ways to detect skew (see the reference section below). She suggests calculating the ratio of the variance to the mean of the observations: a "high" ratio indicates skew. This approach is easy to incorporate into estimates of AAS, and is demonstrated below.

As will be shown with examples below, the choice of the "elapsed time" denominator is crucial to the detection of skew. A large elapsed time (e.g., one hour) is convenient to use, since there are relatively few rows to look at or plot. However, such large elapsed time intervals might mask brief spikes of vital interest, hiding the skew that is usually of greatest interest. Using a small elapsed time instead (e.g., one minute) might reveal such brief spikes, but at the expense of more visual clutter. An hourly average might be the most appropriate for many applications or studies, depending on activity, data volume, etc. The best approach, outlined next, is to use a large elapsed time when possible for convenience, but to also know when to focus in, using small elapsed times to identify brief spikes.

AAS exact calculation (not very useful)

Oracle includes DB time as one of the statistics listed in V$STATNAME, and its cumulative value is available in V$SYSSTAT. Unfortunately, the cumulative value is nearly worthless for most investigative purposes. Conceivably, one could periodically record the DB time statistic from V$SYSSTAT, calculate the difference between successive observations, then divide by elapsed time, to calculate the AAS metric. However, this is probably not worth the hassle, since there are easier methods that leverage Oracle's built-in AWR snapshotting.

A simpler exact calculation technique is to use the DBA_HIST_SYSSTAT table, which is the AWR version of V$SYSSTAT. With this approach, Oracle is doing the periodic recording of DB time for us, but we are still left with the hassle of calculating the difference between successive observations, since this table stores cumulative, rather than incremental, statistics. Since the AWR snapshots are one-hour intervals (the default AWR recording frequency), we can divide the incremental DB time by 3600 to obtain AAS. Moreover, one-hour intervals are almost certainly too large for most AAS diagnostic purposes. An example query that calculates AAS exactly from DBA_HIST_SYSSTAT is available in the script section below as aas-exact.sql. I show here some example output that illustrates the relationship between DB time and AAS.

   SNAP_ID BEGIN_HOUR       SECONDS_PER_HOUR        AAS
---------- ---------------- ---------------- ----------
      4196 2008-07-09 06:00             3821        1.1
      4197 2008-07-09 07:00            12839        3.6
      4198 2008-07-09 08:00            76104       21.1
      4199 2008-07-09 09:00             6435        1.8
      4200 2008-07-09 10:00            15178        4.2
      4201 2008-07-09 11:00             7850        2.2
      4202 2008-07-09 12:00            11482        3.2
      4203 2008-07-09 13:00            14014        3.9
      4204 2008-07-09 14:00             8855        2.5
      4205 2008-07-09 15:00            31272        8.7
      4206 2008-07-09 16:00             4939        1.4
      4207 2008-07-09 17:00            28983        8.1
      4208 2008-07-09 18:00             4171        1.2
      4209 2008-07-09 19:00             2518         .7
      4210 2008-07-09 20:00             7044          2

AAS estimation methods (very useful, based on Active Session History - ASH)

To clarify the method used to estimate the AAS metric from AWR data, the logic and math are explained below incrementally.

Step 1 - active session count per observation

Oracle records key facts about active sessions about once per second, and maintains a historical buffer of its observations in a table called V$ACTIVE_SESSION_HISTORY. Each observation sample is identified by the integer column SAMPLE_ID. The count of rows in this table for a given SAMPLE_ID is essentially the count of active sessions for that observation. Although not essential to estimating AAS, the query below also distinguishes sessions in a wait state from those that think they are on the CPU. This additional diagnostic detail is often helpful, but is not necessary. An example is shown in the query results below, where the count of active sessions ranges from 2 to 12 per ASH observation.

column sample_time format a25
select sample_id,
       sample_time,
       sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu,
       sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
       count(*) as active_sessions
  from v$active_session_history
 where -- last 15 seconds
       sample_time > sysdate - (0.25/1440)
 group by sample_id, sample_time
 order by sample_id
;

[Example output: 16 rows selected (SAMPLE_ID 50667633 through 50667648), one ASH sample per second from 24-JUL-08 08.56.03 PM through 08.56.18 PM, each with ON_CPU, WAITING, and ACTIVE_SESSIONS counts; ACTIVE_SESSIONS ranged from 2 to 12, the busiest sample showing 3 sessions on CPU and 9 waiting.]

As an aside, be aware that the ASH definition of an active session does not necessarily correspond exactly to the value of V$SESSION.STATE (Shee 2004, pp. 253). Also, sometimes ASH will record sessions that are in an "Idle" wait state, even though we would not normally consider these to be "active" (I sometimes, extremely rarely, see ASH sessions with the "Idle" event = 'virtual circuit status'). However, neither of these very minor considerations impacts the usefulness of this approach.

Step 2 - average the number of active sessions over a time interval

The "Step 1" query above provided session counts for every observation recorded by ASH in the time interval. To compute the average number of active sessions, we can turn the "Step 1" query above into an inline view subquery, then wrap it in an outer query that does the averaging, and associate the averages with the earliest timestamp in the subquery. I also round the averages to a single decimal point.

column sample_time format a19
select to_char(min(sub1.sample_time), 'YYYY-MM-DD HH24:MI:SS') as sample_time,
       round(avg(sub1.on_cpu), 1) as cpu_avg,
       round(avg(sub1.waiting), 1) as wait_avg,
       round(avg(sub1.active_sessions), 1) as act_avg
  from ( -- sub1: one row per second, the resolution of SAMPLE_TIME
         select sample_id,
                sample_time,
                sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu,
                sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
                count(*) as active_sessions
           from v$active_session_history
          where sample_time > sysdate - (0.25/1440)
          group by sample_id, sample_time
       ) sub1
;

SAMPLE_TIME            CPU_AVG   WAIT_AVG    ACT_AVG
------------------- ---------- ---------- ----------
2008-07-24 20:56:03         .8        3.3        4.1

1 row selected.

As an aside, ASH does not record a row for an observation that found no active sessions (i.e., such SAMPLE_ID values are "missing" from ASH). Therefore, the averages thus calculated will be artificially too high for those intervals that include observations without active sessions (because the N = samples denominator for the avg() function is artificially too low). In other words, the averaging shown above does not include any data from ASH snapshots taken when the database was idle. Obviously, this is not a problem in practice, since periods of interest to us usually involve performance crises, typically with many more active sessions than CPUs, during which it is unlikely that ASH will observe an idle database. Most periods of interest involve plenty of active sessions.

Step 3 - include variance divided by mean to find skew

To the above "Step 2" query, I have added both the variance and the ratio of variance to mean. This allows us to use the techniques championed by Robyn Sands to find skew (as described above). A "high" ratio indicates skew.

select to_char(min(sub1.sample_time), 'YYYY-MM-DD HH24:MI:SS') as sample_time,
       round(avg(sub1.on_cpu), 1) as cpu_avg,
       round(avg(sub1.waiting), 1) as wait_avg,
       round(avg(sub1.active_sessions), 1) as act_avg,
       round(variance(sub1.active_sessions), 1) as act_var,
       round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)), 1) as act_var_mean
  from ( -- sub1: one row per second, the resolution of SAMPLE_TIME
         select sample_id,
                sample_time,
                sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu,
                sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
                count(*) as active_sessions
           from v$active_session_history
          where sample_time > sysdate - (0.25/1440)
          group by sample_id, sample_time
       ) sub1
;

SAMPLE_TIME            CPU_AVG   WAIT_AVG    ACT_AVG    ACT_VAR ACT_VAR_MEAN
------------------- ---------- ---------- ---------- ---------- ------------
2008-07-24 20:56:03         .8        3.3        4.1        6.1          1.5

1 row selected.
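To get a feel for why the variance-to-mean ratio flags skew, consider two made-up sets of per-second active session counts (illustrative numbers only, not from the paper's database): the values 2, 2, 2, 12 and the values 4, 5, 4, 5 have the same mean, 4.5, but very different variance. A quick sketch using a collection constructor shows the contrast:

-- Hypothetical data: the same mean, very different uniformity.
select round(variance(column_value)/avg(column_value), 2) as var_over_mean
  from table(sys.odcinumberlist(2, 2, 2, 12));   -- about 5.56: strong skew

select round(variance(column_value)/avg(column_value), 2) as var_over_mean
  from table(sys.odcinumberlist(4, 5, 4, 5));    -- about 0.07: nearly uniform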

Step 4 - estimate for multiple time intervals (one minute resolution here)

To the "Step 3" query above, I made the following changes to extend this approach to multiple, sequential time intervals:

- Used round(sub1.sample_time, 'MI') as the basis of the GROUP BY.
- Added a sqlplus substitution variable to specify the overall time range of interest.
- Removed the variance column, since it is only interesting for debugging.

This query is now identical to the final aas-per-min.sql script below.

column sample_minute format a20
select to_char(round(sub1.sample_time, 'MI'), 'YYYY-MM-DD HH24:MI:SS') as sample_minute,
       round(avg(sub1.on_cpu), 1) as cpu_avg,
       round(avg(sub1.waiting), 1) as wait_avg,
       round(avg(sub1.active_sessions), 1) as act_avg,
       round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)), 1) as act_var_mean
  from ( -- sub1: one row per second, the resolution of SAMPLE_TIME
         select sample_id,
                sample_time,
                sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu,
                sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
                count(*) as active_sessions
           from v$active_session_history
          where sample_time > sysdate - (&minutes/1440)
          group by sample_id, sample_time
       ) sub1
 group by round(sub1.sample_time, 'MI')
 order by round(sub1.sample_time, 'MI')
;

old  18:          where sample_time > sysdate - (&minutes/1440)
new  18:          where sample_time > sysdate - (10/1440)

[Example output: 11 rows, one per minute from 2008-07-25 19:05:00 through 2008-07-25 19:15:00, with CPU_AVG, WAIT_AVG, ACT_AVG, and ACT_VAR_MEAN columns.]

11 rows selected.

Step 5 - generalize to AWR for a longer history (hourly resolution here)

To the "Step 4" query above, I made the following changes to extend this approach to time intervals older than those maintained by V$ACTIVE_SESSION_HISTORY:

- Changed the rounding to be hourly rather than by minute.
- Switched to DBA_HIST_ACTIVE_SESS_HISTORY to review older data.

This query is now identical to the final aas-per-hour.sql script below.

column sample_hour format a17
select to_char(round(sub1.sample_time, 'HH24'), 'YYYY-MM-DD HH24:MI') as sample_hour,
       round(avg(sub1.on_cpu), 1) as cpu_avg,
       round(avg(sub1.waiting), 1) as wait_avg,
       round(avg(sub1.active_sessions), 1) as act_avg,
       round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)), 1) as act_var_mean
  from ( -- sub1: one row per second, the resolution of SAMPLE_TIME
         select sample_id,
                sample_time,
                sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu,
                sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
                count(*) as active_sessions
           from dba_hist_active_sess_history
          where sample_time > sysdate - (&hours/24)
          group by sample_id, sample_time
       ) sub1
 group by round(sub1.sample_time, 'HH24')
 order by round(sub1.sample_time, 'HH24')
;
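Since the AAS values from this query are most meaningful relative to the number of CPUs (as discussed above), it can help to check the CPU count of the instance at the same time. A minimal sketch, using a standard dynamic view:

-- CPU count as seen by the instance (V$OSSTAT also reports NUM_CPUS on 10g+).
select value as cpu_count
  from v$parameter
 where name = 'cpu_count';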

Example scenario

Running the aas-per-hour.sql AWR script showed only low and moderate values of the hourly-average AAS metric (ACT_AVG column) over the previous twelve hours (from 1.8 to 6.4). Experience had shown that application users would start to suffer only when the AAS metric was greater than about 20. So based on only this hourly-average AAS metric, it would seem that there had been no performance problems over the twelve hour period. However, the variance/mean value (ACT_VAR_MEAN) spiked very high (95.4) at around 14:00. This indicates a large amount of variability over that hour, perhaps a brief but intense spike in the AAS metric. Without considering the variance/mean, such a spike would have gone unnoticed.

SQL> @aas-per-hour
Enter value for hours: 12
old  18:          where sample_time > sysdate - (&hours/24)
new  18:          where sample_time > sysdate - (12/24)

[Example output: 12 hourly rows for 2008-04-16 07:00 through 18:00, with CPU_AVG, WAIT_AVG, ACT_AVG, and ACT_VAR_MEAN columns; ACT_AVG stayed in the 1.8 to 6.4 range, while ACT_VAR_MEAN jumped to 95.4 in the 14:00 hour, marked "<== spike in variance".]

As an aside, Oracle seems to use a sampling method to extract some of the rows from V$ACTIVE_SESSION_HISTORY for longer-term storage in DBA_HIST_ACTIVE_SESS_HISTORY. Based on empirical evidence, it appears that this sampling method selects a subset of SAMPLE_ID values for archiving, while obtaining all ASH observations for the chosen SAMPLE_ID values. This is very fortuitous, since it preserves the ability to estimate active sessions by counting rows per SAMPLE_ID value (the basis of all the queries presented here). If Oracle had implemented a different subsetting method, such as selecting random rows from V$ACTIVE_SESSION_HISTORY, then the count(*)-based method used here would break down.

Since the apparent peak in the AAS metric occurred at a time older than was still available in V$ACTIVE_SESSION_HISTORY, the AWR historical data was necessary. Running the aas-per-min-awr.sql AWR script showed that the AAS metric spiked to a very high value (130.3) at around 14:08. The high value of the AAS metric indicated severe performance problems at that time, sure to impact the application end users. This knowledge of the time of the transient, but severe, spike enabled investigation and resolution of the problem. This brief problem might not have been visible without this approach, but finding it here was very quick: just two simple queries.

SQL> @aas-per-min-awr
Enter value for minutes: 300
old  18:          where SAMPLE_TIME > sysdate - (&minutes/1440)
new  18:          where SAMPLE_TIME > sysdate - (300/1440)

[Example output: one row per minute from 2008-04-16 13:54:00 through 14:31:00, with CPU_AVG, WAIT_AVG, ACT_AVG, and ACT_VAR_MEAN columns; the rows around 14:00-14:08 are marked "<== spike", with ACT_AVG peaking at about 130.3 at 14:08.]

Find specific problem SQLs: Sort by aggregated statistics

The AWR table DBA_HIST_SQLSTAT records aggregate performance statistics for each combination of SQL statement and execution plan. It contains an excellent variety of basic performance statistics such as execution rate and buffer gets, as well as time spent in wait events. It is "snapshot" based, and is easy to join to DBA_HIST_SNAPSHOT to get time interval details, as detailed in the "Selected AWR tables" section below. Once you have a time interval of interest (load spike, new code deployment, load test, etc.), you can often find interesting or significant SQL statements by aggregating these statistics, then sorting to find the biggest contributors to resource consumption. The script is very easy to modify to include different metrics, sort orders, or time ranges.

Example scenario

Running the find-expensive.sql script while sorting by elapsed time gave the results shown below. These SQL statements were the largest consumers of DB time, and would probably benefit from a closer look. It is pretty easy to tell at a glance that some of these statements were big consumers of time due to high execution rate, while some others were relatively expensive with only a very few executions.

SQL> @find-expensive.sql
Enter value for start_yyyymmdd: 2008-08-01
old  16:    begin_interval_time > to_date('&&start_YYYYMMDD','YYYY-MM-DD')
new  16:    begin_interval_time > to_date('2008-08-01','YYYY-MM-DD')

SQL_ID        SECONDS_SINCE_DATE EXECS_SINCE_DATE GETS_SINCE_DATE
------------- ------------------ ---------------- ---------------
1wc4bx0qap8ph              30617            21563       284059357
6hudrj03d3w5g              23598         20551110       472673974
6tccf6u9tf891              18731            33666       457970700
2u874gr7qz2sk              15175            29014       370715705
fpth08dw8pyr6              14553             2565        36018228
1jt5kjbg7fs5p              11812            12451      2004271887
2f75gyksy99zn              10805            21529       567776447
ccp5w0adc6xx9               5222             6167       222949142
gn26ddjqk93wc               3568        114084711       248687700
b6usrg82hwsa3               2888                2       165621244
ctaajfgak033z               2391                4        66644336
7zwhhjv42qmfq               2197           592377        31495833
96v8tzx8zb99s               2152             6167       117875813
cxjsw79bprkm4               1526           396277       137413869
f2awk3951dcxv               1500             3462        35853709
fzmzt8mf2sw16               1421              311        44067742
01bqmm3gcy9yj               1329           299778        23504806

Find specific problem SQLs: Non-uniform statistics

As explained above, aggregate statistics hide underlying skew. Short spikes in resource consumption often have severe impacts on application usability, but can go unnoticed in a review of aggregated data. Many database performance problems are related to skew: insight into problems, and their solutions, often requires finding or recognizing skew. An excellent way to find skew is to use the statistical measure of non-uniformity called variance. This statistic is usually easier to use when it is normalized by dividing it by the mean. This technique was previously described for finding load spikes. Once you understand the general technique, this use of variance can be easily adapted to many other contexts, such as DBA_HIST_SEG_STAT, etc. The script below, high-var-sql.sql, illustrates how it can be adapted to the SQL performance history in DBA_HIST_SQLSTAT.

Example scenario

Running high-var-sql.sql over a week's worth of data gave the following results.

SQL> @high-var-sql.sql
Enter value for days_back: 7
old  17:    snap.BEGIN_INTERVAL_TIME > sysdate - &&days_back
new  17:    snap.BEGIN_INTERVAL_TIME > sysdate - 7
old  32:    count(*) > ( &&days_back * 24) * 0.50
new  32:    count(*) > ( 7 * 24) * 0.50

SQL_ID        AVG_SECONDS_PER_HOUR VAR_OVER_MEAN         CT
------------- -------------------- ------------- ----------
72wuyy9sxdmpx                   41             7        167
bgpag6tkxt34h                   29            12        167
crxfkabz8atgn                   14            14        167
66uc7dydx131a                   16            16        167
334d2t692js2z                   36            19        167
6y7mxycfs7afs                   23            20        167
36vs0kyfmr0qa                   17            21        129
fp10bju9zh6qn                   45            22        167
fas56fsc7j9u5                   10            22        167
61dyrn8rjqva2                   17            22        129
4f8wgv0d5hgua                   31            23        167
7wvy5xpy0c6k5                   15            23        151
8v59g9tn46y3p                   17            24        132
9pw7ucw4n113r                   59            27        167
41n1dhb0r3dhv                   32            32        120
8mqxjr571bath                   35            38        117
8jp67hs2296v3                   46           154        128
afdjq1cf8dpwx                   34           184        150
6n3h2sgxpr78g                  454           198        145
g3176qdxahvv9                   42           383         92
b72dmps6rp8z8                  209          1116        167
6qv7az2048hk4                 3409         50219        167

Notice how SQL_ID = 'g3176qdxahvv9' (third from the bottom) had only a moderate amount of elapsed time, but a variance much higher than its mean (ratio of 383). Subsequent investigation revealed a significant, although transient, problem with this query that was adversely impacting the application, but which would not have been noticed by looking only at aggregate performance statistics.

Characterize a problem SQL's behavior over time

The techniques above will help you find SQL statements that are associated with load spikes, high resource consumption, or unstable performance. Once you have some suspect SQL statements to investigate, it is often very helpful to review performance over time. By using DBA_HIST_SQLSTAT to examine the time behavior of an SQL statement, it is often easy to spot trends or patterns that point towards causes and solutions. This approach can also help identify parts of the application using the SQL. The sql-stat-hist.sql script is one way to spot these trends, and was used for the following examples.

Example 1 scenario

The SQL statement with the time behavior shown below had sustained high execution rates, as high as 44 times per second (158739 per hour). It was very efficient, at a steady four gets per execution. However, it would occasionally completely consume the CPUs, with over 45,000 seconds per hour (12.6 hours per hour, averaged over a whole hour!). This was due to concurrency-related wait event pile-ups. The data shown below were vital for resolution of this problem, and these time-series data would have been hard to obtain without AWR.

  SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
--------- ---------------- -------------- ------------- ------------- ----------------
     1978 2008-04-07 20:00         140449        540639             4               11
     1979 2008-04-07 21:00         124142        477807             4               17
     1980 2008-04-07 22:00          90568        347286             4               20
     1981 2008-04-07 23:00          83287        323100             4               30
     1982 2008-04-08 00:00          57094        221166             4               49
     1983 2008-04-08 01:00          43925        170594             4                7
     1984 2008-04-08 02:00          38596        150277             4                4
     1985 2008-04-08 03:00          35710        139576             4                4
     1986 2008-04-08 04:00          29700        115429             4                4
     1987 2008-04-08 05:00          43666        170520             4                5
     1988 2008-04-08 06:00          50755        197116             4                6
     1989 2008-04-08 07:00          80371        310652             4                9
     1990 2008-04-08 08:00         111924        431470             4               11
     1991 2008-04-08 09:00         127154        489649             4               27
     1992 2008-04-08 10:00         139270        536962             4               25
     1993 2008-04-08 11:00         128697        496013             4               18
     1994 2008-04-08 12:00         158739        613554             4            45287
     1995 2008-04-08 13:00         152515        587605             4               40
     1996 2008-04-08 14:00         144389        555770             4            37589
     1997 2008-04-08 15:00         149278        575827             4               26
     1998 2008-04-08 16:00         140632        542580             4               12
     1999 2008-04-08 17:00         120113        462665             4               11
     2000 2008-04-08 18:00         121394        468684             4               12
     2001 2008-04-08 19:00         127948        493084             4               13
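To see which wait events were piling up during hours like these, the same AWR history can be sliced by event for the suspect statement. A minimal sketch (the SQL_ID and time range are placeholders to fill in; this query is not part of the original script set):

-- Roughly which events dominated for one SQL_ID during a suspect interval.
select nvl(event, 'ON CPU') as event,
       count(*) as ash_samples
  from dba_hist_active_sess_history
 where sql_id = '&sql_id'
   and sample_time between to_date('&start_hour', 'YYYY-MM-DD HH24:MI')
                       and to_date('&end_hour',   'YYYY-MM-DD HH24:MI')
 group by event
 order by count(*) desc;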

Example 2 scenario

The SQL statement with the time behavior shown below had nightly high execution rates, but it was not executed during the day. As shown by the last column, the database seemed to be able to handle this high execution rate for this efficient query (all values well under 3600). Nevertheless, these data pointed to a flaw in the application that needed fixing.

  SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
--------- ---------------- -------------- ------------- ------------- ----------------
     1811 2008-03-31 21:00          98550        893916             9               28
     1812 2008-03-31 22:00           9794         89386             9                2
     1823 2008-04-01 09:00           3038         27604             9                1
     1824 2008-04-01 10:00           4360         39362             9                1
     1825 2008-04-01 11:00           3608         32759             9                1
     1859 2008-04-02 21:00          17369        156840             9                3
     1883 2008-04-03 21:00          79566        717500             9               22
     1884 2008-04-03 22:00         207334       1871430             9               38
     1885 2008-04-03 23:00         276997       2500938             9               39
     1886 2008-04-04 00:00         258505       2329526             9               36
     1887 2008-04-04 01:00         190127       1710001             9               27
     1888 2008-04-04 02:00         188449       1695215             9               24
     1907 2008-04-04 21:00         102162        923998             9               20
     1930 2008-04-05 20:00          17437        158213             9                3
     1931 2008-04-05 21:00         196100       1768306             9               30
     1932 2008-04-05 22:00         207867       1875544             9               40
     1933 2008-04-05 23:00         230548       2079470             9               32
     1934 2008-04-06 00:00         216352       1946824             9               31
     1935 2008-04-06 01:00         207935       1871111             9               28
     1936 2008-04-06 02:00         118544       1065785             9               15

Example 3 scenario

The SQL statement with the time behavior shown below had sporadically high execution rates. As shown by the last column, the database seemed to be able to handle this high execution rate for this efficient query (all values well under 3600). Nevertheless, these data pointed to a flaw in the application that needed fixing.

  SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
--------- ---------------- -------------- ------------- ------------- ----------------
     1790 2008-03-31 00:00           6710         20340             3                0
     1791 2008-03-31 01:00             83           253             3                0
     1792 2008-03-31 02:00             18            54             3                0
     1793 2008-03-31 03:00             18            54             3                0
     1794 2008-03-31 04:00              1             3             3                0
     1795 2008-03-31 05:00             16            48             3                0
     1796 2008-03-31 06:00        1943358       5901783             3               85
     1797 2008-03-31 07:00           5633         17195             3                0
     1798 2008-03-31 08:00         927016       2815340             3               35
     1799 2008-03-31 09:00        5843023      17744104             3              252
     1800 2008-03-31 10:00        2929624       8896969             3              131
     1801 2008-03-31 11:00         988709       3002649             3               45
     1802 2008-03-31 12:00        1959757       5951342             3              108
     1803 2008-03-31 13:00          10767         32728             3                1
     1804 2008-03-31 14:00         997451       3028890             3               70
     1805 2008-03-31 15:00        1000944       3039948             3               49
     1806 2008-03-31 16:00           5166         15861             3                0
     1807 2008-03-31 17:00           4821         14616             3                0
     1808 2008-03-31 18:00          11639         35243             3                1
     1809 2008-03-31 19:00           8346         25421             3                1
     1810 2008-03-31 20:00           4731         14380             3                1
     1811 2008-03-31 21:00        1975147       5998626             3              160
     1812 2008-03-31 22:00          27361         83023             3                3
     1813 2008-03-31 23:00            521          1589             3                0

Example 4 scenario

The SQL statement with the time behavior shown below had sporadically high execution rates. Also, it would switch execution plans, with the plans having different efficiencies (the primary key column PLAN_HASH_VALUE is not shown here, but notice how the hour of 2008-04-02 23:00 has two rows). As shown by the last column, the database was often struggling with this execution rate. For example, during the hour of 2008-04-03 10:00 it was essentially consuming more than a whole CPU all by itself (4502 > 3600). Again, these AWR data were critical in characterizing this SQL statement's behavior so that a fix could be designed.

  SNAP_ID BEGIN_HOUR       EXECS_PER_HOUR GETS_PER_HOUR GETS_PER_EXEC SECONDS_PER_HOUR
--------- ---------------- -------------- ------------- ------------- ----------------
     1848 2008-04-02 10:00        1028451       3155807             3               39
     1849 2008-04-02 11:00        1015627       3116830             3               35
     1850 2008-04-02 12:00         957525       2941788             3               34
     1851 2008-04-02 13:00           7740         23486             3                0
     1852 2008-04-02 14:00        2039987       6260065             3               86
     1853 2008-04-02 15:00        1017857       3123548             3               33
     1854 2008-04-02 16:00           3692         11286             3                0
     1855 2008-04-02 17:00           8700         26482             3                0
     1856 2008-04-02 18:00           5895         17937             3                0
     1857 2008-04-02 19:00           7296         22103             3                0
     1858 2008-04-02 20:00           2156          6526             3                0
     1859 2008-04-02 21:00           2686          8186             3                0
     1860 2008-04-02 22:00           5439         74432            14               14
     1861 2008-04-02 23:00         227644       3152747            14              848
     1861 2008-04-02 23:00             80           283             4                0
     1862 2008-04-03 00:00         792146       7807033            10             1215
     1865 2008-04-03 03:00            829          7464             9                1
     1867 2008-04-03 05:00            432          3889             9                0
     1868 2008-04-03 06:00            388          2720             7                0
     1869 2008-04-03 07:00           1273          9142             7                1
     1870 2008-04-03 08:00          28277        804514            28              190
     1871 2008-04-03 09:00         399722       5372737            13             1461
     1872 2008-04-03 10:00        1563634      17540545            11             4503
     1873 2008-04-03 11:00            232           717             3                0

Selected AWR tables

The various 10g databases I have seen all contained 79 AWR "tables" (i.e., tables whose names begin with "DBA_HIST_"). Of course, these are not really tables, but SYS-owned views with public synonyms. Many of the underlying objects seem to have names starting with "WRH$_", and their segments seem to reside in the SYSAUX tablespace. However, this paper is not a detailed look at the underlying structure of the AWR tables, and it discusses only a small fraction of the approximately 79 AWR tables. The focus here is application SQL performance diagnostics, rather than topics of more interest to the DBA such as undo segments, SGA, etc.

DBA_HIST_SNAPSHOT

The DBA_HIST_SNAPSHOT table defines the time interval for each AWR snapshot (SNAP_ID). Its effective primary key apparently includes these columns:

- SNAP_ID
- DBID and INSTANCE_NUMBER (irrelevant for a single-instance, non-RAC database)

Name                    Null?    Type
----------------------- -------- -------------
SNAP_ID                 NOT NULL NUMBER
DBID                    NOT NULL NUMBER
INSTANCE_NUMBER         NOT NULL NUMBER
BEGIN_INTERVAL_TIME     NOT NULL TIMESTAMP(3)
END_INTERVAL_TIME       NOT NULL TIMESTAMP(3)
...
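In practice, the first step with almost any of these tables is to translate a wall-clock time range into SNAP_IDs. A minimal sketch (not one of the paper's scripts):

-- Which snapshots cover a given wall-clock window?
select snap_id, begin_interval_time, end_interval_time
  from dba_hist_snapshot
 where end_interval_time   > to_date('&start_time', 'YYYY-MM-DD HH24:MI')
   and begin_interval_time < to_date('&end_time',   'YYYY-MM-DD HH24:MI')
 order by snap_id;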

DBA_HIST_SQLSTAT

The DBA_HIST_SQLSTAT table records aggregate performance statistics for each SQL statement and execution plan. It includes basic statistics such as executions, gets, and reads, as well as wait times in classes such as IO, concurrency, and application (in microseconds). It also includes CPU time and elapsed time. This is a very comprehensive set of statistics. Its effective primary key apparently includes these columns:

- SNAP_ID
- SQL_ID
- PLAN_HASH_VALUE
- DBID and INSTANCE_NUMBER (irrelevant for a single-instance, non-RAC database)

Name                    Null?    Type
----------------------- -------- -------------
SNAP_ID                          NUMBER
DBID                             NUMBER
INSTANCE_NUMBER                  NUMBER
SQL_ID                           VARCHAR2(13)
PLAN_HASH_VALUE                  NUMBER
...
PARSING_SCHEMA_NAME              VARCHAR2(30)
MODULE                           VARCHAR2(64)
ACTION                           VARCHAR2(64)
...
EXECUTIONS_TOTAL                 NUMBER
EXECUTIONS_DELTA                 NUMBER
...
BUFFER_GETS_DELTA                NUMBER
DISK_READS_DELTA                 NUMBER
...
CPU_TIME_DELTA                   NUMBER
ELAPSED_TIME_DELTA               NUMBER
IOWAIT_DELTA                     NUMBER
CLWAIT_DELTA                     NUMBER
APWAIT_DELTA                     NUMBER
CCWAIT_DELTA                     NUMBER
...

Most of these statistics are available in both cumulative (i.e., since parsing) and incremental (i.e., for the snapshot only) aggregates. The incremental aggregates, with names ending in DELTA, are much more useful, since they allow you to calculate sums for specific snapshots. In fact, the TOTAL cumulative versions can be horribly misleading, since they can actually decrease, presumably if the statement aged out of the library cache and was then brought back in. The following example illustrates this severe problem with the TOTAL versions:

select snap_id,
       to_char(begin_interval_time, 'YYYY-MM-DD HH24:MI') as begin_hour,
       executions_total,
       executions_delta
  from dba_hist_snapshot natural join dba_hist_sqlstat
 where sql_id = 'gk8sdttq18sxw'
 order by snap_id
;

   SNAP_ID BEGIN_HOUR       EXECS_TOTAL EXECS_DELTA
---------- ---------------- ----------- -----------
      4571 2008-07-24 21:00       52647       52647
      4572 2008-07-24 22:00       63756       11109
      4691 2008-07-29 21:00       27602       27576
      4739 2008-07-31 21:00       77292       77280
      4756 2008-08-01 14:00       79548        2256
      4757 2008-08-01 15:00      109722       30174
      4758 2008-08-01 16:00      137217       27495
      4759 2008-08-01 17:00      155265       18048
      4763 2008-08-01 21:00      237432       82167
      4823 2008-08-04 09:00       97036       19744
      4824 2008-08-04 10:00       11232       11232
      4835 2008-08-04 21:00        2016        2016

DBA_HIST_SYSSTAT

The DBA_HIST_SYSSTAT table records hourly snapshots of V$SYSSTAT. It includes almost 400 values of STAT_NAME. Unlike some other AWR tables, it includes only cumulative data, not incremental, so you need to calculate the deltas yourself. These cumulative statistic counters get reset with an Oracle bounce, which complicates the calculation of deltas. Its DB time values are in units of centiseconds.
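Where only cumulative values are stored, as in DBA_HIST_SYSSTAT, the per-snapshot delta can be computed with an analytic function. A minimal sketch, assuming a single-instance database and simply discarding a negative difference (which indicates a counter reset after a bounce):

-- Per-snapshot DB time deltas computed from the cumulative counter.
select snap_id,
       value as db_time_total,
       case
          when value - lag(value) over (order by snap_id) >= 0
          then value - lag(value) over (order by snap_id)
       end as db_time_delta
  from dba_hist_sysstat
 where stat_name = 'DB time'
 order by snap_id;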

Many of its statistics can be used as a basis for comparison, for example calculating the percentage of all DB time consumed by a particular query as a function of time. Its effective primary key apparently includes these columns:

- SNAP_ID
- STAT_ID
- DBID and INSTANCE_NUMBER (irrelevant for a single-instance, non-RAC database)

Name                    Null?    Type
----------------------- -------- -------------
SNAP_ID                          NUMBER
DBID                             NUMBER
INSTANCE_NUMBER                  NUMBER
STAT_ID                          NUMBER
STAT_NAME                        VARCHAR2(64)
VALUE                            NUMBER

DBA_HIST_SEG_STAT

The DBA_HIST_SEG_STAT table provides a very useful alternative perspective to the usual SQL focus. This table can help you identify objects associated with the greatest resource consumption or with frequent occurrences of spikes. In some cases, the database objects themselves must be redesigned, since SQL tuning can only go so far. The table includes basic statistics such as logical reads, physical reads, and block changes, as well as wait counts such as buffer busy waits and row lock waits. Both "delta" and "total" values are available: use the "delta" versions for easier aggregation within time intervals. You should join to DBA_HIST_SEG_STAT_OBJ to get segment characteristics. Its effective primary key apparently includes these columns:

- SNAP_ID
- OBJ#
- DATAOBJ#
- DBID and INSTANCE_NUMBER (irrelevant for a single-instance, non-RAC database)

Name                    Null?    Type
----------------------- -------- -------------
SNAP_ID                          NUMBER
DBID                             NUMBER
INSTANCE_NUMBER                  NUMBER
TS#                              NUMBER
OBJ#                             NUMBER

DATAOBJ#                         NUMBER
...
LOGICAL_READS_DELTA              NUMBER
BUFFER_BUSY_WAITS_DELTA          NUMBER
DB_BLOCK_CHANGES_DELTA           NUMBER
PHYSICAL_READS_DELTA             NUMBER
PHYSICAL_WRITES_DELTA            NUMBER
PHYSICAL_READS_DIRECT_DELTA      NUMBER
ROW_LOCK_WAITS_DELTA             NUMBER
GC_BUFFER_BUSY_DELTA             NUMBER
...
SPACE_USED_TOTAL                 NUMBER
SPACE_USED_DELTA                 NUMBER
SPACE_ALLOCATED_TOTAL            NUMBER
SPACE_ALLOCATED_DELTA            NUMBER
TABLE_SCANS_DELTA                NUMBER

DBA_HIST_SEG_STAT_OBJ

The DBA_HIST_SEG_STAT_OBJ table contains segment-level details for objects tracked by DBA_HIST_SEG_STAT. These details include name, owner, type, and tablespace name. Several segment types are included:

- table and table partition
- index and index partition
- LOB

Its effective primary key apparently includes these columns:

- OBJ#
- DATAOBJ#
- DBID (irrelevant for a single-instance, non-RAC database)

Name                    Null?    Type
----------------------- -------- -------------
DBID                    NOT NULL NUMBER
TS#                              NUMBER
OBJ#                    NOT NULL NUMBER
DATAOBJ#                NOT NULL NUMBER
OWNER                   NOT NULL VARCHAR2(30)
OBJECT_NAME             NOT NULL VARCHAR2(30)
SUBOBJECT_NAME                   VARCHAR2(30)
OBJECT_TYPE                      VARCHAR2(18)
TABLESPACE_NAME         NOT NULL VARCHAR2(30)
PARTITION_TYPE                   VARCHAR2(8)
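As with the SQL statistics, the segment deltas are most useful when joined to the snapshot and object tables. A minimal sketch (not one of the paper's scripts) that lists the segments with the most logical reads since a given date:

-- Top segments by logical reads since a chosen date.
select obj.owner,
       obj.object_name,
       obj.object_type,
       sum(seg.logical_reads_delta) as logical_reads
  from dba_hist_seg_stat seg
       join dba_hist_snapshot snap
         on snap.snap_id = seg.snap_id
        and snap.dbid = seg.dbid
        and snap.instance_number = seg.instance_number
       join dba_hist_seg_stat_obj obj
         on obj.dbid = seg.dbid
        and obj.obj# = seg.obj#
        and obj.dataobj# = seg.dataobj#
 where snap.begin_interval_time > to_date('&start_yyyymmdd', 'YYYY-MM-DD')
 group by obj.owner, obj.object_name, obj.object_type
 order by logical_reads desc;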

It is one of the few AWR tables that is not based on the AWR snapshots.sql (AWR) See also the AAS example above.on_cpu).TABLESPACE_NAME PARTITION_TYPE NOT NULL VARCHAR2(30) VARCHAR2(8) DBA_HIST_SQLTEXT The DBA_HIST_SQLTEXT table contains the full text of SQL statements for (nearly all) SQL_ID values included in other AWR tables. 'YYYY-MM-DD HH24:MI') as sample_hour. perhaps about one out of every ten ASH once-per-second samples is included in AWR. round(avg(sub1. and is therefore a part of ASH as well as AWR. A SQL statement can often be found here even when it is no longer in V$SQL and friends. Scripts Find time ranges of load spikes: Average Active Sessions (AAS) aas-per-hour. In other words. It is not uncommon to see resolution of about ten seconds. 'HH24').1) as cpu_avg.sample_time. non-RAC database) Null? Type -------. Its effective primary key apparently includes these columns:   SQL_ID DBID and INSTANCE_NUMBER (irrelevant for single. .------------------------NOT NULL NUMBER NOT NULL VARCHAR2(13) CLOB NUMBER Name ------------------------------------------DBID SQL_ID SQL_TEXT COMMAND_TYPE DBA_HIST_ACTIVE_SESS_HISTORY The DBA_HIST_ACTIVE_SESS_HISTORY table contains a subset of the active session data sampled about once per second in V$ACTIVE_SESSION_HISTORY. since it has a much smaller time resolution. column sample_hour format a16 select to_char(round(sub1.

aas-per-min.sql (ASH)

See also the AAS example above.

column sample_minute format a16
select to_char(round(sub1.sample_time, 'MI'), 'YYYY-MM-DD HH24:MI') as sample_minute,
       round(avg(sub1.on_cpu), 1) as cpu_avg,
       round(avg(sub1.waiting), 1) as wait_avg,
       round(avg(sub1.active_sessions), 1) as act_avg,
       round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)), 1) as act_var_mean
  from ( -- sub1: one row per second, the resolution of SAMPLE_TIME
         select sample_id,
                sample_time,
                sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu,
                sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
                count(*) as active_sessions
           from v$active_session_history
          where sample_time > sysdate - (&minutes/1440)
          group by sample_id, sample_time
       ) sub1
 group by round(sub1.sample_time, 'MI')
 order by round(sub1.sample_time, 'MI')
;

aas-per-min-awr.sql (AWR)

See also the AAS example above.

column sample_minute format a16
select to_char(round(sub1.sample_time, 'MI'), 'YYYY-MM-DD HH24:MI') as sample_minute,
       round(avg(sub1.on_cpu), 1) as cpu_avg,
       round(avg(sub1.waiting), 1) as wait_avg,
       round(avg(sub1.active_sessions), 1) as act_avg,
       round( (variance(sub1.active_sessions)/avg(sub1.active_sessions)), 1) as act_var_mean
  from ( -- sub1: one row per sampled ASH observation second
         select sample_id,
                sample_time,
                sum(decode(session_state, 'ON CPU', 1, 0)) as on_cpu,
                sum(decode(session_state, 'WAITING', 1, 0)) as waiting,
                count(*) as active_sessions
           from dba_hist_active_sess_history
          where sample_time > sysdate - (&minutes/1440)
          group by sample_id, sample_time
       ) sub1
 group by round(sub1.sample_time, 'MI')
 order by round(sub1.sample_time, 'MI')
;

aas-exact.sql (AWR)

column BEGIN_HOUR format a16
select stat_start.snap_id,
       to_char(snap.begin_interval_time, 'YYYY-MM-DD HH24:MI') as begin_hour,
       -- DB time is in units of centiseconds in DBA_HIST_SYSSTAT
       round( (stat_end.value - stat_start.value)/100, 0) as seconds_per_hour,
       -- also assumes hourly snapshots, hence divided by 3600
       round( (stat_end.value - stat_start.value)/(100*3600), 1) as aas
  from dba_hist_sysstat stat_start,
       dba_hist_sysstat stat_end,
       dba_hist_snapshot snap
 where -- filter for the statistic we are interested in
       stat_end.stat_name = 'DB time'
   and -- we join stat_end and stat_start on exact matches of the remaining PK columns
       ( stat_end.stat_name = stat_start.stat_name
         and stat_end.dbid = stat_start.dbid
         and stat_end.instance_number = stat_start.instance_number )
   and -- assumes the snap_id at the end of the interval is
       -- one greater than the snap_id at the start of the interval
       stat_end.snap_id = stat_start.snap_id + 1
   and -- join stat_start to snap on FK
       ( stat_start.snap_id = snap.snap_id
         and stat_start.dbid = snap.dbid
         and stat_start.instance_number = snap.instance_number )
 order by stat_start.snap_id
;

Find specific problem SQLs: Sort by aggregated statistics

find-expensive.sql (AWR)

See also the aggregate example above. This script looks at three metrics only, but it is easy to use other metrics stored by the DBA_HIST_SQLSTAT table. For the order-by clause, I suggest using the numeric column position style so that it is easy to change interactively.

-- gets most expensive queries after a specific date
-- (by time spent; change "order by" to use another metric)
select sub.sql_id,
       sub.seconds_since_date,
       sub.execs_since_date,
       sub.gets_since_date
  from ( -- sub to sort before rownum
         select sql_id,
                round(sum(elapsed_time_delta)/1000000) as seconds_since_date,
                sum(executions_delta) as execs_since_date,
                sum(buffer_gets_delta) as gets_since_date
           from dba_hist_snapshot natural join dba_hist_sqlstat
          where begin_interval_time > to_date('&&start_YYYYMMDD', 'YYYY-MM-DD')
          group by sql_id
          order by 2 desc
       ) sub
 where rownum < 30
;

Find specific problem SQLs: Non-uniform statistics

high-var-sql.sql (AWR)

See also the high-variance example above.

undefine days_back
select sub1.sql_id,
       round( avg(sub1.seconds_per_hour) ) as avg_seconds_per_hour,
       round( variance(sub1.seconds_per_hour)/avg(sub1.seconds_per_hour) ) as var_over_mean,
       count(*) as ct
  from ( -- sub1
         select snap_id,
                sql_id,
                elapsed_time_delta/1000000 as seconds_per_hour
           from dba_hist_snapshot natural join dba_hist_sqlstat
          where -- look at recent history only
                begin_interval_time > sysdate - &&days_back
            and -- must have executions to be interesting
                executions_delta > 0
       ) sub1
 group by sub1.sql_id
having -- only queries that consume 10 seconds per hour on the average
       avg(sub1.seconds_per_hour) > 10
   and -- only queries that run 50% of the time
       -- assumes hourly snapshots too
       count(*) > ( &&days_back * 24) * 0.50
 order by 3
;
undefine days_back

Characterize a problem SQL's behavior over time

sql-stat-hist.sql (AWR)

See also example scenarios 1, 2, 3, and 4 above.

-- gets basic DBA_HIST_SQLSTAT data for a single sql_id
-- assumes that each AWR snap is one-hour (used in names, not math)
column BEGIN_HOUR format a16
select snap_id,
       to_char(begin_interval_time, 'YYYY-MM-DD HH24:MI') as begin_hour,
       executions_delta as execs_per_hour,
       buffer_gets_delta as gets_per_hour,
       round(buffer_gets_delta/executions_delta) as gets_per_exec,
       round(elapsed_time_delta/1000000) as seconds_per_hour
  from dba_hist_snapshot natural join dba_hist_sqlstat
 where begin_interval_time between to_date('&start_hour', 'YYYY-MM-DD HH24:MI')
                               and to_date('&end_hour',   'YYYY-MM-DD HH24:MI')
   and sql_id = '&sql_id'
   and executions_delta > 0
 order by snap_id
;

Conclusion

AWR enables study of historical database performance statistics. This information complements, but does not replace, real-time monitoring. However, AWR tables provide many benefits that are not otherwise easy to obtain, in a wide variety of contexts:

- Databases that you rarely, if ever, review;
- Finding, and drilling down into, recent load spikes;
- Making up for deficiencies in real-time monitors;
- Capacity planning;
- Prioritizing developer and DBA resources; and
- Load testing.

AWR tables are easy to use, well organized, and clearly documented. The tables are easy to join, their information is relevant, and they encourage interactive exploration. The industrial/quality engineering concept of using variance to find skew is easy to incorporate into AWR projects. This approach illuminates anomalies, such as very short spikes, that might otherwise remain unnoticed. These anomalies often point to limits of database scalability that need to be addressed.

References

Kyle Hailey has championed the Average Active Session metric (AAS), and some of his material can be found at the following links:

- http://www.perfvision.com/ftp/hotsos/aas.ppt (MS PowerPoint)
- http://www.hotsos.com/sym_speakers_hailey.html

Robyn Anderson Sands, a System Design Architect with Cisco Systems, wrote "An Industrial Engineer's Approach to Managing Oracle Databases", which describes the usefulness of the ratio of variance to mean for finding skew:

- http://www.optimaldba.com/papers/IEDBMgmt.pdf
- http://www.hotsos.com/sym_speakers_sands.html

Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning, by Richmond Shee, Kirtikumar Deshpande, and K. Gopalakrishnan. McGraw-Hill Professional, 2004. ISBN 007222729X.
