Database

DEEP DIVE INTO ASH
Saibabu Devabhaktuni, PayPal

INTRODUCTION
Monitoring database performance in real time and performing root cause analysis of past database problems has always been the number one challenge for any Oracle DBA. Since Oracle 7.3, Oracle has steadily enhanced the wait event interface to address this challenge. Perhaps the most important dynamic performance view with real time performance metrics is v$session, and the active session count measured over time is one of the best load profile metrics for any database. When Oracle introduced the Active Session History (ASH) feature in 10g, it changed the landscape of database performance monitoring. ASH samples active session information into a circular buffer in the SGA at one second intervals, holds mostly the same information as v$session, and is tightly integrated with the Automatic Workload Repository (AWR). From Oracle 11g onwards it also works on a physical standby. Using ASH requires the Diagnostics Pack license.
V$SESSION

It is important to understand v$session well in order to realize the full potential of ASH. Wait event and blocking session information is also available in v$session. The view has about 100 columns, and DBAs are strongly encouraged to understand them in detail. Although queries on v$session are extremely lightweight, read consistency is not guaranteed for queries against v$ views. Fixed_table_sequence is one of the most underused columns in v$session. It is incremented for every db call made by any session, so a value that differs across samples indicates session activity. It can also be used to determine the order of session waits during contention.
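As a minimal sketch of the fixed_table_sequence idea, the snippet below compares two snapshots of v$session and reports the sessions whose value changed. The snapshots are plain dictionaries with made-up values; the function name and the data are illustrative assumptions, not part of any Oracle API.

```python
# Hypothetical snapshots of v$session taken a moment apart:
# {(sid, serial#): fixed_table_sequence}. All values are made up.
def sessions_with_activity(previous, current):
    """Return the session keys whose fixed_table_sequence changed between samples."""
    changed = []
    for key, sequence in current.items():
        # A session missing from the earlier sample is new, so there is nothing to compare.
        if key in previous and previous[key] != sequence:
            changed.append(key)
    return sorted(changed)


sample_1 = {(101, 7): 5000, (102, 3): 8120, (103, 9): 40}
sample_2 = {(101, 7): 5000, (102, 3): 8455, (103, 9): 41, (104, 1): 12}

active = sessions_with_activity(sample_1, sample_2)
print(active)
```

Session (101, 7) is not reported because its value did not move between the two samples, which suggests it made no db call in that window.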

ASH OVERVIEW

The MMNL background process is responsible for capturing active session data at one second intervals and writing it to the circular buffer. AWR then writes one out of every 10 samples from ASH to disk. All queries on ASH are processed from head to tail, so the most recently written data is fetched first. ASH is integrated into the database kernel and the overhead of running it is very low. The size of the ASH circular buffer can vary from 2MB per CPU up to 254MB. The ASH view is a join of the x$kewash and x$ash memory structures. X$kewash has one record for every snapshot of active sessions, along with the sample count and sample length. The sample address in x$kewash is used to locate the snapshot address in x$ash. The is_awr_sample column is indexed in both tables, and the sample_id column is also indexed in x$ash. With the exception of a few columns, ASH is the same as x$ash. Each AWR snapshot flushes the selected ASH sample data into the DBA_HIST_ACTIVE_SESS_HISTORY table. An ASH emergency flush kicks in if the circular buffer is two thirds full. ASH based snapshots are recorded in the DBA_HIST_ASH_SNAPSHOT view. Just like awrrpt.sql, ashrpt.sql is provided under the rdbms/admin directory, and ADDM also relies on ASH data.
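The buffer behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the buffer is sized in slots rather than megabytes, "fullness" is approximated by the number of samples written since the last flush, and the class and constant names are invented for the example. It is not how the kernel implements ASH.

```python
from collections import deque

BUFFER_SLOTS = 30        # hypothetical capacity; the real buffer is sized in MB
DISK_FILTER_RATIO = 10   # one sample in ten is marked for AWR, as described above


class AshBufferModel:
    """Toy model of the ASH circular buffer and its flush to disk."""

    def __init__(self):
        self.buffer = deque(maxlen=BUFFER_SLOTS)  # oldest samples get overwritten
        self.disk = []                            # stands in for the AWR history table
        self.unflushed = 0                        # samples written since the last flush
        self.emergency_flushes = 0

    def add_sample(self, sample_id):
        is_awr_sample = sample_id % DISK_FILTER_RATIO == 0
        self.buffer.append(
            {"sample_id": sample_id, "is_awr_sample": is_awr_sample, "flushed": False}
        )
        self.unflushed += 1
        # Emergency flush once two thirds of the buffer holds unflushed samples.
        if self.unflushed * 3 >= BUFFER_SLOTS * 2:
            self.emergency_flushes += 1
            self.flush()

    def flush(self):
        # Only samples marked is_awr_sample are copied to disk.
        for sample in self.buffer:
            if sample["is_awr_sample"] and not sample["flushed"]:
                self.disk.append(sample["sample_id"])
                sample["flushed"] = True
        self.unflushed = 0

    def newest_first(self):
        # Queries read from the head, so the most recent samples come back first.
        return [sample["sample_id"] for sample in reversed(self.buffer)]


model = AshBufferModel()
for sample_id in range(1, 46):
    model.add_sample(sample_id)

print(model.disk)
print(model.newest_first()[:3])
```

In this run no regular AWR snapshot ever calls flush(), so every write to disk comes from the emergency path; on a real system the periodic AWR snapshot would normally get there first.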

ASH behavior can be influenced with parameters such as _ash_enable for enabling or disabling ASH, _ash_disk_write_enable for AWR writes, _ash_disk_filter_ratio for defining how many sample records are written to disk, _ash_eflush_trigger for changing the emergency flush usage percentage, _ash_sample_all for sampling all sessions including inactive ones, and _ash_sampling_interval for changing the one second interval. Changing _ash_sampling_interval requires an instance restart, and it can be used during snapshot standby testing or Real Application Testing (RAT).

ASH SECTIONS

Columns in ASH can be divided into sample id and time, session details, sql details, pl/sql details, wait event information, blocking session details, client details, and time model information.

SESSION DETAILS
• IS_AWR_SAMPLE          VARCHAR2(1)
• SESSION_ID             NUMBER
• SESSION_SERIAL#        NUMBER
• SESSION_TYPE           VARCHAR2(10)
• USER_ID                NUMBER
• SESSION_STATE          VARCHAR2(7)
• XID                    RAW(8)
• REMOTE_INSTANCE#       NUMBER
• PGA_ALLOCATED          NUMBER
• TEMP_SPACE_ALLOCATED   NUMBER
• TOP_LEVEL_CALL#        NUMBER
• TOP_LEVEL_CALL_NAME    VARCHAR2(64)

AWR writes sample data when IS_AWR_SAMPLE is set. The combination of the session_id and session_serial# columns can be used for comparing session data across samples. The XID column indicates whether the session is in a transaction. Top_level_call_name identifies the higher level db call being made by a session, e.g. exec, fetch, commit, rollback, etc.

SQL DETAILS
• SQL_ID                 VARCHAR2(13)
• IS_SQLID_CURRENT       VARCHAR2(1)
• SQL_CHILD_NUMBER       NUMBER
• SQL_OPCODE             NUMBER
• SQL_OPNAME             VARCHAR2(64)
• TOP_LEVEL_SQL_ID       VARCHAR2(13)
• TOP_LEVEL_SQL_OPCODE   NUMBER
• SQL_PLAN_HASH_VALUE    NUMBER
• SQL_PLAN_LINE_ID       NUMBER
• SQL_PLAN_OPERATION     VARCHAR2(30)
• SQL_PLAN_OPTIONS       VARCHAR2(30)
• SQL_EXEC_ID            NUMBER
• SQL_EXEC_START         DATE

The sql_exec_id column is incremented for every execution of a given sql per instance, so the same value across multiple samples indicates the session is still executing the same sql. Sql plan information is not available in v$session; in ASH it can be used for capturing sessions executing expensive sql statements. Top_level_sql_id can be used to determine the calling sql in pl/sql or recursive statements.

PL/SQL DETAILS
• PLSQL_ENTRY_OBJECT_ID       NUMBER
• PLSQL_ENTRY_SUBPROGRAM_ID   NUMBER
• PLSQL_OBJECT_ID             NUMBER
• PLSQL_SUBPROGRAM_ID         NUMBER
• IN_PLSQL_EXECUTION          VARCHAR2(1)
• IN_PLSQL_RPC                VARCHAR2(1)
• IN_PLSQL_COMPILATION        VARCHAR2(1)

The object_id of the calling pl/sql object is recorded in plsql_entry_object_id; together with its subprogram_id it can be mapped to a pl/sql object by querying the dba_procedures view. The object_id of the currently executing pl/sql object is recorded in the plsql_object_id column.

WAIT EVENTS
• EVENT           VARCHAR2(64)
• EVENT_ID        NUMBER
• EVENT#          NUMBER
• SEQ#            NUMBER
• P1TEXT          VARCHAR2(64)
• P1              NUMBER
• P2TEXT          VARCHAR2(64)
• P2              NUMBER
• P3TEXT          VARCHAR2(64)
• P3              NUMBER
• WAIT_CLASS      VARCHAR2(64)
• WAIT_CLASS_ID   NUMBER
• WAIT_TIME       NUMBER
• TIME_WAITED     NUMBER

Event is populated only while the session is waiting and is left null when the session is on CPU, but p1, p2, and p3 still hold the values of the most recent event while the session is on CPU. The wait_time value should only be used to determine whether the session is on CPU or not; the session_state column can also be used for this purpose. Seq# is the sequence number of the current wait event for a given session, and its value rolls over at 65535. The same seq# value for a given event across multiple samples indicates the session is still waiting on the same event. Time_waited is populated only after the session has finished waiting on a given event; ASH goes back to the previous session sample to update time_waited to a nonzero value. Time_waited is left at zero if the session exits before ASH can capture the value for the most recent event on which the session was active.

BLOCKING SESSION
• BLOCKING_SESSION_STATUS    VARCHAR2(11)
• BLOCKING_SESSION           NUMBER
• BLOCKING_SESSION_SERIAL#   NUMBER
• BLOCKING_INST_ID           NUMBER
• BLOCKING_HANGCHAIN_INFO    VARCHAR2(1)
• CURRENT_OBJ#               NUMBER
• CURRENT_FILE#              NUMBER
• CURRENT_BLOCK#             NUMBER
• CURRENT_ROW#               NUMBER

Blocking session information is populated for any type of contention (including buffer busy waits, latch contention, library cache mutex, etc.), not just for locks/enqueues. Current object, file, block, and row can be used to determine contention at the rowid level. Current object and file are populated for any type of physical I/O activity.

CLIENT DETAILS
• SERVICE_HASH   NUMBER
• PROGRAM        VARCHAR2(48)
• MODULE         VARCHAR2(64)
• ACTION         VARCHAR2(64)
• CLIENT_ID      VARCHAR2(64)
• MACHINE        VARCHAR2(64)
• PORT           NUMBER
• ECID           VARCHAR2(64)

Having many client connection details in ASH makes it possible to report incident data along many dimensions. Client data can be used to report resource utilization by client program, module, service, machine, etc. It can also be used to troubleshoot networking issues, and having this data in the AWR base tables facilitates trending middleware level changes for database capacity and scalability analysis.

TIME MODEL
• TM_DELTA_TIME                 NUMBER
• TM_DELTA_CPU_TIME             NUMBER
• TM_DELTA_DB_TIME              NUMBER
• DELTA_TIME                    NUMBER
• DELTA_READ_IO_REQUESTS        NUMBER
• DELTA_WRITE_IO_REQUESTS       NUMBER
• DELTA_READ_IO_BYTES           NUMBER
• DELTA_WRITE_IO_BYTES          NUMBER
• DELTA_INTERCONNECT_IO_BYTES   NUMBER

Tm_delta_cpu_time and tm_delta_db_time report how much CPU and DB time a session spent over the tm_delta_time interval since the last sample. CPU and DB time together can be greater than the delta time due to double counting for some wait events such as latch contention. These columns can be used to find the top sessions by CPU or DB time within a given interval. The delta I/O columns report the incremental physical I/O requests and bytes processed over delta_time since the last sample, which is useful for determining the top I/O bound sessions.
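A minimal sketch of ranking sessions by the time model columns is shown below. The rows are made-up dictionaries standing in for ASH records, the values are assumed to be in microseconds, and the top_sessions helper is a hypothetical name. The NULL check is purely defensive, on the assumption that the time model columns are not populated in every sample.

```python
from collections import defaultdict

# Hypothetical ASH rows; the time values are assumed to be microseconds.
rows = [
    {"session_id": 101, "session_serial#": 7,
     "tm_delta_cpu_time": 900_000, "tm_delta_db_time": 1_000_000},
    {"session_id": 102, "session_serial#": 3,
     "tm_delta_cpu_time": 100_000, "tm_delta_db_time": 950_000},
    {"session_id": 101, "session_serial#": 7,
     "tm_delta_cpu_time": 800_000, "tm_delta_db_time": 900_000},
    {"session_id": 103, "session_serial#": 9,
     "tm_delta_cpu_time": None, "tm_delta_db_time": None},
]


def top_sessions(rows, metric, limit=3):
    """Sum one delta column per (session_id, session_serial#) and rank descending."""
    totals = defaultdict(int)
    for row in rows:
        if row[metric] is None:  # skip samples where the delta was not populated
            continue
        totals[(row["session_id"], row["session_serial#"])] += row[metric]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]


top_cpu = top_sessions(rows, "tm_delta_cpu_time")
top_db = top_sessions(rows, "tm_delta_db_time")
print(top_cpu)
print(top_db)
```

The same helper works for the delta I/O columns by passing a different metric name, provided the rows carry those keys.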

ASH ENHANCEMENTS

The following new features are desirable but not yet available.
• Hash of all bind values for a given sql execution (to identify contention caused by sql executed with the same binds)
• Populating current_obj#/file#/block# for all logical reads when the session is on CPU
• Adding wait_time_micro from v$session and populating it, just like in v$session, across each sample
• Populating the last event when the session is on CPU
• Recording redo usage per sql (ER# 8646714)
• More AWR samples for session outliers (ER# 8669416)
• Reporting the pl/sql line id being executed

DIY ASH
Do it yourself ASH is possible but it has some drawbacks. It can be achieved by copying the contents of v$session for all active sessions to another table at one second intervals. Unlike ASH, it is not going to be a lightweight process. Sql plan and time model information will not be available, and even the wait time metrics will be incomplete. Overall, DIY ASH is a good alternative if the Diagnostics Pack license has not been purchased.
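The sampling loop behind a DIY ASH could look roughly like the sketch below. It is deliberately database-free: fetch_sessions is a hypothetical callable that would normally run a query against v$session, and here it is replaced by a fake returning fixed rows. The filter for an "active" session (ACTIVE status, USER type, non-idle wait class) is an assumption about what is worth keeping, and the in-memory list stands in for the history table.

```python
import time


def is_active(row):
    # Assumed definition of "active": a user session that is ACTIVE
    # and not sitting in an Idle class wait.
    return (
        row["status"] == "ACTIVE"
        and row["type"] == "USER"
        and row["wait_class"] != "Idle"
    )


def sample_sessions(fetch_sessions, store, samples, interval_seconds=1.0, sleep=time.sleep):
    """Copy active session rows into store once per interval, tagged with a sample id."""
    for sample_id in range(1, samples + 1):
        sample_time = time.time()
        for row in fetch_sessions():
            if is_active(row):
                store.append({"sample_id": sample_id, "sample_time": sample_time, **row})
        if sample_id < samples:
            sleep(interval_seconds)
    return store


def fake_fetch_sessions():
    # Stand-in for a query against v$session; the rows are made up.
    return [
        {"sid": 101, "serial#": 7, "status": "ACTIVE", "type": "USER",
         "wait_class": "User I/O", "event": "db file sequential read"},
        {"sid": 102, "serial#": 3, "status": "INACTIVE", "type": "USER",
         "wait_class": "Idle", "event": "SQL*Net message from client"},
        {"sid": 1, "serial#": 1, "status": "ACTIVE", "type": "BACKGROUND",
         "wait_class": "Idle", "event": "rdbms ipc message"},
    ]


# The sleep is replaced with a no-op so the demonstration finishes immediately.
history = sample_sessions(fake_fetch_sessions, [], samples=3, sleep=lambda seconds: None)
print(len(history), [row["sid"] for row in history])
```

Passing the fetch and sleep functions in as arguments keeps the loop testable without a database; a real sampler would also need to bound the size of the history table.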

ASH USE CASES

ASH can be used for real time troubleshooting of any database incident, root cause analysis of past database incidents, proactive scalability analysis, fine grained resource utilization reports, and measuring the effectiveness of pointing read only traffic to Active Data Guard. ASH can also be used for analyzing the impact of the wait events that cause the most contention, such as buffer busy waits, latch free, and index contention.
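Two of the simplest reports behind these use cases are the average active session count and the top activity by sample count. The sketch below computes both from made-up sample rows; the tuple layout and function names are assumptions for illustration. The number of samples is passed in separately because a sample in which no session was active leaves no rows behind.

```python
from collections import Counter

# Hypothetical ASH rows: (sample_id, session_id, session_state, event).
samples = [
    (1, 101, "ON CPU", None),
    (1, 102, "WAITING", "buffer busy waits"),
    (2, 102, "WAITING", "buffer busy waits"),
    (2, 103, "WAITING", "latch free"),
    (3, 101, "ON CPU", None),
    (3, 102, "WAITING", "buffer busy waits"),
]


def average_active_sessions(rows, sample_count):
    """Active session rows divided by the number of samples taken in the window."""
    return len(rows) / sample_count


def top_activity(rows, limit=3):
    """Rank wait events (or ON CPU) by how many samples they appear in."""
    counts = Counter(
        event if state == "WAITING" else "ON CPU"
        for _sample_id, _session_id, state, event in rows
    )
    return counts.most_common(limit)


aas = average_active_sessions(samples, sample_count=3)
top = top_activity(samples)
print(aas)
print(top)
```

With one second sampling, each row approximates one second of session activity, so the counts can be read as a rough split of time between CPU and individual wait events.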

CONCLUSION
Prior to ASH, DBAs relied on sampling v$session to get real time performance data, but with the introduction of ASH, real time troubleshooting, root cause analysis, and scalability analysis have become much more streamlined. ASH is very well integrated with the database kernel and AWR, and it has very low overhead. Even though it requires an additional license, the benefits far outweigh the cost.
