What to collect?
– Basic OS info:
• vmstat ;
• ps -elf ;
• svmon -G ;
• iostat ;
– DBM snapshot ;
– DB snapshot ;
– Dynamic SQL snapshot ;
– db2pd -everything ;
– Application snapshots ;
• CPU Time ;
• User ;
• System ;
Time slice
Queues:
• Run;
• Blocked;
• Processes ;
• Kernel threads ;
• User threads ;
Blocked queue (2nd column): any non-zero value here means that processes
are waiting for a resource.
CPU
– %usr: percentage of CPU time spent in user code.
– %sys: percentage of CPU time spent in kernel code.
– %wio: percentage of CPU time spent waiting on disk/NFS I/O requests.
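As a rough illustration of the blocked-queue check, the sketch below filters vmstat output and prints only the samples where the second column is non-zero. The function name flag_blocked is hypothetical, and the column positions (run queue first, blocked queue second) are an assumption taken from the slide; verify them against the vmstat header on your platform.

```shell
# Print vmstat samples whose blocked-queue value (2nd column) is non-zero.
# Header and banner lines are skipped because their first two fields
# are not plain integers.
flag_blocked() {
  awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $2 > 0 {
         print "blocked=" $2 " run=" $1
       }'
}
```

It could be fed from a live run, for example: vmstat 5 12 | flag_blocked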
Some other vmstat options:
– On AIX, vmstat -I also collects file page I/O information ;
– On all platforms, vmstat -s prints a summary accumulated since system
startup, which shows whether paging and similar activity occurred earlier.
ps command.
– Prints the processes currently running on the machine ;
– Shows the CPU time accumulated by each process ;
– Useful to determine whether the DB2 engine is the highest CPU consumer ;
– In a DPF environment, helps determine which partition consumes the most CPU ;
ps aux
– Run it several times, at a fixed interval, to see how the accumulated
CPU time changes between samples ;
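Repeated sampling can be scripted. The following is a minimal sketch, not a definitive collector: snapshot_loop is a hypothetical helper that runs any command a given number of times at a given interval and saves each output to a numbered file, so that successive ps aux samples can be compared afterwards. The file naming and argument order are assumptions made for illustration.

```shell
# Run a command COUNT times, INTERVAL seconds apart, saving each output
# to OUTDIR/sample_<n>.out so that successive samples can be compared.
# Usage: snapshot_loop COUNT INTERVAL OUTDIR command [args...]
snapshot_loop() {
  sl_count=$1; sl_interval=$2; sl_outdir=$3
  shift 3
  mkdir -p "$sl_outdir" || return 1
  sl_i=1
  while [ "$sl_i" -le "$sl_count" ]; do
    # Capture both stdout and stderr of this sample.
    "$@" > "$sl_outdir/sample_$sl_i.out" 2>&1
    # Sleep between samples, but not after the last one.
    if [ "$sl_i" -lt "$sl_count" ]; then
      sleep "$sl_interval"
    fi
    sl_i=$((sl_i + 1))
  done
}
```

For example, snapshot_loop 5 10 /tmp/ps_samples ps aux would take five samples ten seconds apart; comparing the CPU time columns of the first and last files shows which processes accumulated the most CPU in between.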
Method 1:
– Take an application snapshot and look for the EDU in the output ;
– This gives the application id, the currently executing statement, row stats etc. ;
– The snapshot can be very slow during CPU bottleneck conditions ;
Method 2:
– Run db2pd -agents and grep for the EDU id identified from db2pd -edus ;
– This gives the application handle that the agent belongs to ;
– db2pd -applications and db2pd -dynamic then give the currently executing
statement ;
OR
– db2pd -apinfo <apphdl> (new option) gives the same info, similar to an
application snapshot ;
Problem statement:
“CPU spikes happen several times a day, at different times and last
only a few minutes…”
Discussion:
How do we collect diagnostic data?
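One possible approach to short, unpredictable spikes, offered only as a sketch and not necessarily the answer the presentation goes on to give, is to watch vmstat continuously and react when CPU usage crosses a threshold. The hypothetical detect_spike function below locates the idle column by its "id" header (an assumption about the vmstat header layout) and prints a marker for every sample whose busy percentage, computed as 100 minus idle, reaches the limit.

```shell
# Read vmstat output on stdin and print "SPIKE <busy%>" for each sample
# whose busy percentage (100 - idle) is at or above the given threshold.
# The idle column is found by its "id" header, because its position
# differs between platforms.
# Usage: vmstat 5 | detect_spike 90
detect_spike() {
  awk -v limit="$1" '
    idcol == 0 {
      # Still looking for the header line that names the idle column.
      for (i = 1; i <= NF; i++) if ($i == "id") idcol = i
      next
    }
    $1 ~ /^[0-9]+$/ {
      busy = 100 - $idcol
      if (busy >= limit) print "SPIKE " busy
    }
  '
}
```

Each marker could then trigger collection of the items listed under "What to collect?", so that data is captured while the spike is still in progress.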
Problem Statement
Large brokerage system at a bank ;
Application server maintains persistent DB connections ;
Database performance degrades dramatically during active trading ;
Problem goes away if the DB2 instance is restarted ;
Happens 2-3 times a day ;
No OS configuration changes ;
Database snapshot:
The customer’s new application had new logic added to a stored procedure
that was called from all applications ;
Under certain conditions this SP would raise a user-defined exception and
return it to the client application ;
The exception-handling code on the client side would close the cursor but
skip committing or rolling back the transaction.