We'll do the analysis for you!

Thread dump analysis is a key tool for performance tuning and troubleshooting
of Java-based applications. The current set of thread dump analysis tools
(Samurai/TDA) don't mine the thread dumps or provide a detailed view of what
each thread is doing. They limit themselves to reporting the thread state
(locked/waiting/running) and the lock information.

Most tools don't mention the type of activity within a thread. Should it be
treated as normal, or does it deserve a closer look? Can a pattern or
anti-pattern be applied against it? Are there possible optimizations? Are there
any hot spots? Can threads be classified based on their execution cycles?

We decided to create ThreadLogic to address these deficiencies by forking from
the existing open source TDA version 2.2 instead of reinventing the wheel,
leveraging the capabilities of TDA to parse the thread dumps and handle the UI.

Eric Gross built the support for JRockit (support for JRockit was partial in
base TDA v2.2) and IBM JVM thread dumps. Sabha Parameswaran added analytics:
grouping of threads based on functionality and tagging of threads with advisories
using predefined rules and patterns, which can be extended to handle additional
patterns. In-depth handling and analysis of WebLogic Server thread dumps is
built into the tool.

We wish to thank Ingo Rockel, Robert Whitehurst and the numerous others who
contributed to the original TDA, which allowed us to build on their work in
delivering a more powerful tool for the entire Java community.

Once a thread dump is parsed and the thread details are populated, each
thread is analyzed against matching advisories and tagged appropriately.
The threads are also associated with specific Thread Groups based on
functionality or thread group name. Both the advisories and the grouping are
managed via XML definition files, which can be modified or extended.

Each advisory has a health level indicating the severity of the issue found,
a pattern, a name, a keyword and related advice.
Samples of advisories:

Thread Advisory Name   WLS JMS Paging
Health Level           FATAL
Keyword                MessageHandle.setPagingInProgress
Description            WebLogic JMS paging messages to disk
Advice                 WLS has started paging messages to disk as consumers cannot keep
                       up with producers and messages have started accumulating;
                       Increase, speed or tune consumers or Introduce flow controls/quotas
                       to slow down producers and Inflow rates. Or Increase number of
                       servers to spread the load.

Thread Advisory Name   Web Application Bottleneck
Health Level           WARNING
Keyword                WebLayerBlocked
Description            Web Application is waiting for an Event
Advice                 Web Application should not go into WAIT state as it means the end
                       user would have to wait for indeterminate time for a synchronous
                       response, change the code or logic to return the results or response
                       right away instead of blocking or waiting for an event.

Each advisory is triggered either by call execution patterns observed in the
thread stack or by the presence of other conditions (for example, a blocked
thread or multiple threads blocked for the same lock can trigger the
BlockedThreads advisory). Sometimes a thread might be tagged as IGNORE or
NORMAL based on its execution logic, or tagged more specifically as a JMS send
or receive client or a Servlet thread. The advisories are generated from an
advisory XML map that is extensible.
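
As a rough illustration of keyword-based triggering, the sketch below tags a thread with every advisory whose keyword appears in one of its stack frames. It is not ThreadLogic's actual code: the `Advisory` class, the `match_advisories` function, the plain substring matching, and the sample stack frames are all assumptions made for the example. Only the advisory names, health levels and keywords are taken from this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    # Hypothetical in-memory form of one entry of the advisory map.
    name: str
    health: str
    keyword: str


# Two advisories quoted in this article; the real tool loads them from XML.
ADVISORIES = [
    Advisory("WLS JMS Paging", "FATAL", "MessageHandle.setPagingInProgress"),
    Advisory("EOF Exception in socket read", "WARNING",
             "SocketMuxer.deliverEndOfStream"),
]


def match_advisories(stack_frames, advisories=ADVISORIES):
    """Return the advisories whose keyword occurs in any stack frame.

    Substring matching is an assumption; the real matching rules may differ.
    """
    return [a for a in advisories
            if any(a.keyword in frame for frame in stack_frames)]


# Hypothetical stack frames, written only to exercise the matcher.
stack = [
    "at weblogic.socket.SocketMuxer.deliverEndOfStream(SocketMuxer.java)",
    "at weblogic.socket.SocketMuxer.readReadySocket(SocketMuxer.java)",
]
print([a.name for a in match_advisories(stack)])
```

Condition-based advisories such as BlockedThreads would need lock information in addition to the stack text, which this sketch leaves out.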

The health levels, in descending order of severity, are FATAL (meant for
deadlocks, STUCK threads, a blocked Finalizer etc.), WARNING, WATCH (worth
watching), NORMAL and IGNORE. The highest severity found among the threads
within a group is promoted to become the Thread Group's health level, and the
same is repeated at the thread dump level. Multiple advisories can be tagged
to a Thread, Thread Group and Thread Dump.
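
The promotion rule amounts to taking a maximum over an ordered set of levels. The following minimal sketch assumes the five levels can be modeled as an ordered enum; the `Health` enum, the `promote` function and the sample group names and data are hypothetical and are not taken from ThreadLogic's source.

```python
from enum import IntEnum


class Health(IntEnum):
    # Ordered from least to most severe, as listed in the article.
    IGNORE = 0
    NORMAL = 1
    WATCH = 2
    WARNING = 3
    FATAL = 4


def promote(levels, default=Health.IGNORE):
    """Return the most severe level, or the default for an empty collection."""
    return max(levels, default=default)


# Hypothetical per-thread health levels, keyed by thread group.
groups = {
    "Muxer": [Health.NORMAL, Health.WARNING],
    "JMS": [Health.FATAL, Health.WATCH],
    "JVM": [Health.IGNORE],
}

# Thread level -> group level -> dump level.
group_health = {name: promote(levels) for name, levels in groups.items()}
dump_health = promote(group_health.values())
print(group_health["Muxer"].name, dump_health.name)
```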

<Advisory>
<Name>EOF Exception in socket read</Name>
<Health>WARNING</Health>
<Keyword>SocketMuxer.deliverEndOfStream</Keyword>
<Descrp>WLS Muxer got an abrupt End of Stream while reading from a Socket</Descrp>
<Advice>Check for connection disruptions between Server and Client (or other server
instances)</Advice>
</Advisory>
Snapshot of Advisory Map

Snapshot of Threads tagged with advisories in the thread dump

Threads in a thread dump tagged with Advisories/Health Levels


Thread Groups Summary

The threads are associated with thread groups based on functionality or
thread names. Additional patterns exist to tag other threads (like iWay Adapter,
SAP, Tibco threads) and group them. The summary page reports the health level
of each group, the total number of threads, the number of blocked threads, the
critical advisories found etc.

The grouping is managed by group definition XML files that specify patterns for
matching threads to specific groups. A group can be simple (match a set of
patterns) or complex (include some groups while excluding others). A set of
advisories can also be marked as ignorable, or excluded when determining the
health of a thread or group.

<SimpleGroup>
<Name>Oracle Service Bus (OSB)</Name>
<Visible>true</Visible>
<Inclusion>true</Inclusion>
<MatchLocation>stack</MatchLocation>
<PatternList>
<Pattern>com.bea.wli.sb.transports</Pattern>
<Pattern>com.bea.wli.sb.pipeline</Pattern>
</PatternList>
</SimpleGroup>

<ComplexGroup>
<Name>Oracle AQ Adapter</Name>
<Visible>true</Visible>
<Inclusions>
<SimpleGroupId>Oracle AQ AdapterTemp</SimpleGroupId>
</Inclusions>
<Exclusions>
<SimpleGroupId>Oracle SOA DFW</SimpleGroupId>
</Exclusions>
<ExcludedAdvisories>
<AdvisoryId>Database Query Execution</AdvisoryId>
<AdvisoryId>Socket Read</AdvisoryId>
</ExcludedAdvisories>
</ComplexGroup>
Thread Groups Summary
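
The simple and complex group semantics described above can be sketched as follows. This is only an illustration under assumptions: patterns are treated as plain substrings, `"name"` is assumed as the alternative to the `stack` match location, and the function names, the `AdapterTemp` and `Diagnostics` groups, their `com.example` patterns and the sample threads are hypothetical. Only the OSB patterns come from the definition shown above.

```python
def in_simple_group(thread, patterns, location="stack"):
    """True if any pattern occurs in the thread's stack (or in its name)."""
    text = thread["name"] if location == "name" else "\n".join(thread["stack"])
    return any(p in text for p in patterns)


# group id -> (patterns, match location)
SIMPLE_GROUPS = {
    "Oracle Service Bus (OSB)": (
        ["com.bea.wli.sb.transports", "com.bea.wli.sb.pipeline"], "stack"),
    "AdapterTemp": (["com.example.adapter"], "stack"),   # hypothetical
    "Diagnostics": (["com.example.diag"], "stack"),      # hypothetical
}


def in_complex_group(thread, inclusions, exclusions, simple_groups=SIMPLE_GROUPS):
    """Member of at least one included group and of no excluded group."""
    def hit(group_id):
        patterns, location = simple_groups[group_id]
        return in_simple_group(thread, patterns, location)
    return (any(hit(g) for g in inclusions)
            and not any(hit(g) for g in exclusions))


# Hypothetical threads used to exercise the matchers.
osb_thread = {"name": "ExecuteThread-1",
              "stack": ["at com.bea.wli.sb.pipeline.Router.route(Router.java)"]}
adapter_thread = {"name": "Worker-1",
                  "stack": ["at com.example.adapter.Poller.run(Poller.java)"]}
mixed_thread = {"name": "Worker-2",
                "stack": ["at com.example.adapter.Poller.run(Poller.java)",
                          "at com.example.diag.Dumper.dump(Dumper.java)"]}

print(in_simple_group(osb_thread, *SIMPLE_GROUPS["Oracle Service Bus (OSB)"]))
print(in_complex_group(adapter_thread, ["AdapterTemp"], ["Diagnostics"]))
print(in_complex_group(mixed_thread, ["AdapterTemp"], ["Diagnostics"]))
```

Excluded advisories are not modeled here; they would filter a thread's advisory list before its health level is computed.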

Critical Advisories per thread group


The critical advisories (at Warning/Fatal health levels) found in individual threads
are then promoted to the parent thread group and reported in the thread group
summary page.

Critical Advisories for Thread Group


Thread Groups
One can see the thread groups are divided into two buckets - WLS and non-WLS
related threads. The JVM threads, LDAP and other unknown custom threads go
under the non-WLS bucket while all the WLS, Muxer, ADF, Coherence, Oracle,
SOA, JMS, Oracle Adapter threads are all under the WLS bucket. The
classification can be changed by modifying the GroupsDefn xml files.
Individual Thread tagging with Advisories

Clicking on the individual threads will display the advisories and thread stack.

Advisories and details at thread level

The details of the advisory will pop up on mouse over on the advisory links.
The Advisories are color coded and details can be highlighted.

Color coded advisories for individual threads

Sub-groups are also created within individual Thread Groups based on warning
levels, hot call patterns (multiple threads executing the same code section),
threads doing remote I/O (socket or DB reads) etc.
The following snapshot shows an example of a hot call pattern where multiple
threads are executing the same code path (all are attempting to lock a Queue
instance).

Hot Call Pattern - multiple threads exhibiting similar code execution
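
One simple way to approximate hot call pattern detection is to bucket threads by their topmost stack frames and keep the buckets shared by several threads. The sketch below assumes exactly that heuristic; the `hot_call_patterns` function, its `depth` and `min_threads` parameters and the sample frames are hypothetical, and ThreadLogic's own detection may work differently.

```python
from collections import defaultdict


def hot_call_patterns(threads, depth=3, min_threads=2):
    """Group thread names by their top `depth` frames.

    `threads` maps thread name -> list of stack frames (top frame first).
    Only frame sequences shared by at least `min_threads` threads are kept.
    """
    buckets = defaultdict(list)
    for name, stack in threads.items():
        if stack:  # threads without a stack cannot form a call pattern
            buckets[tuple(stack[:depth])].append(name)
    return {frames: names for frames, names in buckets.items()
            if len(names) >= min_threads}


# Hypothetical dump: two threads contend in the same code path.
threads = {
    "worker-1": ["com.example.Queue.lock", "com.example.Queue.put", "com.example.Job.run"],
    "worker-2": ["com.example.Queue.lock", "com.example.Queue.put", "com.example.Job.run"],
    "worker-3": ["java.net.SocketInputStream.read", "com.example.Client.recv"],
    "idle-1": [],
}
for frames, names in hot_call_patterns(threads).items():
    print(names, "share", frames[0])
```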


Dynamic Filtering based on Thread Health

It's also possible to view just a subset of threads based on health levels by
using the top-level Minimum Health Level option.

Threads at IGNORE or higher health levels

Threads at FATAL health level
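
Conceptually, the Minimum Health Level option is a threshold filter over the ordered health levels. A minimal sketch, assuming threads are available as (name, health) pairs; the `filter_by_min_health` function and the sample thread names are hypothetical.

```python
# Least to most severe, following the order given in the article.
LEVELS = ["IGNORE", "NORMAL", "WATCH", "WARNING", "FATAL"]


def filter_by_min_health(threads, minimum):
    """Keep the names of threads whose health is at or above `minimum`."""
    floor = LEVELS.index(minimum)
    return [name for name, health in threads
            if LEVELS.index(health) >= floor]


# Hypothetical tagged threads.
tagged = [("Finalizer", "FATAL"), ("Muxer-0", "WARNING"),
          ("Timer-1", "IGNORE"), ("Servlet-7", "WATCH")]
print(filter_by_min_health(tagged, "WARNING"))
```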


Merging of threads across multiple thread dumps and reporting of
progress in the thread state

Merge has been enhanced to report on the progress of each thread across the
thread dumps. Following the order of the thread dumps, the stack traces of each
thread are compared between every pair of consecutive thread dumps.
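
The consecutive comparison can be pictured with the sketch below, which marks a thread as making no progress between two dumps when its stack trace is identical in both. The `progress_report` function, the "PROGRESS" / "NO PROGRESS" labels and the sample dumps are assumptions for illustration; the labels and comparison rules actually used by ThreadLogic may differ.

```python
def progress_report(dumps):
    """Compare each thread's stack across consecutive dumps.

    `dumps` is a chronologically ordered list of dicts mapping
    thread name -> list of stack frames. Threads missing from either
    dump of a pair are skipped for that pair.
    """
    report = {}
    for earlier, later in zip(dumps, dumps[1:]):
        for name in earlier.keys() & later.keys():
            same = earlier[name] == later[name]
            report.setdefault(name, []).append(
                "NO PROGRESS" if same else "PROGRESS")
    return report


# Three hypothetical dumps taken in sequence.
dumps = [
    {"t1": ["a", "b"], "t2": ["x"]},
    {"t1": ["a", "b"], "t2": ["y"]},
    {"t1": ["c"], "t2": ["y"], "t3": ["z"]},
]
report = progress_report(dumps)
print(report["t1"], report["t2"])
```

A thread that shows no progress over several consecutive dumps is the kind of candidate worth a closer look, although an unchanged stack can also simply mean an idle thread waiting for work.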

Merged view showing progress information for individual threads


Merged reporting of individual thread stack traces, which already exists in base
TDA version 2.2, is also available.

Merged Thread stack traces across thread dumps

Merging can also be done across multiple thread dump log files (as in the case
of IBM, which creates a new log file containing the thread dump every time a
request is issued).
Usability benefits of ThreadLogic

Thanks to the advisories and health levels, it's easy for users to quickly
understand the usage patterns, hot spots and thread groups, and to spot the
patterns or anti-patterns already implemented in the advisory list.
An example of an anti-pattern: a Servlet thread should not be waiting for an
event to occur, as this translates to bad performance for the end user.
Similarly, synchronous JMS consumers might be less performant than async
consumers. Too many WLS Muxer threads are not advisable. If WLS Muxer or
Finalizer threads are blocked for unrelated locks, this is a fatal condition.
It is okay to ignore the STUCK warning issued by WLS Server for pollers like
AQ Adapter threads, but not for other threads that are handling servlet
requests.

The thread groups help in bunching together related threads; so SOA Suite users
can see how many BPEL Invoke and Engine threads are getting used, B2B users
can see number of JMS consumers/producers, WLS users can look at condition
and health of Muxer threads, similarly for JVM/Coherence/LDAP/other thread
groups.

The merged report lets the user see at a glance the critical threads and check if
they are progressing or not instead of wading through numerous threads and
associated thread dumps.

We hope this tool can really help both beginners and experts do their jobs more
quickly and efficiently when it comes to thread dumps.
