





All rights reserved. 2015 ZeroTurnaround Inc.



AN INTRODUCTION to Performance

Demystifying performance. It's all just data.

Java Monitoring Tools

Java Profilers


Performance Testing Tools


Looks great on paper, but does it work?


Performance Issues in Action


Moving Theory to Practice


The Paradox of Choice









That is why you need to understand your application thoroughly, its needs
and abilities, and feel comfortable using a range of performance-related
tools: APMs, profilers, and testing libraries that will help you solve
the issues you have.
When trying to pin down the top factors impacting application
performance, the right answer is that there is no right answer...
the source of a performance problem could be almost anywhere!
Research Director, Application Management,
Enterprise Management Associates (EMA)


An Introduction to Performance
Java Performance tools are all very different and have typically been created
for different reasons and to achieve different goals. Which Java Performance
tools are you going to use in your next project and why would you choose
one over the other? There are many aspects which may sway your decision
and of course it will depend on the type of application you're building. This
report will cover Java Performance Considerations, Java Monitoring Tools,
Java Profilers, and Performance Testing Tools. We will also demonstrate a
few of our favorite Java Performance tools on a reference application to
help you get the answers you seek.
If you are not actively looking for answers to your performance issues
yet, consider this simple fact: performance affects your bottom line. The
performance of your system often directly translates into its utility for
the end user. Keeping your end users happy can have a major effect on
your bottom line. For instance, you may lose business if your e-commerce
site cannot handle the Black Friday loads, or if your business is high
performance trading, a delay of a couple of milliseconds could be the
difference between covering your body in gold leaf or simply just leaves.


Google exposed the fundamental importance of performance to end users
when they revealed that people engage with web pages more when they
load faster. More engagement means more conversions and returning
customers. And despite the fact that you can always compete with others
on better service, prices or whatever else you have to offer, not tapping into
the additional benefits that a faster system can offer is just not very smart.
Obviously, if you're a developer at a legacy software shop, you might not
bother yourself with such matters. You are proud of the work that you do
and the quality of the code you produce. Your end users are effectively the
operations team who have a very particular set of skills that sometimes
make them a nightmare for developers such as yourself. Your code's
performance can make the difference between them creating a developer
shrine for you, the ideal developer, or just printing out your profile picture
to be used as dartboard fodder.

It might seem like an unusual question in a report about performance
as most people tend to already have a good idea about what the term
performance means. However, it's common for different people to describe
it from different perspectives. For instance, some
may say being performant is doing the same task with fewer resources.
This might be by design, by choosing a more lightweight stack rather
than just increasing system hardware. Others may approach the topic
differently, by trying to eliminate bottlenecks, i.e. the part of the system
which is performing least well. Others might say increasing performance
is to eliminate unnecessary actions. The truth is, well, all of these are
performance related actions. Ultimately being performant is about
improving user response times and reducing latency in all parts of your
system and as a whole, while being functionally accurate and consistent to
your end user. Now the question is how?
In this report, we will not take sides or focus on the scalability of the
system or try to make it run as fast as a caffeinated cheetah on a single-core
machine. Instead, we'll look at the different tools and techniques
that allow you to understand the balance between the resources your
system has and how it utilizes them: where does your system perform
most of the work and where should you look first if you need to tweak the

This is a great place to mention low-level code performance and
benchmarking. For instance, is x++ faster than x=x+1? How much
slower is an ArrayList at appending millions of items compared
to a LinkedList? This is a very code-centric approach and while
it has a place in the software engineering ecosystem and is a very
fascinating area of developer growth, we will not focus on the
benchmarking tools or ways to solve low-level code performance
questions in this report. Sorry, fans of tail-call optimizations
and on-stack replacements, we'll cover it another time. Today
we're focusing on the high-level overview of system performance
and establishing the right balance between the resources and


Head of RebelLabs,
Content Warlock at ZeroTurnaround

In the next chapter we'll look at a list of resources you have to take into
account when talking about performance and define the functional
requirements of their utilization.



In general, there are just a few types of resources that are key to the
performance of a runtime system. Typically, these are the main culprits:

CPU - computing power, CPU and GPU speeds, are they up to the job for your system?

Memory - Cache, RAM, without enough system memory, are you paging too much?

IO / Network - is your system writing data to disks or creating remote connections?

Database - Are your database queries taking too long to execute? And do you even know how many you are creating?

Yes, we know database access could be categorized as IO, but database
queries and database speed have an immense impact on the performance of
your system, and are very often one of the key IO bottlenecks, so
we're giving database access its own category here.

The functional performance requirements of the system can be expressed
through the following requirement categories.
Throughput. How many concurrent users can the system
handle at once? For a web application that would be the number
of logged in users or the number of concurrent requests your
system can serve at once.
Responsiveness and Latency. How long does the end user
have to wait to get their request served? Responsiveness
is how quickly a system begins to process a request and reacts
to an action of the user. Latency describes the time needed to
finish the processing of the request.
Scalability. How resilient is your system when the number
of concurrent users scales? What should it do when the real
world unleashes its powers onto your software ("worked on
my machine") and you get more concurrent users than you
expected, or your server just can't cope with the load
and goes down? Do you intentionally bring the system down,
serve everybody slower or throttle the throughput artificially?
Resource consumption. Most definitely, your system will not
deal only with serving users. You'll need to think about
which activities or tasks in your system will clog the CPU and
influence the throughput other than the number of users.
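
To make the throughput and latency categories a little more tangible, here is a minimal sketch of how one might put numbers on them for a single-threaded batch of requests. The class name, the `simulatedRequest` workload and the choice of requests per second as the throughput unit are all assumptions made for illustration; a real measurement would use a proper load testing tool and concurrent clients.

```java
final class ThroughputAndLatency {

    /** Aggregate numbers for one measured batch of requests. */
    static final class Result {
        final double requestsPerSecond;
        final double meanLatencyMillis;

        Result(double requestsPerSecond, double meanLatencyMillis) {
            this.requestsPerSecond = requestsPerSecond;
            this.meanLatencyMillis = meanLatencyMillis;
        }
    }

    /** Runs the request sequentially `count` times and times each run. */
    static Result measure(Runnable request, int count) {
        long totalLatencyNanos = 0;
        long batchStart = System.nanoTime();
        for (int i = 0; i < count; i++) {
            long start = System.nanoTime();
            request.run();
            totalLatencyNanos += System.nanoTime() - start;
        }
        // Guard against a zero-length batch on very coarse clocks.
        long elapsedNanos = Math.max(1, System.nanoTime() - batchStart);
        double requestsPerSecond = count / (elapsedNanos / 1e9);
        double meanLatencyMillis = (totalLatencyNanos / 1e6) / count;
        return new Result(requestsPerSecond, meanLatencyMillis);
    }

    /** Hypothetical stand-in for real work: sleeps for about 2 ms. */
    static void simulatedRequest() {
        try {
            Thread.sleep(2);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Result result = measure(ThroughputAndLatency::simulatedRequest, 50);
        System.out.printf("throughput: %.1f req/s, mean latency: %.2f ms%n",
                result.requestsPerSecond, result.meanLatencyMillis);
    }
}
```

Because the requests run one after another here, throughput is simply the inverse of latency. The two only diverge once requests are served concurrently, which is where the scalability questions start to matter.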


Here's a small translation of common terms that your manager might use and what they actually
mean in real performance terms:


What Managers say                      What Managers mean
Startup time                           CPU resource consumption
RAM footprint or garbage collection    Memory resource consumption
User-perceived performance             Responsiveness and latency

In general, almost any question about performance can be postulated in terms of the above
mentioned resources and requirements.


Gather data, stare at it intensely, then go and fix performance
problems. Sounds easy enough. Let's dig into the variety of tools
that help you on this path.
When it comes to Java performance tools that can help to optimize performance across
these areas, most fall into three major categories: Monitoring, Testing and Profiling. Java
Profiling and Monitoring both help measure and optimize performance during runtime.
Performance testing helps to show where your development efforts were not sympathetic
to real life, heavily loaded production environments. In this chapter we'll look into the
tools that are available today, their strengths, the features they offer and also how they
find the culprits of any performance issues.

Java Monitoring Tools

Application monitoring tools help answer the question: do we have any
problems with our deployment? If you think about it, it's not a simple
question to answer. First of all, in a large enough environment, the law of
small numbers starts messing with you: even the most improbable events
happen all the time. Do you have a server on fire? Yes. Did the janitor
unplug the server to turn on the vacuum cleaner? Yes. You get the idea.
However, among all this chaos, you have to find real problems, or even
better predict the real problems before they occur.
The problem is compounded further beyond development, as production
environments usually have a complex mix of various services that are
carefully balanced to work together: databases, messaging queues,
enterprise service buses, application servers, multiple system components
working as one in a distributed and asynchronous manner. Monitoring
the health of just a single component is often not enough. It might provide
enough insight into what went wrong when the error has already occurred
and the system has suffered. But the true art of monitoring is to detect
the problems before they accumulate and come after you. Also note that
your environment is like a living ecosystem. If you change one area, don't
expect surrounding services to act as they did before. It's very much a
change-monitor-evaluate-repeat process.


Cost: $$$ [contact sales]

Dynatrace is currently the APM (Application Performance
Management) with the largest market share. For Java
applications, Dynatrace diagnoses and reports several types of
system events as well as resource consumption, including:
Memory and Thread Diagnostics
Logging and Exception Analytics
Root Cause Identification of slowness
VM Health and Performance
Automated Transaction Discovery, Mapping, and Monitoring
Database and Connection pool usage

Dynatrace automatically discovers Java
transactions and auto-models your application,
giving you a simple visual way to find out which
dependencies exist between JVMs, where time
is spent, and where problems exist. From this
high-level picture of your system you can drill
down into method-level details to see method
arguments, return values, SQL statements,
exceptions, log messages and so forth.
Combining the high-level overview with the
ability to show the smallest details of the code
gives you great power over the system.


One of Dynatrace's unique aspects is that it
eliminates false positives and erroneous alerts
by setting notifications on absolute and relative
deviation from percentiles, rather than averages.
Smart baselining automatically interprets the
statistical characteristics of response times,
failure rates and throughput, and employs
advanced statistical models to analyze
application behavior. This substantially reduces
the cost of deploying and managing your
application in a complex enterprise environment.
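
To see why percentiles can be a better alerting baseline than averages, consider the small sketch below. It is not Dynatrace's algorithm; it only compares the mean with nearest-rank percentiles on a made-up set of response times, and the class and method names are hypothetical.

```java
import java.util.Arrays;

final class PercentileVsAverage {

    static double mean(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return sum / (double) samples.length;
    }

    /** Nearest-rank percentile; `percent` is an integer between 1 and 100. */
    static long percentile(long[] samples, int percent) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Integer ceiling of percent * n / 100 avoids floating point rounding.
        int rank = (percent * sorted.length + 99) / 100;
        return sorted[Math.max(0, rank - 1)];
    }

    /** Made-up data: eighteen fast responses (100 ms) and two slow ones (2000 ms). */
    static long[] exampleSamples() {
        long[] samples = new long[20];
        Arrays.fill(samples, 100L);
        samples[18] = 2000L;
        samples[19] = 2000L;
        return samples;
    }

    public static void main(String[] args) {
        long[] samples = exampleSamples();
        System.out.println("mean = " + mean(samples) + " ms");           // 290.0
        System.out.println("p50  = " + percentile(samples, 50) + " ms"); // 100
        System.out.println("p95  = " + percentile(samples, 95) + " ms"); // 2000
    }
}
```

The mean of 290 ms describes none of the actual requests: the typical user sees 100 ms and one user in ten waits 2000 ms. An alert on the median or the 95th percentile tracks those two groups separately.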

If you have a heterogeneous environment you'll
need a complex tool to monitor it as a whole.
You cannot collect metrics in a small corner
of a big system and hope the data is relevant
to the performance of the whole system.
Dynatrace gives you a comprehensive view of
your entire system, so you can figure out if any
given bottleneck is worth resolving to increase
the throughput or not. If it's not the biggest
bottleneck in your environment, it shouldn't
be the first one on your list to fix, as it might
not even be there when you fix the biggest
problem. Cause and effect changes are hard to
predict without a lot of experience and careful




[contact sales]

AppDynamics is a company with a complex portfolio of
application monitoring software: Application Performance
Management, Mobile Real User Monitoring, Database Monitoring,
Application Analytics, etc. AppDynamics takes pride in being able
to handle the most complex deployment topologies and weird
heterogeneous setups, supporting a range of programming
languages like .NET, Python, Ruby, and Java.


AppDynamics has countless integrations with
Java EE and various frameworks like Spring,
Wicket and Struts to enable intelligent data
gathering. These frameworks provide an entry
point into your application, so AppDynamics
can monitor the transaction through your
whole system just by knowing the underlying
technology. Moreover, the AppDynamics agent
can discover the topology of your application
automagically and visualise it for you in a great
looking dashboard.

All in all, dozens of integrations and automatic
configuration with intelligent self-learning
algorithms make AppDynamics a very
interesting APM solution.
AppDynamics agents use machine learning
algorithms to auto-configure themselves for the
minimal overhead that they can impose on the
system without making it perceivably slower for
the end-users.

Additionally, AppDynamics keeps the overhead
minimal when monitoring the
average transactions in your system. But
when something extraordinary happens and a
business transaction takes longer, more data is
collected to provide additional information about
something you actually want to investigate.

CPU monitoring, slow transactions,
memory management with automatic memory
leak detection, integration and monitoring
with the background task libraries: almost
everything in the Java ecosystem is integrated
into their solution.
Additionally, AppDynamics provides standalone
agent monitoring that gathers hardware
information to monitor arbitrary machines
in the system such as load, free memory,
I/O and network stats. You can do all that with
the command line tools, but you can also have
this information easily integrated into the
common dashboard.





New Relic is an application monitoring solution that is incredibly
easy to install into your environment and immediately gathers
valuable information about your system's health. The monitoring
system consists of two parts: a javaagent that instruments your
application to collect data and a hosted service for analyzing
the data and presenting the reports.



The New Relic APM tool monitors application response times,
your most time consuming transactions and even measures the
performance of external services! The instrumentation captures
calls to out-of-process services such as web services, resources
in the cloud, and any other network calls. The external services
dashboard provides charts with your top five external services by
response time and external calls per minute.


New Relic also provides the following
functionality, as detailed on their website.
End-to-end transaction tracing - Follow the
performance of a critical transaction across your
entire service-oriented application environment.
Code-level visibility - Drill down to see the
performance impact of specific code segments
and SQL statements.
Key transactions - Flag your most critical
transactions to quickly spot when things like
response times, call counts, or error rates
perform poorly.
X-ray sessions - Gain deeper insight into a
key transaction's performance by showing
transaction traces alongside long-running
profiler results.
Let's cut through the marketing and understand
what this actually means. With X-ray
sessions, you can get deeper insight into the
performance of your most valuable transactions by
collecting transaction traces alongside
long-running profiler results. So instead of showing
the aggregate information, the exact times of
a single transaction will be available to you.
This makes the analysis of the performance
data much easier than when operating on the
sampled values.

Also, with access to the Performance Data API,
you can create customized queries on the data
it collects, including application server response
times, page load times, and the number of
transactions and page loads in your requests.
You can also query the data about error rates
and application server performance. This
allows you to easily build custom dashboards
and enjoy your application monitoring the way
you want it.




Plumbr is a Java Performance Monitoring tool by the company of
the same name, which runs on your JVM process. Plumbr monitors
things such as memory leaks, garbage collection inefficiencies
and locked threads. The product has evolved since its first launch,
from a dedicated tool that specifically targets memory leaks in a
JVM to an overall monitoring tool which you would use 24/7.


Plumbr runs as a javaagent on your JVM. A certain amount of
instrumentation is needed for Plumbr to keep its own
bookkeeping of the objects created in the JVM (for its
memory leak detection), as this information is not exposed to Java
agents by the JVM. This does incur a performance cost of around
20-25% heap and CPU overhead.


Let's look at the three areas which Plumbr
focuses on in more depth.
A memory leak occurs when a data structure
cannot be garbage collected because something
still references it, leading to an uncontrolled
collection of data structures filling up your heap
and eventually bringing your JVM down with a not
so pleasant OutOfMemoryError. Plumbr detects the signs of
data structures that should be cleaned up but
are left in the heap, alerts the user to this with a
root cause of the problem and proposes a fix.
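
As an illustration of the kind of leak described above (not of how Plumbr detects it), the sketch below shows a long-lived collection that only ever grows, next to a bounded alternative. The class, the field and the method names are hypothetical; the bounded cache relies on the standard `LinkedHashMap.removeEldestEntry` hook.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LeakSketch {

    // Leaky: a static list is reachable for the lifetime of the class,
    // so nothing added to it can ever be garbage collected.
    static final List<byte[]> LEAKY_CACHE = new ArrayList<>();

    /** Hypothetical request handler that "caches" 1 KB per call and never evicts. */
    static void handleRequestLeaky() {
        LEAKY_CACHE.add(new byte[1024]);
    }

    /** Bounded alternative: evicts the least recently used entry past maxEntries. */
    static <K, V> Map<K, V> boundedCache(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, byte[]> bounded = boundedCache(10);
        for (int i = 0; i < 100; i++) {
            handleRequestLeaky();
            bounded.put(i, new byte[1024]);
        }
        System.out.println("leaky entries:   " + LEAKY_CACHE.size());
        System.out.println("bounded entries: " + bounded.size());
    }
}
```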
Garbage collection inefficiencies are all about
the JVM pauses that happen while garbage
collection takes place. Imagine if when the
garbage truck arrives at your house, you weren't
allowed to move until your trash and recyclables
were removed from your doorstep. This is
exactly what a JVM pause is like! The garbage
collector periodically looks through the heap
freeing up memory by removing data structures
and objects which are no longer required or
used by the runtime. When this occurs, the rest
of the JVM mostly twiddles its thumbs waiting
for it to complete. Plumbr detects unusually
large pauses and recommends config changes
to improve the garbage collection efficiency.
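
You can get a first, rough impression of garbage collection activity yourself through the standard `java.lang.management` API. The sketch below sums collection counts and accumulated collection time across all collectors; the class name is hypothetical, and note that the reported time is the collectors' accumulated elapsed time, which for concurrent collectors is not necessarily the same as application pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcStats {

    /** Total number of collections so far, ignoring collectors that report -1 (undefined). */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("total: " + totalGcCount() + " collections, "
                + totalGcTimeMillis() + " ms");
    }
}
```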


Threads can be locked for a number of
reasons, and locking is actually a good thing: when
implemented successfully, it guarantees the
integrity and consistency of your data access.
Locks are, however, one of the most expensive
operations since Hollywood made plastic surgery
a commodity. Plumbr detects which threads are
being locked, which locks are being contended
and can provide root causes for the lock itself.
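
The JVM itself exposes some basic information about blocked and deadlocked threads through `ThreadMXBean`. The sketch below is only a starting point under that assumption and says nothing about how Plumbr works internally; the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class LockReport {

    /** Number of threads currently stuck in a deadlock, 0 if there is none. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks,
        // but is only available when the JVM supports synchronizer monitoring.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    /** How many times live threads have blocked waiting for a monitor so far. */
    static long totalBlockedCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long total = 0;
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info != null) { // a thread may have died in the meantime
                total += info.getBlockedCount();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
        System.out.println("blocked count:      " + totalBlockedCount());
    }
}
```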

The defining feature of Plumbr is not
in the detection of the issues we've discussed,
but in the actions after detection. Plumbr pride
themselves on giving the user/developer the
necessary information they need to fix the
problems they find. In some cases they even
give the exact solution on how to change the
code to fix the problem.





Illuminate is a Java Performance tool that gathers analytics
based on machine learning. Illuminate focuses on the entire
environment, searching for bottlenecks that could be affecting
system performance. You are also able to enter your SLA data
into the tool, so that if you are ever in breach of any of your
agreements, Illuminate will be sure to tell you (before someone
else does)!


The areas which Illuminate monitors are pretty
wide, meaning if the bottleneck does not
reside within your application, the tool is still
very useful in telling you where your problem
may exist, from heavy disk I/O to CPU context
switching. Illuminate is implemented as a
daemon on your server machine(s) that passes
detailed performance information back to an
aggregator which collates information and
makes it available via a dashboard UI.

The JClarity team took their years of human
experience in performance, combined it with
empirical data and used a variety of machine
learning approaches to determine root causes
and bottlenecks when provided with large
amounts of data in an overall system. This
style of machine learning we'll call Pepperdyne.
It's Skynet's Cyberdyne crossed with JClarity's
performance expert, Kirk Pepperdine! With a
near infinite number of cause and effect style
issues that can be associated with changes to a
complex system, using the well defined space
which the machine learning data provides allows
Illuminate to be more accurate and deterministic
with its identification of root causes.

Because of Illuminate's breadth, its features
stretch beyond just the JVM. In fact, Illuminate
will give you information about high pause
times in your application due to garbage
collection in your JVM, OOM warnings if your
application is getting close to popping, heavy
disk I/O, RAM & swap space data, external
delays, such as DB access, 3rd party sources,
blocked, deadlocked or sleeping threads, CPU
context switching and hot (infinite) loops in
code. Wow, that's quite a lot, right? One of the
main, and perhaps the most unique, features is the
machine learning.



(FREE for development)

Java Mission Control is a Java performance monitoring tool by
Oracle which has been shipped with the JDK since Java version 7
update 40. It consists of two parts, a JMX Console and Java Flight
Recorder. JMX Console allows you to grab live information directly
from the runtime, providing you with the UI to query and change
certain aspects of the runtime. The Flight Recorder is more of a
history style tool. You can choose to run the flight recorder for an
amount of time, which will log data and metrics about the JVM.
After this, you can review the data from this period to drill down
and diagnose certain performance problems.



Java Mission Control works by interacting with
a JMX agent in the JVM which has an MBean
server that integrates with the built-in VM and
app instrumentation running in the JVM.
That's a key advantage as it really does lower
the overhead cost of the tool as it's using
pre-existing hooks. Oracle states this is normally well
below a 1% overhead.
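
To get a feel for the kind of data an MBean server exposes, here is a minimal sketch that reads two standard platform MBean attributes from inside the running JVM. It does not use or reproduce Java Mission Control itself; the class and method names are hypothetical, while `java.lang:type=Memory` and `java.lang:type=Threading` are the standard platform MBean names.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

final class MBeanPeek {

    /** Reads the "used" field of the HeapMemoryUsage attribute, in bytes. */
    static long heapUsedBytes() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            CompositeData usage = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            return (Long) usage.get("used");
        } catch (JMException e) {
            throw new IllegalStateException("Could not read the Memory MBean", e);
        }
    }

    /** Reads the current number of live threads from the Threading MBean. */
    static int liveThreadCount() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName threading = new ObjectName("java.lang:type=Threading");
            return (Integer) server.getAttribute(threading, "ThreadCount");
        } catch (JMException e) {
            throw new IllegalStateException("Could not read the Threading MBean", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("heap used:    " + heapUsedBytes() + " bytes");
        System.out.println("live threads: " + liveThreadCount());
    }
}
```

A JMX console reads the same attributes, only remotely and with a UI on top.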

Flight recorder, with its more historical view,
provides the ability to see trends in your JVM.
This gives you the data needed to find memory
leaks, latency issues around thread waits,
locking issues and more.

The most unique feature Java Mission Control
brings to the table is that it's shipped with the
Oracle JDK. There's nothing that you need to
install or attach to your existing VM to get it
working. With Flight recorder, there are a couple
of flags you need to enable, but nothing to
install. Start up a terminal, go to your JDK bin
directory and just type jmc.

Java Mission Control has a simple configurable
dashboard in the JMX Console to view current
usage statistics for a great range of JVM
properties, from memory management to CPU
usage to garbage collection, threading and much,
much more. Also, by using the JMX Console
you can set values simply by updating the
dashboard. This will invoke an MBean under the
covers. JMX Console also has an alerting feature
that provides a pop up when an attribute or
MBean value exceeds the value you set. These
triggers can be enabled or disabled as you
require. There's also the ability to view threads,
including deadlocked threads, CPU profiling and
even a button that performs garbage collection
when you click it!



Java Profilers
While application performance monitoring solutions focus on the high-level
picture of your heterogeneous production environments and mostly deal
with the question "are there any errors in the system's behaviour?", profilers
usually concentrate on a deeper aspect of the main performance questions,
i.e. what is actually happening in the system, under the covers?
High level understanding of system components is great for the
overview, but when you really need to optimize something, you need
to have the exact reports of where time is being spent, exactly what
is happening with code primitives such as threads, locks and memory
management components.
Naturally, you can often make code run faster by implementing a different,
superior algorithm. But how do you know that it's faster than before?
Gut instinct? Also, how can you be sure it will stay as fast as you've made
it? You'll only know if you truly understand what's going on underneath all of
the abstractions and layers of business logic.
Code profilers gather intelligence about low-level code events in your
application and present this information in a useful, actionable way.
There are two main metrics a profiler can gather: counts and distributions.
Countable events are things like the number of times a thread locked on a
certain object or the number of database queries that were executed during
a period of time. Both of these counts have absolute meanings by
themselves. Distributions are more interesting: they can show where the
time is spent while executing a certain portion of code.


There are two main approaches to profiling. Instrumentation
(or tracing) consists of adding monitoring code to the target
program to collect the execution information. Instrumentation
is a very precise way to receive the information about where the
application is spending time. It is however prone to observation
bias, as your code has been altered. Monitoring code also takes
time and the overhead can sometimes be quite observable for hot
loops or for frequent and tiny method calls.
A sampling profiler, on the other hand, probes the target program
at regular time intervals. While this approach can seem less
precise than tracing the actual program executions, the overhead
of the interruptions is typically much smaller. Also, since the result
shows a high level overview of the execution profile, it can often
uncover hidden system wide issues that might not have been clear
while instrumenting and following a single execution.
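
The difference between the two approaches can be sketched in a few lines of plain Java. This is a toy illustration under stated assumptions, not how any of the profilers in this report are implemented: `tracedNanos` stands in for instrumentation by wrapping the measured code, `sample` stands in for a sampling profiler by peeking at a thread's top stack frame at a fixed interval, and `busyWork` is a hypothetical workload.

```java
import java.util.HashMap;
import java.util.Map;

final class TinyProfilers {

    /** Instrumentation: the measured code is wrapped, so every call pays the timing cost. */
    static long tracedNanos(Runnable body) {
        long start = System.nanoTime();
        body.run();
        return System.nanoTime() - start;
    }

    /** Sampling: look at the target thread's top stack frame at a fixed interval. */
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) { // empty once the thread has finished
                String frame = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(frame, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    /** Hypothetical workload: burns CPU for roughly 200 ms. */
    static void busyWork() {
        long end = System.nanoTime() + 200_000_000L;
        double x = 0;
        while (System.nanoTime() < end) {
            x += Math.sqrt(x + 1);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("traced: " + tracedNanos(TinyProfilers::busyWork) / 1_000_000 + " ms");

        Thread worker = new Thread(TinyProfilers::busyWork, "worker");
        worker.start();
        Map<String, Integer> hits = sample(worker, 20, 5);
        worker.join();
        System.out.println("sampled frames: " + hits);
    }
}
```

The traced number is exact for that one call, while the sampled histogram only says where the thread was seen most often. The sampler, however, never touches the code it observes.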


Head of RebelLabs,
Content Warlock at ZeroTurnaround




The YourKit profiler, by the company of the same name, is one
of the most established leaders in the Java profilers category.
As a mature and versatile profiler, YourKit can do both CPU and
memory profiling for you, with integrations across major Java
application servers, JDBC and other frameworks for high-level
performance analysis like finding synchronisation issues and
excessive database access.


YourKit makes the experience of finding performance problems
easier and more straightforward. And as a fully featured
profiler, YourKit naturally allows you to dig into the heaps of data it collects
to pinpoint the locations of necessary performance optimizations.
YourKit can run in both sampling and tracing profiling modes
and the mixed approach helps it make the most of both worlds:
the precision of tracing the actual code execution, while being
able to precisely control the profiling overhead.


The memory profiling in YourKit can detect
memory leaks and trace the excessive objects
back to the GC roots to show you why the
objects are not being collected. A memory
snapshot comparison and automatic memory
snapshot generation when memory is low can
help further analyze your application's heap.
The exception telemetry view allows you to view
the exceptions that occurred during a run of the
application under profile. Naturally you can filter
and group them by class or origin in the code,
but YourKit also has the capabilities to save a
snapshot of this information and compare with
a previous snapshot.

The recent version of YourKit profiler also
includes two lightweight profiling abilities: the
lightweight CPU profiling and the lightweight
memory profiling as they are called. In a
nutshell, lightweight here means counting.
When the lightweight CPU profiling mode
is enabled, the profiler will count method
invocations. The lightweight memory profiling
mode will count the number of objects created.
This approach loses some important data, like
the ability to sort statistics by thread or provide
the exact stack traces of object creation or
method invocations.

However, the major benefit of such modes is the
negligible overhead of the code instrumentation:
it doesn't do anything heavy, so the impact
is reduced compared to the normal profiling
modes. And despite the common understanding
that a profiler has to provide tons of precise
data to be useful, you can definitely benefit from
the hints the lightweight modes give you. The
most obvious algorithmic bottlenecks or places
that unnecessarily create excessive objects will be
easily noticed.
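
The idea that "lightweight means counting" can be illustrated with a hand-rolled invocation counter. This is a sketch of the general technique, not of YourKit's implementation: the `hit` call is inserted manually where a profiler would inject it automatically, and the naive `fib` method is a hypothetical example of an algorithmic bottleneck.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class InvocationCounter {

    private static final Map<String, LongAdder> COUNTS = new ConcurrentHashMap<>();

    /** Records one invocation; no timing and no stack trace, just a counter bump. */
    static void hit(String method) {
        COUNTS.computeIfAbsent(method, key -> new LongAdder()).increment();
    }

    static long count(String method) {
        LongAdder adder = COUNTS.get(method);
        return adder == null ? 0 : adder.sum();
    }

    /** Deliberately naive recursive Fibonacci to produce many invocations. */
    static int fib(int n) {
        hit("fib");
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }

    public static void main(String[] args) {
        int result = fib(10);
        System.out.println("fib(10) = " + result + ", invocations = " + count("fib"));
    }
}
```

Computing `fib(10)` this way takes 177 invocations for a result of 55. A bare count like that is already enough of a hint that the algorithm, not the hardware, is the problem.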

A high-level overview of your application's
performance is nicely shown in one place, with
information about the JSP/Servlet, Database
connections and queries, sockets info and file
I/O operations. The number of bytes written, the
timings of the queries and JSP processing can
give you a rough mental image of what your
Java app is doing in general.
With the unique on-demand profiling, you can
run the profiled application with approximately
zero overhead, activating actual profiling
only when it is needed. When the profiler's
capabilities are needed you can always turn the
YourKit profiler on and you can precisely control
the overhead that you're able to tolerate.



JProfiler, by ej-technologies GmbH, is one of the tools that
make up their performance oriented tools portfolio. They also
have an APM solution, but in this section we want to focus on
their profiler. JProfiler is a comprehensive profiler for Java SE and
Java EE applications with plugins for all major IDEs which provides
enhanced analysis of the collected profile data. The main approach
of the JProfiler analysis is to collect and record profiling session
data: metrics and the information about the application behavior.
As with any profiler, the CPU profiling is perhaps the most
important and useful thing you can get from JProfiler. Besides
collecting the profile session locally as well as from a remote
process, JProfiler offers live profiling capabilities, in which it
displays its most current information about the application
performance behavior. Similarly to YourKit, JProfiler also allows
you to use both the sampling approach as well as call tracing.


JProfiler can show a call graph view, where the
methods are represented by colored rectangles
that provide instant visual feedback about
where the slow code resides in the method call
chains, making bottlenecks easier to find.

Additionally, it can also profile calls across
multiple JVMs by tracking the execution
times of RMI, EJB calls and the consumption
of web-services.

Memory profiling with JProfiler can also be
tuned to get more or less detail, depending
on whether you want to get more data or reduce the
performance overhead. It can collect, analyze
and render snapshots of the heap created
with HPROF. Consequently it can help with
understanding and pinpointing issues in the
files the JVM generates when an OutOfMemoryError
brings your JVM down. You of course need to
enable -XX:+HeapDumpOnOutOfMemoryError
on your JVM to get these generated files. You do
have that enabled, right?

The memory profiling statistics will show the
usual Garbage Collection data, frequencies,
timings, and so forth and will also provide a
visual output of the call trees that allocate most
objects, allocation hot spots, the largest objects
on the heap and the full object graph.

To pick one of the most interesting features
in JProfiler, we'd have to go for request
tracking. The request tracking feature allows
instrumenting certain asynchronous execution
frameworks like the executors from
java.util.concurrent and tying the work done
in the background threads back to
the caller. JProfiler can handle and process
Thread.start() events and common UI libraries
like SWT and AWT that defer UI handling to a
separate thread.
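For the -XX:+HeapDumpOnOutOfMemoryError flag discussed in the memory profiling notes of this section, here is a minimal sketch of how you might check it from inside a running application. It assumes a HotSpot-based JVM, where the com.sun.management.HotSpotDiagnosticMXBean is available and the flag is writeable at runtime. The class name HeapDumpCheck is made up for this illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative helper (hypothetical name): inspects the heap-dump-on-OOM flag
// of the JVM this code runs in. Assumes a HotSpot-based JVM.
class HeapDumpCheck {

    private static final String FLAG = "HeapDumpOnOutOfMemoryError";

    private static HotSpotDiagnosticMXBean bean() {
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    // Returns true if the flag is currently in effect.
    static boolean isEnabled() {
        return Boolean.parseBoolean(bean().getVMOption(FLAG).getValue());
    }

    // Switches the flag on at runtime; this relies on the flag being
    // "manageable" (writeable), which is the case on HotSpot.
    static void enable() {
        bean().setVMOption(FLAG, "true");
    }

    public static void main(String[] args) {
        System.out.println("Heap dump on OOM enabled: " + isEnabled());
        enable();
        System.out.println("After enable(): " + isEnabled());
    }
}
```

Setting the flag on the command line is still the simpler route. The runtime check is mostly useful as a sanity test in a health page or startup log.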





XRebel by ZeroTurnaround is a lightweight Java profiler that is
intended for use in a development environment. It's a javaagent
that instruments Java web applications and automatically injects
a widget-like reporting console, embedded in your application
view. The widget shows the application performance data. The
main benefit comes from the fact that the developer can fix the
nastiest performance issues without even committing the poor
code into the build.

The Light Java Profiler



XRebel gathers and nicely presents the time
spent serving every request, broken down into
relevant method calls. This application trace
contains relative information about both the self-time
and the total time of the methods, together with
an intelligent way of presenting the information
that makes obvious which methods affect the
performance the most. By only showing relevant
method information, this approach scales with
your applications.
Additionally, XRebel has tight integrations with
database drivers and common HTTP querying
solutions, and can gather and display the
database activity originating in the application
and requests to third-party web-services.
Database access and HTTP calls are the most
common reasons for poor performance and
XRebel shines at presenting the developer
with accessible data about them. This makes
avoiding certain performance issues easy.


XRebel finds excessive database access and easily
uncovers N+1 query problems while still
in development. It understands which objects
clog the memory in the HTTP session, again
as they happen during development. Also, it
allows developers to instantly see all exceptions
happening in the application, even if they are
not propagated to the UI properly, aka the
hidden exception!
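If the N+1 queries problem is new to you, the following sketch shows the pattern in isolation. It is not how XRebel detects it. FakeDb is a made-up in-memory stand-in for a real database that simply counts how many "queries" were issued, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrates the N+1 query pattern with a fake database that counts queries.
class NPlusOneDemo {

    // Hypothetical stand-in for a real database; each method call is one query.
    static class FakeDb {
        int queryCount = 0;
        private final Map<Integer, String> authorsByPageId =
                Map.of(1, "alice", 2, "bob", 3, "carol");

        List<Integer> findPageIds() {                        // one query
            queryCount++;
            return List.of(1, 2, 3);
        }

        String findAuthor(int pageId) {                      // one query per page
            queryCount++;
            return authorsByPageId.get(pageId);
        }

        Map<Integer, String> findAuthors(List<Integer> ids) { // one query for all pages
            queryCount++;
            return authorsByPageId;
        }
    }

    // N+1: one query for the list, then one more query for every element.
    static int naive(FakeDb db) {
        List<String> authors = new ArrayList<>();
        for (int id : db.findPageIds()) {
            authors.add(db.findAuthor(id));
        }
        return db.queryCount;
    }

    // Batched: one query for the list, one query for all authors.
    static int batched(FakeDb db) {
        List<Integer> ids = db.findPageIds();
        List<String> authors = new ArrayList<>(db.findAuthors(ids).values());
        return db.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("naive queries:   " + naive(new FakeDb()));   // 4
        System.out.println("batched queries: " + batched(new FakeDb())); // 2
    }
}
```

With three pages the difference is 4 queries versus 2. With a thousand pages it is 1001 versus 2, which is why the pattern is worth catching during development.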

While XRebel is not the only profiler that injects
itself into the application views, it is the most
advanced and polished solution, with an
intuitive UI that tackles performance issues
at their creation time, in the development
environment. The main benefit of XRebel
is that it is a developer's tool, rather than a
production monitoring solution. Being closer to
the author of the code means the information
about application performance is more valuable
and gets acted upon faster, when issues are
cheapest to fix.





Honest Profiler is an open source profiler that was created as an
exercise to overcome the inherent biases of the sampling approach
to profiling.



Honest Profiler has two parts to it. The first is a C++ JVMTI agent that
writes out a file containing all the profiling information about the
application which the JVMTI agent was attached to. Did you shudder
when we mentioned a C++ JVMTI agent? It may mean your transition
to the JVM darkside is now complete. Congratulations Darth
Developer, you may continue to sneer at C++ rebel code! Veering
back to the plot... The second part is a Java application that
renders a profile based on the log that was previously generated.


Honest Profiler gets around the problem
of being biased towards collecting sample
information at JVM safepoints by having its
own sampling agent that uses UNIX Operating
System signals.

This additional accuracy is required when the
performance profile has to be incredibly exact.
You might want to look at other solutions first
and reserve use of Honest Profiler for those
cases when you're explicitly aware, or get that
hunch, that other sampling profilers are not
accurate enough.

Two of the main benefits which Honest Profiler
has over other sampling profilers on the
JVM include:

It profiles applications more accurately,
avoiding an inherent bias towards places
that have safepoints.

It profiles applications with significantly
lower overhead than traditional profiling
techniques, making it more suitable for
use in production.
Honest Profiler relies on an internal API within
the SUN/Oracle/OpenJDK JVM. There are no
guarantees it will work on other JVMs. Ultimately,
Honest Profiler is like any other sampling
profiler, just more accurate, since it doesn't
have the inherent bias of sampling at the JVM
safepoints instead of being totally random.
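To make the safepoint bias concrete, here is a toy sampler of the kind Honest Profiler is designed to improve on. It is a sketch only, with a made-up class name. It relies on Thread.getStackTrace(), which on HotSpot can only observe the target thread once that thread has reached a safepoint poll, so the samples cluster around those locations rather than being spread randomly over the code.

```java
import java.util.HashMap;
import java.util.Map;

// A toy, safepoint-biased sampling profiler built on Thread.getStackTrace().
class NaiveSampler {

    // Samples the top stack frame of `target` up to `samples` times, `intervalMs`
    // apart, and returns how often each method was seen on top of the stack.
    static Map<String, Integer> sample(Thread target, int samples, long intervalMs) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            // The stack is captured only when the target thread is at a safepoint.
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop sampling if the sampler itself is interrupted
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            double x = 0;
            while (!Thread.currentThread().isInterrupted()) {
                x += Math.sqrt(x + 1); // busy work to give the sampler something to see
            }
        }, "busy-worker");
        worker.setDaemon(true);
        worker.start();

        Map<String, Integer> hits = sample(worker, 50, 10);
        worker.interrupt();
        hits.forEach((method, count) -> System.out.println(count + "\t" + method));
    }
}
```

A signal-based agent does not have to wait for the thread to reach a safepoint, which is where the extra accuracy described in this section comes from.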



Performance Testing Tools

Application monitoring tools answer the questions: is anything wrong with
my application deployment? Are there any particular areas that are slower
than expected? And, is there a possible downtime coming your way because
some resource is consumed up to the limit?
A profiler usually provides the low level insight and shows which pieces of
functionality take the most time. This allows you to pinpoint exactly which
piece of code is responsible for the performance decrease and rewrite it, or
give the appropriate developer the appropriate hat to wear, depending on
your office dynamic.
Now, the question is: how do we know the new solution is indeed better
and what can we do to avoid introducing regressions later? The answer to
both parts of this question is pretty straightforward: we need a baseline,
some recording of the behavior of the application that can be compared
to another recording to show us whether we are above or below the
established threshold.
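In code, a baseline check can be as small as the sketch below. The class name, the numbers and the 10% tolerance are all assumptions chosen for illustration. In practice the baseline would come from a recorded test run.

```java
// Compares a new measurement against a recorded baseline, allowing some tolerance.
class BaselineCheck {

    // Returns true when `currentMs` is worse than `baselineMs` by more than
    // `tolerance` (0.10 means a 10% slowdown is tolerated before flagging).
    static boolean isRegression(double baselineMs, double currentMs, double tolerance) {
        return currentMs > baselineMs * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        double baselineMs = 1200.0; // hypothetical recorded latency
        System.out.println(isRegression(baselineMs, 1250.0, 0.10)); // false: within tolerance
        System.out.println(isRegression(baselineMs, 1500.0, 0.10)); // true: regression
    }
}
```

The tolerance matters because two runs of the same test rarely produce identical numbers. Without it, every build would look like a regression or an improvement.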

Let's draw a parallel between what a profiler and a performance testing
tool actually do. If we perceive a profiler run as a unit test for a piece of
code, a performance testing tool will be more like an integration test. It can
generate the required load on an application to emulate production-like
load. And be very aware, that's an extremely tough thing to emulate, and
even the best load tools will struggle with this task. Performance testing
tools don't concern themselves with the pesky details of what is actually
happening inside the application, instead they focus on how the application
is performing to the outside world. They record the request/response
loading times as an end-user would perceive them. Basically, how many times
do they wiggle their mouse while waiting for their page to load! We can call
this the wiggle-coefficient.
In this section we look at two established performance test libraries that
can help you to generate the load for your application in a secure test
environment and provide the reports on the application's performance so
you can establish a baseline and tweak the performance without worrying
about regressions.

The general consensus on application performance is that the closer the
test environment is to the production system's setup, the more accurate the
results of the performance tests are. This means that the system should
be adequately loaded when the performance result is recorded.






Apache JMeter is an open source Java application designed to load test
functional behaviour and measure performance.
Apache JMeter may be used to test performance both on static
and dynamic resources. It can be used to simulate a heavy load
on a server, or a cluster of servers, to test their strength or to
analyse overall performance under different load types.



You can use JMeter to create a graphical
analysis of the performance of your application
or to test your server behaviour under heavy
concurrent load. You won't replicate your
actual browser with JMeter: it won't evaluate
the JavaScript on your pages, so it might not
suit your needs. But it is one of the de facto
standard solutions for performance tests in the
Java world, so you ought to know how to use it.






Gatling is an open source load testing framework based on Scala,
Akka and Netty with extremely beautiful HTTP reports.


Gatling can record test scenarios and has a developer-friendly
DSL, so you can easily extend your recorded tests.



Gatling is an extremely usable open source load
testing framework. There are three main factors
that contribute to its success. First, the quality of the
reports that Gatling produces out of the box is
much higher than one might expect: they are
interactive and look good. Second, the tests make use of a
simple DSL which is, spoiler alert, written in Scala.
Third, Gatling was designed with real-world
load generation as a goal, so it was created
with highly concurrent test scenarios in mind.


The combination of these three aspects makes
Gatling tests easy to create and maintain.
They're also easy to scale to your specific load
requirements and their results are very easy on
the eyes.

Also, just like with JMeter, you can set up a
recording proxy, and use your application via a
browser to record the tests, which can later be
enhanced and extended. So the learning curve
to get started with Gatling is extremely shallow.



An experienced developer can find a reason for their code being
slow, pretty quickly. It does take time to fix the issue and more
often than not the results show the initial guess was wrong.
Trying to optimize or improve performance without proper tools
and measurements is crazy talk!

In theory there's no difference between practice and theory.
In practice there is.


Performance Issues in Action

So, your application is slow. Maybe you experienced that firsthand or
maybe your project manager came to you saying: "Users are complaining
that it's slow again." Perhaps your operations team are pointing fingers at
you, giggling that you are unable to deliver a single performing artifact to
them. Irrespective of the information's source, you are at the edge of the
problem and the fix. You now have to determine what's going on and then
figure out how to fight the fire.

Well, there are several ways to fight application performance issues, and
they even share a crucial common trait. They all assume that you operate
on hard data and know what you're doing. While this may be very accurate,
you must take this next piece of advice very seriously:
Performance issues cannot be solved by shooting from the hip.
Measure, apply a fix, measure again!

At the same time, you can be sure that just fixing this issue once and
forever is not an option available to you. You'll have the same problem
after the next release, then again and again. What you really need is a
change of perspective. How should you treat performance issues and the
performance of your application in general?



Moving from Theory to Practice

The machine that we ran the sample application on is a fairly common
developer's box:

The first thing we always need to do is figure out which resource is limited
in the application: you know, the bottleneck. There can be only one source
of every bottleneck, even though you might predict that your application
is CPU bound, memory bound, or performs too much IO for every action.
There are simple actions that you can take to determine the culprit.
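One such simple action is to look at the load average and heap usage. The sketch below reads them through the standard java.lang.management API. Note the assumption: it reports on the JVM it runs in, so you would call something like it from inside your application or expose the same beans over JMX. The class name is made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Prints a one-line snapshot of CPU load and heap usage for the current JVM.
class ResourceSnapshot {

    static String describe() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        long maxMb = rt.maxMemory() / (1024 * 1024);
        // getSystemLoadAverage() returns a negative value where it is unavailable.
        return String.format("cpus=%d loadAvg=%.2f heapUsed=%dMB heapMax=%dMB",
                os.getAvailableProcessors(), os.getSystemLoadAverage(), usedMb, maxMb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

A load average well above the CPU count hints at a CPU-bound system, while a heap that stays close to its maximum hints at memory pressure. Neither is proof, but both tell you where to point the profiler first.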
In this section, we'll look at a selection of the tools that we discussed before on
a sample application running locally. This would be a somewhat unusual
setup for showcasing monitoring tools, which usually shine in more
complex environments, but since almost all of us are developers here we
really like to run things locally.
Our sample application is Confluence. We'll use this application to show
you how to build and run simple JMeter performance tests. Then we'll
obtain general information about the performance of the system, digging
into the performance of a single page and analyzing why it takes so much time
to load with YourKit. We'll also show how you can set up and run XRebel
to profile the application to find the most outrageous performance issues
during development time. Does that sound exciting? Good, it should do,
because it is!

We chose Atlassian Confluence as our reference application for
performance testing. Note that this is not an application code optimisation
exercise, in fact we do not have access to the Confluence code base.
Instead we simply configured and ran a sample set of profiling tools and a
performance testing tool that we looked at earlier in this report.


Yeah, we know, it could be a more powerful setup or a more complicated
deployment, but the intention of this section is to show some performance
tools in practice, give basic examples of how to configure them and
some general considerations when profiling an application. The hardware
isn't so important here, what matters is illustrating the tools and their usage.
For those who aren't familiar with it, Confluence is a wiki collaboration
engine that enables teams to take meeting notes, discuss uploaded files,
assign tasks and manage projects. To obtain a copy of it, you can proceed
to the Atlassian Confluence download page and get yourself a free trial.
First of all, we need to create some application data and state so that when
we run our performance tests, we actually gather some interesting profile
data. To do this, we need to simulate some realistic load, so we start by
creating a simple JMeter test that interacts with a couple of Confluence pages.

You'll need to download JMeter from its Apache
project page. You can also include it as a Maven
dependency from Maven Central, if you want to
include it programmatically.
Extracting the archive creates a directory, which
will be the home of all our JMeter experiments.
JMeter itself is a Java program, so you can easily
access it programmatically if you need or reuse
your prior knowledge of configuring various Java
JMeter happens to be quite memory intensive,
ironically, but that's understandable since it's
processing multiple results concurrently to
simulate sample load. So before starting the
JMeter GUI, configure the JMeter JVM to have a
slightly larger heap size than it has available by
default, as shown.


$ JVM_ARGS="-Xms1024m -Xmx1024m" bin/


When you run this, a JMeter window will appear.

Use the second button from the left, the green
one, in the top toolbar to create a new test
configuration using a template.



The JMeter GUI is not the slickest and it contains a number of options that might confuse a newcomer
to the tool, but to begin we just need to configure a couple of basic elements, including the Thread Group
that will be used to simulate multiple users and the pages in Confluence that they will access.
First, click on the Thread Group in the tree view on the left, rename it as you like and configure 15
simultaneous users for this experiment. Also set the Loop count to some value, let's say 50. From the image
below, you can see I called my Thread Group "My precious users" and set 15 users with a loop count of 50.
This means I'll expect 15 threads to each perform an action (yet to be set up) 50 times before ending.
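Conceptually, a Thread Group with these settings behaves like the plain Java sketch below. This is not JMeter's implementation, just an illustration with made-up names: 15 worker threads each run an action 50 times, for 750 samples in total. The action here is an empty placeholder where a real test would issue an HTTP request.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Mimics the users x loops structure of a load test thread group.
class ThreadGroupSketch {

    // Runs `action` `loops` times on each of `users` threads and
    // returns the total number of executions that completed.
    static int run(int users, int loops, Runnable action) {
        AtomicInteger executed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(users);
        for (int u = 0; u < users; u++) {
            pool.submit(() -> {
                for (int i = 0; i < loops; i++) {
                    action.run();
                    executed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executed.get();
    }

    public static void main(String[] args) {
        // Placeholder action; a real test would request a page here.
        int total = run(15, 50, () -> { });
        System.out.println("samples executed: " + total); // 15 * 50 = 750
    }
}
```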



Now, we're going to add an HTTP request
sampler that will configure our user threads to
query certain pages within Confluence. In order
to do this you need to make sure you haven't
drunk too much caffeine, as this action will
require a steady hand as you can see from the
following image.



Wow, we made it! On the next screen we can
configure the request sampler. The important
bits here include the Server Name or IP, Port
Number and Path. If you're following along, you
can set your server name and port number
to whatever values your server name and
port number are! Make sure to set the path
to an existing Confluence page, we're just
using the welcome page at
/display/ds/Welcome+to+Confluence. All we need is some
load to view results, so we don't need anything
more complex than this setup for now. You'll
likely need to invoke more complex paths based
on your application.

Also, make sure JMeter downloads all HTML
resources from the response pages by enabling
the check box shown below:


The last configuration step we need to go
through is for the reporting stage. In the same
way as we added the HTTP request sampler,
we should add an Aggregate Report to our test
plan, as shown by the equally steady handed
option to the right.
The Aggregate Report view will show us data
about our test runs including the average,
90% and 99% latencies, error rates and the
throughput of the application.



Now it's time to execute the test plan we've
created. Click on the green play button, and
sit back for a minute while the test runs. Maybe
you could contemplate how long it would take
you to build a house of cards, while blindfolded.

When our sample test completes, the machine
hangs for a moment while showing a green
square signifying that the tests are doing
something magical. Don't ask, a magician
never reveals their secrets. If we check the top
command output, we can see that the machine
is indeed under duress here. The average load is
higher than on the idle machine.

Back in JMeter, go to the tree view on the left side and click on the Aggregate Report, which we set up before we
ran the test. It provides us with valuable insight into how quickly the requests were served by Confluence.



We see that our average response time was 1281 ms, but when
measuring latency you should not worry about averages. The
information that is really valuable is the 95% or 99% line, which will show
how much time the majority of your end-users will have to wait in this
scenario to get their response. The average is too susceptible to the outliers
and many quick responses will lower the value significantly. On the other
hand if the functional requirements or SLA for the system is specified for all
the users, the 99% line will be much more helpful to determine if the system
under test is close to meeting those requirements.
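The difference between the average and the percentile lines is easy to see with a few made-up numbers. The sketch below uses invented latencies (95 fast responses and 5 very slow ones) and the nearest-rank definition of a percentile, which is an assumption and may differ slightly from how JMeter computes its lines.

```java
import java.util.Arrays;

// Contrasts the average with percentile latencies on a skewed sample.
class LatencyStats {

    static double average(long[] latenciesMs) {
        return Arrays.stream(latenciesMs).average().orElse(0.0);
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] latenciesMs, double p) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] samples = new long[100];
        Arrays.fill(samples, 100);              // 95 responses at 100 ms...
        for (int i = 95; i < 100; i++) {
            samples[i] = 5000;                  // ...and 5 responses at 5000 ms
        }
        System.out.println("average: " + average(samples));        // 345.0
        System.out.println("90%:     " + percentile(samples, 90)); // 100
        System.out.println("99%:     " + percentile(samples, 99)); // 5000
    }
}
```

The average of 345 ms describes none of the actual users: most of them saw 100 ms and an unlucky few saw 5 seconds, which only the 99% line reveals.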

In this case the throughput of the application, that is, the
number of requests it can serve per unit of time, is 6.5 requests per second.
We have now established a rough baseline for our application performance
on this set of hardware in this particular environment. Of course the
approach we took here is simplistic for the sake of readability, but in real life
you can configure much more complex test cases in a very similar fashion.
The nuts and bolts all look very similar, you just need to add more HTTP
request samplers to pages, make each user log in before starting to query
your application and so forth.
Let's move forward, with our new baseline, and look at some of the other
performance profiling tools discussed in this report. We'll run them against
Confluence and generate the load with the same JMeter test we have used
initially, so our profiler results will provide more meaningful data.

First, download YourKit from the YourKit website. Since it is a native
application we don't need to configure anything, we can simply run the
profiler. We do of course need to register for an evaluation license that will
be delivered via email. Once this is done you can enter the license key into
YourKit when prompted and start using the tool.

The first time YourKit is launched, it offers to install an IDE plugin. YourKit knows that it offers the most value from the
IDE plugin, so we integrated it with a local instance of Eclipse. Note that it is not necessary to run the profiling itself from
within an IDE. In fact, the main YourKit application can connect to local and remote JVM processes utilising the javaagent
capabilities. So, we attach the profiler to our Confluence process, from within Eclipse as shown here:

Wow, we immediately see the CPU consumption by the app and some information about threads in the target JVM!



The options that YourKit offers for CPU profiling
are very thorough and easily accessible from the
toolbar displayed in the IDE plugin, so now we're
all set to start profiling Confluence with the load
JMeter emulates.
Since we are generating load for the application
we opt for sampling profiling, as it will impose
a lower overhead on the application and
should be accurate enough for our educational
purposes. However, sampling profiling may not
be enough for your needs as it could potentially
miss spikes of activity, if the sample time
doesn't line up with your heavy usage. When
the JMeter test completes, we see the recorded
throughput is approximately 30% worse than
our first run, or our baseline:



Now that we have the profile data, we can analyze exactly what takes Confluence so long to respond with the Welcome to
Confluence page. YourKit immediately found an unresponsive thread and showed us a notification, suggesting that it
might be deadlocked. However, a quick check of the top output suggests that my laptop is close to going into a coma,
so this is probably not an issue of locking but more likely just insufficient resources to run all these apps. We can save the
data which YourKit recorded as a snapshot and dig into this further.



Now we can perform some CPU profiling to find the source of any latency, by following
which methods had the most CPU time.



The Hot Spots view shows the most time consuming methods in all of our collected data.
This is a great place to start to find the most likely candidates which we could look at.
Surprisingly we can see the YourKit probe class right on the top, but we blame that on the issues with
the experimental setup. Other hot methods are legit.



Other views in YourKit show us a bunch of method calls which can either be grouped by thread
or not grouped at all. These will require more time to analyze, however. Also, below the
CPU profiling views, there's a Java EE statistics window, where we can look at the SQL queries that
were executed from the Confluence process and other metrics. These are all nicely aggregated by
the consumed time and the query count and might be a source of interesting findings about your
application performance.



Another immediate thing to notice without much digging is that YourKit saw around 6000 Exceptions
being generated:

This might be alright, but then again, these can probably be avoided.
In addition to the comprehensive CPU profiling, YourKit offers other insights into application
performance. They are as intuitive and straightforward to start with as the CPU profiling was and
again, they're available right there in the UI:



As fans of download graphs showing the torrents of various Linux
distributions, we like all kinds of running charts. So on the Performance
Charts tab we feel right at home, surrounded by useful information that
YourKit provides about our local Confluence instance. Had we been hunting
a particular performance issue, we'd know how to pinpoint it very quickly.
Solving it might be a different question, however, but finding the root cause
is much easier with an accurate profiler to hand.

XRebel is a lightweight Java profiler, and occupies a different niche in the
category of profiling tools. It is intended primarily as a developer profiler
to spot possible performance issues as soon as possible, in fact, while a
developer is coding them. When a component of the system is just being
developed and functionally tested, XRebel is on hand to give warnings and
helpful diagnostics.

It can help to pinpoint several types of performance bottlenecks, for
example it really shines at keeping count of the number of database
accesses, SQL queries and external HTTP requests executed from the
application. Plus if you're using NoSQL, XRebel's got you covered too.
To obtain XRebel, you'll need to first visit the download page, and grab
the zip archive. After extracting it, you'll find the xrebel.jar file. XRebel is a
javaagent, so to instrument the application we need to tell our JVM runtime
how to use the XRebel jar via a -javaagent JVM parameter.

We did this on our Confluence installation by adding the
following line which specifies the javaagent parameter in the
confluence/bin/ file:
atlassian-confluence-5.7.4/xrebel/xrebel.jar ${CATALINA_
Restarting Confluence and viewing the application in a browser
automatically shows the XRebel toolbar which has been rendered into the
response as plain HTML. This signifies XRebel has attached itself successfully.
A modal window also appears asking the user to register XRebel.
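If you are ever unsure whether a -javaagent parameter actually reached the JVM, you can list the JVM's input arguments from inside the process. The sketch below uses the standard RuntimeMXBean for that. The class name is invented for the example and it is not part of XRebel.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Lists the -javaagent arguments the current JVM was started with.
class AgentArgs {

    // Keeps only the -javaagent:... entries from a list of JVM arguments.
    static List<String> javaAgents(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> arg.startsWith("-javaagent:"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> agents = javaAgents(jvmArgs);
        System.out.println(agents.isEmpty() ? "no -javaagent arguments found" : agents);
    }
}
```

An empty result usually means the startup script that was edited is not the one that launches the server, which is a common cause of an agent silently not attaching.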



After activation successfully completes,
XRebel immediately notifies us about some
inefficiencies that it has found right on the
very first page!

First of all, the page took 1.8 seconds to load, which might be acceptable, but triggers a threshold
in the default configuration of XRebel. We can change it later to take account of the speed of
Confluence on this particular machine, but right now clicking on the Application profiling icon gives us
a list of HTTP requests that the page has initiated upon being loaded:



Clicking on the
GET /display/ds/Welcome+to+Confluence
request gives more detail and shows the code
execution path that handled the request. The
layout shows the total cumulative time
as well as the time spent in a particular method.
This gives valuable hints about the areas of the
code that behave unexpectedly.
We can see here that 36.5% of the total
request serving time was spent in the
BaseWebAppDecorator.render code,
and 40.4% of the time in VelocityUtils.



While this information may not translate
literally to the production environment due to the
application running on different hardware, JVM
settings etc., it certainly increases the visibility into
what is happening in the application and makes
the developer much more aware of application
performance and possible issues. Also, by
offering relative information as well as absolute
information (i.e. percentages as well as seconds)
you can certainly see which areas will likely cause
you problems in your production environment,
well ahead of some of the other profilers.


Another good example of XRebel's insight into
the application's performance is the second
notification that the page generated: too many
SQL queries executed in the request. In fact,
XRebel counted 43 SQL queries executed
while generating the response for the
Welcome to Confluence page and visualises
the code execution path with the query
execution highlighted.

We can also view by SQL query, and you'll notice
that XRebel groups similar queries together
so you can easily spot when a larger number of
queries are made to the same tables. Here we
can see 14 queries were executed against the
CONTENTPROPERTIES table and a further 16 against
the BANDANA table.

Then again, this might be the intended flow for
the application, so you're able to change XRebel
thresholds very easily in the configuration,
from the same toolbar, to ignore such a low SQL
query count or response duration. In this case it
might be considered a bit excessive for so many
database interactions just to render a simple
and not very informative wiki page. At least
XRebel thinks so.

Inefficient database access patterns are one
of the top performance degraders in typical
applications, so having this information served
right at the developer's monitor in a simple and
straightforward way significantly decreases the
chances that issues like this will make their
way to your production environments and your
end users.

XRebel certainly might seem less useful for
the classical forensic job of the profiler, but its
immediate feedback on the code makes it
a great lightweight profiler for developers.


If you dig a hole and it's in the wrong place,
digging it deeper is not going to help.
Before you start tackling performance problems in
your project, be sure to recognise your goals and pick
the appropriate tool to help you accomplish them.



Well done for making it all the way to the summary! We hope you loved the
report... well, of course you loved the report! Oh, hang on, you skipped the
content and jumped straight to the summary? Lazy! ;)
In this report, we covered what we mean by the term performance and the
fact that it can mean different things to different people, from removing
unnecessary code, to redesigning your application, to using different
frameworks. Oh and of course you can still update your actual source code
to be more performant but that was an area we didn't cover this time.
Don't forget, your overall application performance will affect your
company's bottom line. Performance is important, but extremely hard to
grasp and master (if anyone even has mastered it). There are a myriad of
performance tools available and they all try to tackle performance problems
from different angles. We covered a number of them in this report but of
course there are many more out there too, most of which are established
with similar great features. But which one is for you? Well, as usual, the
answer is "it depends"! There are various things which will affect your
decision including the overall design of your application, the phase in which
you're testing, your personal preference and of course the type of issue
you're trying to track down, assuming you know what that issue is!


One of the fun parts in the report (certainly for us), which allowed us to
get our teeth into some tech, was the practical part. We're geeks too, you
know! We took the Confluence application and used JMeter to simulate load
across the application, which gave us our baseline throughput. From here
we profiled the application using YourKit and XRebel, which showed some
really interesting results, particularly when XRebel was enabled, as it showed
up potential latency issues and database IO issues before we even had time
to say "OMG, XRebel rocks!".
There is no silver bullet in solving performance problems, and you have
to choose your tools depending on the needs of your project. So, how do
you decide which tool is most relevant for your needs? Well, as we're such
giving people here at RebelLabs, we've created a small FAQ section that
summarizes some of the points from this report in a friendly,
problem-oriented manner. This will hopefully help you select the
right tool for you.


Q: I need to recognise the bottlenecks in my current production
deployment to know which resources to scale for additional
throughput. How should I do it?
A: Recognizing the weakest part of the production environment is a
very common task. You'll need a high-level overview of the
deployment as a whole. Monitoring tools and APMs are the way to go
here. You should look at NewRelic, AppDynamics, Dynatrace and the
like. They can give you the big picture you need.
Q: How can I monitor my slowest business transactions to verify
that they still adhere to my SLA?
A: A business transaction is an action that goes through the entire
system or several systems. Either way, it's a user's end-to-end request.
You'll need to monitor all components to determine the latency and
as a result your clear choices are APMs: NewRelic, AppDynamics,
Q: I need to make page /checkout-cart load faster, because it's
a critical path for the business and making it faster will increase
revenue. What should I do?
A: First of all, create a load test, using JMeter or Gatling, that simulates
the real deployment more or less, with large amounts of data, multiple
concurrent users and so forth. Record your baseline so you know
how it performs currently. Next, collect a performance profile with a
profiler, like YourKit or JProfiler. Now, the data collected should tell
you which parts of the code execution take the longest time. Focus
on the optimisations that can give the largest performance boost and
verify the fixes against the tests created earlier.
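
To make the "record your baseline" step a bit more concrete, here is a rough sketch in plain Java. It is not how JMeter or Gatling work internally; the class BaselineLoadSketch, its methods and the fake request are all made up for illustration. It runs a task from a few concurrent "users" and summarises the latencies with nearest-rank percentiles.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of JMeter or Gatling.
class BaselineLoadSketch {

    // Latency at percentile p (0 < p <= 100), nearest-rank method.
    // Assumes a non-empty list of samples.
    static long percentile(List<Long> samples, double p) {
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    // Runs the task requestsPerUser times on each of `users` threads and
    // returns one latency sample (in nanoseconds) per request.
    static List<Long> run(int users, int requestsPerUser, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Future<List<Long>>> futures = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                Callable<List<Long>> user = () -> {
                    List<Long> latencies = new ArrayList<>();
                    for (int i = 0; i < requestsPerUser; i++) {
                        long start = System.nanoTime();
                        task.run();
                        latencies.add(System.nanoTime() - start);
                    }
                    return latencies;
                };
                futures.add(pool.submit(user));
            }
            List<Long> all = new ArrayList<>();
            for (Future<List<Long>> f : futures) {
                all.addAll(f.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load run interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real request to /checkout-cart (an assumption).
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<Long> samples = run(4, 25, fakeRequest);
        System.out.printf("requests=%d median=%.1f ms p95=%.1f ms%n",
                samples.size(),
                percentile(samples, 50) / 1e6,
                percentile(samples, 95) / 1e6);
    }
}
```

Write the numbers down before you change anything. A real load testing tool adds ramp-up, think times, realistic data and reporting on top of this idea, so treat the sketch as an illustration rather than a replacement.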


Q: My application is slow, and the database administrator says it
queries the database too much. How can I confirm this and work out
how to optimise it?
A: The most natural solution for finding slow database queries would
be to install an APM that collects that information for you. Run it for
some time on the production system and it'll give you a list of the
slowest SQL or NoSQL queries your code produces. NewRelic and
AppDynamics are good choices for an easy-to-configure APM solution.
To make sure developers don't add these kinds of issues to the
code base in future, consider XRebel to eliminate them as cheaply as
possible, at development time.
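
If you want a feel for the kind of data such a tool gives you, here is a minimal sketch of a query log, assuming you can hook a call to record(...) around each statement your code executes. QueryLogSketch is a made-up class for illustration, not an API of any APM or of XRebel, and the statements and timings in main are invented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory query log, for illustration only.
class QueryLogSketch {

    // One recorded statement execution.
    static final class Execution {
        final String sql;
        final long millis;

        Execution(String sql, long millis) {
            this.sql = sql;
            this.millis = millis;
        }
    }

    private final List<Execution> executions = new ArrayList<>();

    void record(String sql, long millis) {
        executions.add(new Execution(sql, millis));
    }

    // How many times each distinct statement ran, in first-seen order.
    Map<String, Integer> countsBySql() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Execution e : executions) {
            counts.merge(e.sql, 1, Integer::sum);
        }
        return counts;
    }

    // The n slowest executions, slowest first.
    List<Execution> slowest(int n) {
        List<Execution> sorted = new ArrayList<>(executions);
        sorted.sort(Comparator.comparingLong((Execution e) -> e.millis).reversed());
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    public static void main(String[] args) {
        QueryLogSketch log = new QueryLogSketch();
        // Made-up statements and timings.
        log.record("SELECT * FROM cart WHERE id = ?", 4);
        for (int i = 0; i < 3; i++) {
            log.record("SELECT * FROM item WHERE id = ?", 12 + i);
        }
        log.countsBySql().forEach((sql, n) -> System.out.println(n + "x " + sql));
        for (Execution e : log.slowest(2)) {
            System.out.println(e.millis + " ms  " + e.sql);
        }
    }
}
```

A statement that shows up many times within a single request is usually the first thing to look at, even if each individual execution is fast.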
Q: We added a widget requesting a weather forecast from a third
party service. Now my page is going to load slower. Should I worry?
How do I find out how much slower it's going to be?
A: Yes, you should worry. Performance isn't always
considered when introducing new services and features, particularly
third-party services, as they're often quick and easy to integrate. To
figure out how it will affect your timings, you should set up a
stress test using JMeter or Gatling, or some other performance test
library. You can also use XRebel during development,
so you'll be constantly aware of possible performance issues through
application profiling.
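
For a quick first estimate before you set up a proper stress test, you can time the page with and without the new call. The sketch below assumes both variants can be wrapped in a Runnable. WidgetOverheadSketch and the simulated sleeps are made up for illustration, and a single-threaded measurement like this says nothing about behaviour under load.

```java
import java.util.Arrays;

// Hypothetical helper for comparing two variants of the same work.
class WidgetOverheadSketch {

    // Middle sample (upper middle for an even count) of `runs` timed
    // executions of `work`, in nanoseconds.
    static long medianNanos(int runs, Runnable work) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            work.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    // Sleeps without forcing callers to handle the checked exception.
    static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Stand-ins (assumptions): 10 ms to render the page,
        // 30 ms for the third party forecast call.
        Runnable page = () -> sleep(10);
        Runnable pageWithWidget = () -> {
            sleep(10);
            sleep(30);
        };
        long before = medianNanos(9, page);
        long after = medianNanos(9, pageWithWidget);
        System.out.printf("before=%.1f ms after=%.1f ms overhead=%.1f ms%n",
                before / 1e6, after / 1e6, (after - before) / 1e6);
    }
}
```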


Q: I want to make sure that the code I write won't make my
application slower. What are best practices for that?
A: Hopefully you have some sort of performance tests running in your
CI environment, so that you're monitoring the behavior of your code
under load before it reaches the end users. An even faster reaction
to code that degrades performance can be achieved by using XRebel,
which will notify you about several types of performance bugs as you
develop the code.

Q: I think there's unused parallelism in my system. How do I find out
if I configured thread pools correctly and use all the resources I have
to their fullest?
A: One of the solutions that not only tries to find a performance
problem in your code, but also suggests how to fix it is Illuminate by
JClarity. This tuned performance diagnostic engine can advise you on
how to configure your system more appropriately for the load that it

Q: The operations team say that our system is consuming too much
RAM and crashes with OutOfMemory errors constantly. How do I
find out which part of the system is saturating the heap?
A: There are several solutions that promise to find memory leaks.
Heavier APMs include NewRelic, Dynatrace and AppDynamics.
Other, more specific tools that identify and handle performance issues
related to memory usage, like Plumbr or Illuminate, might be
much more straightforward at showing the root cause of the problem.
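
Before reaching for any of these tools, you can get a first look at how full the heap is from inside the JVM itself, using the standard java.lang.management API. The sketch below is only a starting point: HeapUsageSketch and usedFraction are made-up names, and the output tells you which memory pool is filling up, not which part of your code is responsible.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper built on the standard java.lang.management API.
class HeapUsageSketch {

    // Fraction of the maximum that is in use, or -1 if the JVM reports
    // no defined maximum for this usage.
    static double usedFraction(MemoryUsage usage) {
        long max = usage.getMax();
        return max <= 0 ? -1.0 : (double) usage.getUsed() / max;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap used=%d MB fraction=%.2f%n",
                heap.getUsed() >> 20, usedFraction(heap));

        // Per-pool view: pool names depend on the garbage collector in use.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() == MemoryType.HEAP && usage != null) {
                System.out.printf("%-30s used=%d MB fraction=%.2f%n",
                        pool.getName(), usage.getUsed() >> 20, usedFraction(usage));
            }
        }
    }
}
```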

Q: I want to rigorously profile my code, because we have a very low-level
implementation of a queue interface. But I heard that profilers
are biased toward safepoints, whatever that means. Are they?
A: Some operations that the JVM performs, like rearranging objects
on the heap, require application threads to be paused. Threads can only
be paused at well-defined points in the code, called safepoints. The usual
sampling approach to profiling Java applications is indeed biased towards
safepoints, because samples are only taken there. You can enhance the
precision of the timings by using the tracing (instrumentation) profiling
mode. YourKit and JProfiler, for example, offer you that option. On
the other hand, you can try to profile your application using Honest
Profiler, which was created specifically to avoid the problems with the
usual sampling algorithms.

Q: I suspect the system has several memory leaks because it gets
much slower on the second day after the restart. Where should
I look?
A: Plumbr specializes in finding memory leaks and you can definitely
try to tackle the memory hog issue by using it. Also, remember that
the assumptions we make about performance are often incorrect and
proper monitoring of the whole system is advisable prior to taking any
optimization actions. Cause and effect is a tricky problem and involves
a moving target.



Q: My team are not all performance experts, but they all contribute
the same amount of code. They don't have time to sift through lots
of data. Is there a simple tool that gives simple feedback for regular
developers to understand?
A: XRebel is one of the easiest profilers for Java applications. It injects
itself right into your application and gives you a simple outline of
where your application spends time when serving requests. It can
also highlight typical performance problems such as excessive database
accesses, abnormally large sessions and so forth. The setup and ease
of use of XRebel are unmatched.
Q: I need to establish a baseline, so my aggressive refactoring won't
decrease application performance. How do I achieve this?
A: You want to look for load test libraries, like JMeter or Gatling. They
allow you to record the interaction with the application and can rerun
it later using multiple concurrent users, thus simulating real world
usage patterns.
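
Once you have baseline numbers, the check after each refactoring step can be as simple as comparing medians with some tolerance for noise. The sketch below assumes you have exported latency samples in milliseconds from both runs. BaselineCheckSketch, the 10% tolerance and the sample values are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical regression check against a recorded baseline.
class BaselineCheckSketch {

    // Median of a non-empty array of samples.
    static double median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // True when the candidate's median is more than `tolerance`
    // (for example 0.10 for 10%) above the baseline's median.
    static boolean regressed(long[] baselineMillis, long[] candidateMillis, double tolerance) {
        return median(candidateMillis) > median(baselineMillis) * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        // Made-up latency samples in milliseconds.
        long[] baseline = {100, 110, 105, 95, 102};
        long[] afterRefactoring = {130, 128, 140, 125, 135};
        System.out.println("regressed: " + regressed(baseline, afterRefactoring, 0.10));
    }
}
```

Pick the tolerance from the run-to-run variation you actually see in your own baseline. If it's too tight, every build looks like a regression.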




We really enjoyed writing this report and
we really hope you enjoyed reading it. Don't
worry if the world of performance still scares
or intimidates you; it's the same for most
of us! Hopefully this report has given you a
glimpse of what you can do with some of the
performance tools and you'll take a look at a
few of them, particularly XRebel! Please be a
good citizen and share this report with your
friends, family and of course, your pets. We'll
leave you with this XKCD comic that tells us
about the importance of performant code.



Contact


Twitter: @RebelLabs
Ülikooli 2, 4th floor
Tartu, Estonia, 51003
Phone: +372 653 6099

399 Boylston Street,
Suite 300, Boston,
MA, USA, 02116
Phone: 1 (857) 277-1199

Czech Republic
Jankovcova 1037/49
Building C, 5th floor,
170 00 Prague 7, Czech Republic
Phone: +420 227 020 130

Written by:
Oleg Shelajev (@shelajev), Simon Maple (@sjmaple)
Designed by: Ladislava Bohacova (@ladislava)