
IBM Software Group

Java Garbage Collection


Best Practices for Sizing and Tuning the Java Heap

Chris Bailey

WebSphere Support Technical Exchange



Objectives
Overview

Selecting the Correct GC Policy

Sizing the Java heap

Questions/Answers


Garbage Collection Performance


GC performance issues can take many forms
Definition of a performance problem is user centric
User requirement may be for:
Very short GC pause times
Maximum throughput
A balance of both

First step is to ensure that the correct GC policy has been selected for
the workload type
Helpful to have an understanding of GC mechanisms

Second step is to ensure heap sizing is correct


Third step is to look for specific performance issues


Selecting the Correct GC Policy


Understanding Garbage Collection


Responsible for allocation and freeing of:
Java objects, Array objects and Java classes

Allocates objects using a contiguous section of Java heap

Ensures the object remains as long as it is in use or live


Determination based on a reference from another live object or
from outside of the Heap

Reclaims objects that are no longer referenced

Ensures that any finalize method is run before the object is reclaimed


Object Allocation
Requires a contiguous area of Java heap
Driven by requests from:
The Java application
JNI code

Most allocations take place in Thread Local Heaps (TLHs)


Threads reserve a chunk of free heap to allocate from
Reduces contention on allocation lock
Keeps code running in a straight line (fewer failures)
Meant to be fast
Available for objects < 512 bytes in size

Larger allocates take place under a global heap lock


These allocations are a one-time cost (an out-of-line allocate)
Multiple threads allocating larger objects at the same time will
contend


Object Reclamation (Garbage Collection)


Occurs under two scenarios:
An allocation failure
An object allocation is requested and not enough contiguous memory is available
A programmatically requested garbage collection cycle
a call is made to System.gc() or Runtime.gc()
the Distributed Garbage Collector is running
call to JVMPI/TI is made

Two main technologies used to remove the garbage:


Mark Sweep Collector
Copy Collector

IBM uses a mark sweep collector, or a combination for generational


Global Collection Policies


Garbage Collection can be broken down into 2 (3) steps
Mark: Find all live objects in the system
Sweep: Reclaim unused heap memory to the free list
Compact: Reduce fragmentation within the free list

All steps are in a single stop-the-world (STW) phase


Application pauses whilst garbage collection is done

Each step is performed as a parallel task within itself
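The mark and sweep steps above can be illustrated with a toy model. This is only a sketch over a hypothetical object graph (the ToyMarkSweep and Node names are invented for illustration); it is single-threaded, has no compact step, and is not how the JVM implements its collector.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the mark and sweep steps (hypothetical, for illustration only). */
final class ToyMarkSweep {
    /** A heap object that may hold references to other heap objects. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Mark: find every object reachable from the roots. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    /** Sweep: drop unmarked objects from the heap; returns the number reclaimed. */
    static int sweep(List<Node> heap, Set<Node> live) {
        int before = heap.size();
        heap.retainAll(live);
        return before - heap.size();
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");         // never referenced: garbage
        a.refs.add(b);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c));
        Set<Node> live = mark(Arrays.asList(a));
        System.out.println("Reclaimed objects: " + sweep(heap, live));
    }
}
```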

Four GC Policies, optimized for different scenarios


-Xgcpolicy:optthruput optimized for batch type applications
-Xgcpolicy:optavgpause optimized for applications with responsiveness
criteria
-Xgcpolicy:gencon optimized for highly transactional workloads
-Xgcpolicy:subpool optimized for large systems with allocation
contention


Parallel GC (optthruput)
Parallel Mark Sweep Collector, with compaction avoidance
Created to make use of additional processors on server systems
Designed to increase performance for SMP and not degrade
performance for uni-processor systems

Optimized for Throughput


Best policy for batch type applications

Consists of a single flat Java heap:

[Figure: flat Java heap from Heap Base (0 GB) to Heap Limit (2 GB), showing the current Heap Size and the LOA]


GC Helper Threads
Parallelism achieved through the use of GC Helper Threads
Parked set of threads that wake to share GC work
Main GC thread generates the root set of objects
Helper threads share the work for the rest of the phases
Number of helpers is one less than the number of processing
units
So helper threads and main GC thread equals the number of
processing units
Configurable using -Xgcthreads
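As a rough illustration of the rule above, the expected helper count can be derived from the processor count. This sketch assumes that Runtime.availableProcessors() corresponds to the "processing units" the slide refers to; the class and method names are hypothetical.

```java
/** Sketch of the "helpers = processing units - 1" rule described above. */
final class GcHelperThreads {
    /** Helper threads plus the main GC thread should equal the processing units. */
    static int defaultHelperCount(int processingUnits) {
        return Math.max(0, processingUnits - 1);
    }

    public static void main(String[] args) {
        // Assumption: the JVM-reported processor count matches "processing units".
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("Processing units: " + cpus);
        System.out.println("Expected GC helper threads: " + defaultHelperCount(cpus));
    }
}
```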


Parallel Mark/Parallel Sweep view of GC


Concurrent GC (optavgpause)
Reduces and makes more consistent the time spent inside Stop the
World GC
Reduction usually between 90 and 95%

Achieved by carrying out some of the STW work whilst application is running
1.4.2: Concurrent Marking
5.0: Concurrent Marking and Concurrent Sweeping

Slight overhead on thruput for greatly reduced STW times


Policy is ideal for systems with responsiveness criteria
eg. Portal applications


Parallel and Concurrent Mark/Sweep

Concurrent Kickoff


Concurrent Mark hidden object issue

Dangling pointer!

Higher heap usage because not all garbage removed


Generational and Concurrent GC (gencon)


Similar in concept to that used by Sun and HP
Parallel copy and concurrent global collects by default

Motivation: Objects die young so focus collection efforts on recently created objects
Divide the heap up into two areas: new and old
Perform allocates from the new area
Collections focus on the new area
Objects that survive a number of collects in new area are promoted to old area (tenured)

[Figure: heap from Heap Base (0 GB) to Heap Limit (2 GB), split into a Nursery (new) space with Allocate and Survivor areas and a Tenured (old) space with the LOA; Heap Size marks the current extent]

Ideal for transactional and high data throughput workloads


Nursery (new) Space Copy Collection


[Figure: Nursery/Young Generation split into an Allocate Space and a Survivor Space]

Nursery is split into two spaces (semi-spaces)


Only one contains live objects and is available for allocation
Minor collections (Scavenges) move objects between spaces
Role of spaces is reversed
Movement results in implicit compaction
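The scavenge described above can be sketched as a toy model. Objects are represented by hypothetical string labels and liveness is passed in by the caller; a real collector traces references and also promotes long-lived objects to tenured space, which this sketch omits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy semi-space nursery (hypothetical, for illustration only). */
final class ToyNursery {
    private List<String> allocate = new ArrayList<>();
    private List<String> survivor = new ArrayList<>();

    /** New objects are always placed in the allocate space. */
    void allocate(String obj) {
        allocate.add(obj);
    }

    /** Scavenge: copy live objects to the survivor space, then reverse the roles. */
    void scavenge(Set<String> liveObjects) {
        for (String obj : allocate) {
            if (liveObjects.contains(obj)) {
                survivor.add(obj);      // copied objects end up packed together
            }
        }
        allocate.clear();               // everything left behind is garbage
        List<String> tmp = allocate;
        allocate = survivor;
        survivor = tmp;
    }

    /** Contents of the space currently used for allocation. */
    List<String> contents() {
        return new ArrayList<>(allocate);
    }

    public static void main(String[] args) {
        ToyNursery nursery = new ToyNursery();
        nursery.allocate("a");
        nursery.allocate("b");
        nursery.allocate("c");
        nursery.scavenge(new HashSet<>(Arrays.asList("a", "c")));
        System.out.println("After scavenge: " + nursery.contents());
    }
}
```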


Subpooling (subpool)
Goals:
Reduce allocation lock contention by distributing free memory into
multiple lists
Reduce allocation contention through use of atomic operations instead of
a heap lock
Prevent premature garbage collections by using a best fit (or closer to
best fit) policy instead of address ordered

Ideal for very large SMP systems where large amounts of data are being
allocated and where there is heap lock contention


Looking for Heap Lock Contention


All locks can be profiled using Java Lock Analyzer (JLA)
http://www.alphaworks.ibm.com/tech/jla
(AlphaWorks)
Provides time accounting and contention statistics for
Java and JVM locks
Functionality includes:
Counters associated with contended locks
Total number of successful acquires
Recursive acquires: number of times a thread acquires a lock it
already owns
Number of times a thread blocks because a monitor is
already owned
Cumulative time the monitor was held.


JLA Sample Report


System (Registered) Monitors
%MISS GETS NONREC SLOW REC TIER2 TIER3 %UTIL AVER-HTM MON-NAME
87 5273 5273 4572 0 710708 18487 1 95408 JITC Global_Compile lock
9 6870 6869 631 1 113420 2976 0 11807 Heap lock
5 1123 1123 51 0 11098 286 1 248385 Binclass lock
0 1153 1147 5 6 1307 33 0 47974 Monitor Cache lock
0 46149 45877 134 272 36961 877 1 6558 JITC CHA lock
0 33734 23483 19 10251 6544 150 1 17083 Thread queue lock
0 5 5 0 0 0 0 0 9309689 JNI Global Reference lock
0 5 5 0 0 0 0 0 9283000 JNI Pinning lock
0 5 5 0 0 0 0 0 9442968 Sleep lock
0 1 1 0 0 0 0 0 0 Monitor Registry lock
0 0 0 0 0 0 0 0 0 Evacuation Region lock
0 0 0 0 0 0 0 0 0 Method trace lock
0 0 0 0 0 0 0 0 0 Classloader lock
0 0 0 0 0 0 0 0 0 Heap Promotion lock
Java (Inflated) Monitors
%MISS GETS NONREC SLOW REC TIER2 TIER3 %UTIL AVER-HTM MON-NAME
15 68 68 10 0 2204 56 2 11936405 test.lock.testlock1@A09410/A09418
2 42 42 1 0 186 5 0 300478 test.lock.testlock2@D31358/D31360
0 70 70 0 0 41 1 0 7617 java.lang.ref.ReferenceQueue$Lock@920628/920630


JLA: Fields in the report


Choosing the Right GC Policy


Four GC Policies, optimized for different scenarios
-Xgcpolicy:optthruput optimized for batch type applications
-Xgcpolicy:optavgpause optimized for applications with
responsiveness criteria
-Xgcpolicy:gencon optimized for highly transactional
workloads
-Xgcpolicy:subpool optimized for large systems with allocation
contention
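To confirm which policy a running JVM was started with, the startup arguments can be inspected from inside the process. The sketch below uses the standard RuntimeMXBean.getInputArguments() call; the class name, the helper method, and the choice to report the last -Xgcpolicy occurrence are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

/** Reports the -Xgcpolicy option, if any, passed to this JVM (illustrative sketch). */
final class GcPolicyCheck {
    private static final String PREFIX = "-Xgcpolicy:";

    /** Returns the policy name from the argument list, or empty if none was given. */
    static Optional<String> findGcPolicy(List<String> jvmArgs) {
        String found = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(PREFIX)) {
                // Assumption: if the option is repeated, the last one is reported.
                found = arg.substring(PREFIX.length());
            }
        }
        return Optional.ofNullable(found);
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("GC policy: "
                + findGcPolicy(jvmArgs).orElse("(no -Xgcpolicy option given; JVM default applies)"));
    }
}
```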

How do I know whether to use optavgpause or gencon?


Monitor GC activity
Look for certain characteristics


Monitoring GC Activity
Use of Verbose GC logging
only data that is required for GC performance tuning
Graph Verbose GC output using GC and Memory Visualizer (GCMV) from ISA

Activated using command line options


-verbose:gc
-Xverbosegclog:[DIR_PATH][FILE_NAME],X,Y
where:
[DIR_PATH] is the directory where the file should be written
[FILE_NAME] is the name of the file to write the logging to
X is the number of files to write
Y is the number of GC cycles a file should contain

Performance Cost:
(very) basic testing shows a 2% overhead for GC duration of 200ms
eg. if application GC overhead is 5%, it would become 5.1%
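The arithmetic behind that example can be written out as follows. This is a sketch using the slide's figures (2% added to GC time, 5% GC overhead); the class and method names are hypothetical.

```java
/** Worked arithmetic for the verbose GC logging cost quoted above. */
final class VerboseGcCost {
    /**
     * Logging adds a relative cost to the time spent in GC, so the overall
     * GC overhead grows by that fraction of itself, not by that many points.
     */
    static double withLogging(double gcOverheadPercent, double loggingCostFraction) {
        return gcOverheadPercent * (1.0 + loggingCostFraction);
    }

    public static void main(String[] args) {
        // 5% GC overhead with a 2% logging cost: 5 * 1.02
        System.out.printf("GC overhead with logging: %.2f%%%n", withLogging(5.0, 0.02));
    }
}
```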


Important Characteristics for Choosing GC Policy

Rate of Garbage Collection


High rates of object burn point to large numbers of transitional objects, and
therefore the application may well benefit from the use of gencon

Large Object Allocations?


The allocation of very large objects adversely affects gencon unless the nursery is
sufficiently large. The application may well benefit from optavgpause

Large heap usage variations


The optavgpause algorithms are best suited to consistent allocation profiles
Where large variations occur, gencon may be better suited

Rule of thumb: if GC overhead is > 10%, you've most likely chosen the wrong policy
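One way to estimate GC overhead without parsing verbose GC logs is to read the standard java.lang.management beans from inside the application. The sketch below is an approximation under stated assumptions: getCollectionTime() is an approximate accumulated figure and, for concurrent policies, may not equal stop-the-world pause time. The class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Rough in-process estimate of GC overhead (illustrative sketch). */
final class GcOverhead {
    /** GC time as a percentage of elapsed time. */
    static double overheadPercent(long gcTimeMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcTimeMillis / uptimeMillis;
    }

    /** Sums the approximate collection time reported by every collector bean. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 when the value is undefined
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double pct = overheadPercent(totalGcTimeMillis(), uptime);
        System.out.printf("GC overhead so far: %.2f%%%n", pct);
        if (pct > 10.0) {
            System.out.println("Above the 10% rule of thumb: review the GC policy");
        }
    }
}
```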


Rate of Garbage Collection


[Figure: optavgpause and gencon compared]

Gencon could handle a higher rate of garbage collection


Completing the test quicker
Gencon had a smaller percentage of time in garbage collection
Gencon had a shorter maximum pause time


Gencon provides less frequent long Garbage Collection cycles


Gencon provides a shorter longest Garbage Collection cycle


Large Object Allocations


(Very) Large Object allocations affect the gencon GC policy
If the object is larger than the Nursery size, the object is immediately tenured
Removes the benefit of generational heaps
Still has the additional overhead of running generational

If the object fits in the nursery but fills it, frequent nursery collects will have to occur
Too frequent nursery collects mean objects are likely to survive and need copying
Copying is an expensive process

If (Very) Large Objects are being used, a sufficiently large nursery is required


Sizing the Java Heap



Maximum possible Java heap sizes

The correct Java heap size

Fixed heap sizes vs. Variable heap sizes

Heap Sizing for Generational GC


Maximum Possible Heap Size


32 bit Java processes have a maximum possible heap size
Varies according to the OS and platform used
Determined by the process memory layout

64 bit processes do not have this limit


Limit exists, but is so large it can be effectively ignored
Addressability usually between 2^44 and 2^64
Which is 16+ TeraBytes
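The "16+ TeraBytes" figure follows directly from the address widths quoted above. A small sketch of the arithmetic, using binary terabytes (2^40 bytes); BigInteger is used because 2^64 does not fit in a Java long, and the class name is hypothetical.

```java
import java.math.BigInteger;

/** Worked arithmetic for addressable memory at a given address width. */
final class AddressSpace {
    private static final BigInteger BYTES_PER_TB = BigInteger.ONE.shiftLeft(40);

    /** Addressable bytes for the given number of address bits: 2^bits. */
    static BigInteger addressableBytes(int addressBits) {
        return BigInteger.ONE.shiftLeft(addressBits);
    }

    /** Addressable memory in binary terabytes (2^40 bytes each). */
    static BigInteger addressableTerabytes(int addressBits) {
        return addressableBytes(addressBits).divide(BYTES_PER_TB);
    }

    public static void main(String[] args) {
        System.out.println("2^44 bytes = " + addressableTerabytes(44) + " TB");
        System.out.println("2^64 bytes = " + addressableTerabytes(64) + " TB");
    }
}
```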


Java Process Memory Layout


An Operating System process like any other application:
Subject to OS and architecture restrictions
32bit architecture has an addressable range of:
2^32, which is 0x00000000 to 0xFFFFFFFF
which is 4GB

[Figure: 32-bit address space from 0x0 (0 GB) to 0xFFFFFFFF (4 GB), with marks at 0x40000000 (1 GB), 0x80000000 (2 GB) and 0xC0000000 (3 GB)]

Not all addressable space is available to the application


The operating system needs memory for:
The kernel
The runtime support libraries

Varies according to Operating System


How much memory is needed and where that memory is located


Memory Available to the Java Process


On Windows:
[Figure: 4 GB address space (0x0 to 0xFFFFFFFF) on Windows, with regions labelled Operating System Space and Libraries]

On AIX:
[Figure: 4 GB address space (0x0 to 0xFFFFFFFF) on AIX, with regions labelled Kernel and Libraries]


Java Process Restrictions


Not all Java Process space is available to the Java application
The Java Runtime needs memory for:
The Java Virtual Machine
Backing resources for some Java objects

This memory area, as well as some other allocations, is part of the Native Heap

Memory not allocated to the Java Heap is available to the native heap

Available memory space - Java heap = native heap
Effectively, the Java process maintains two memory pools
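The relationship between the two pools is a simple subtraction. The sketch below uses hypothetical numbers (2048 MB of process space available, 1536 MB of Java heap) purely to show the trade-off; actual available space depends on the platform layouts discussed in this section.

```java
/** Worked arithmetic: space left for the native heap (illustrative sketch). */
final class NativeHeapBudget {
    /** Native heap = available process memory minus the Java heap. */
    static long nativeHeapMb(long availableMb, long javaHeapMb) {
        if (javaHeapMb > availableMb) {
            throw new IllegalArgumentException("Java heap exceeds available memory space");
        }
        return availableMb - javaHeapMb;
    }

    public static void main(String[] args) {
        // Hypothetical figures, not taken from any particular platform.
        long availableMb = 2048;
        long javaHeapMb = 1536;
        System.out.println("Native heap budget: " + nativeHeapMb(availableMb, javaHeapMb) + " MB");
    }
}
```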


The Native Heap


Allocated using malloc() and therefore subject to memory
management by the OS

Used for Virtual Machine resources, eg:


Execution engine
Class Loader
Garbage Collector infrastructure

Used to underpin Java objects:


Threads, Classes, AWT objects, ZipFiles

Used for allocations by JNI code


Native Heap available to Application


On Windows
[Figure: 4 GB address space (0x0 to 0xFFFFFFFF) on Windows, with regions labelled Java Heap, Native Heap (VM Resources), Libraries and Operating System Space]

On AIX (1.4.2 with small heaps)
[Figure: 4 GB address space (0x0 to 0xFFFFFFFF) on AIX, with regions labelled Kernel, Java Heap, Native Heap (VM Resources) and Libraries]


Layout with Large Java Heaps on AIX


Applies to heaps > 1GB in size and Java 5.0

Java heap becomes allocated using mmap()

Segments used start at 0xC and work downwards


understanding memory layout important for monitoring

[Figure: 4 GB address space (0x0 to 0xFFFFFFFF) on AIX with a large Java heap, with segment marks at 0x3, 0x7 and 0xD and regions labelled Kernel, Native Heap (VM Resources), Java Heap and Libraries]


Memory Layout for Linux


Linux:
[Figure: 4 GB address space (0x0 to 0xFFFFFFFF) on Linux, with regions labelled Java Heap, Native Heap (VM Resources) and Kernel, and the TASK_SIZE / PAGE_OFFSET boundary marked]

z/OS:
[Figure: 2 GB address space (0x0 to 0x7FFFFFFF) on z/OS, with regions labelled Java Heap and VM Resources]


Theoretical and Advised Max Heap Sizes

The larger the Java heap, the more constrained the native heap
Advised limits to prevent native heap from becoming overly
restricted, leading to OutOfMemoryErrors

Platform   Additional Options   Maximum Possible   Advised Maximum
AIX        automatic            3.25 GB            2.5 GB
Linux      (none)               2 GB               1.5 GB
Linux      Hugemem Kernel       3 GB               2.5 GB
Windows    (none)               1.8 GB             1.5 GB
Windows    /3GB                 1.8 GB             1.8 GB
z/OS       (none)               1.7 GB             1.3 GB

Exceeding advised limits is possible, but should be done only when native heap usage is understood
Native heap usage can be measured using OS tools:
svmon (AIX), PerfMon (Windows), RMF (z/OS) etc.


Moving to 64bit
Moving to 64bit removes the Java heap size limit

However, ability to use more memory is not free


64bit applications perform slower
More data has to be manipulated
Cache performance is reduced
64bit applications require more memory
Java Object references are larger
Internal pointers are larger

Major improvements to this in Java 6.0 due to compressed pointers


The correct Java heap size


GC will adapt heap size to keep occupancy between 40% and 70%
Heap occupancy over 70% causes frequent GC cycles
Which generally means reduced performance
Heap occupancy below 40% means infrequent GC cycles, but cycles
longer than they need to be
Which means longer pause times than necessary
Which generally means reduced performance

The maximum heap size setting should therefore be 43% larger than the
maximum occupancy of the application
Maximum occupancy + 43% means occupancy at 70% of total heap
Eg. For 70MB occupancy, 100MB Max heap required, which is 70MB +
43% of 70MB
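The 43% figure comes from dividing by the 70% target: 1 / 0.7 is roughly 1.43. A minimal sketch of that calculation, assuming the peak occupancy has already been measured (for example from verbose GC output); the class and method names are hypothetical.

```java
/** Worked arithmetic for the "maximum occupancy + 43%" sizing rule. */
final class HeapSizing {
    /** Upper bound of the 40% to 70% occupancy window. */
    static final double TARGET_MAX_OCCUPANCY = 0.70;

    /** Heap size (MB) at which the given peak occupancy sits at 70% of the heap. */
    static long maxHeapMb(long peakOccupancyMb) {
        return Math.round(peakOccupancyMb / TARGET_MAX_OCCUPANCY);
    }

    public static void main(String[] args) {
        long peakOccupancyMb = 70;   // example figure from the slide
        System.out.println("Suggested setting: -Xmx" + maxHeapMb(peakOccupancyMb) + "m");
    }
}
```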


[Figure: heap occupancy plotted as memory against time, below the Heap Size line. Above the 70% line: Too Frequent Garbage Collection. Below the 40% line: Long Garbage Collection Cycles.]


Fixed heap sizes vs. Variable heap sizes


Should the heap size be fixed?
i.e. Minimum heap size (-Xms) = Maximum heap size (-Xmx)?

Each option has advantages and disadvantages


As for most performance tuning, you must select which is right for the particular
application

Variable Heap Sizes


GC will adapt heap size to keep occupancy between 40% and 70%
Expands and Shrinks the Java heap
Allows for scenario where usage varies over time
Where variations would take usage outside of the 40-70% window

Fixed Heap Sizes


Does not expand or shrink the Java heap


Heap Expansion and Shrinkage


Act of heap expansion and shrinkage is relatively cheap

However, a compaction of the Java heap is sometimes required


Expansion: for some expansions, GC may have already
compacted to try to allocate the object before expansion

Shrinkage: GC may need to compact to move objects from the area of the heap being shrunk

Whilst expansion and shrinkage optimize heap occupancy, they (usually) do so at the cost of compaction cycles


Conditions for Heap Expansion


Not enough free space available for object allocation after GC has
completed
Occurs after a compaction cycle
Typically occurs where there is fragmentation or during rapid
occupancy growth (e.g., application startup)

Heap occupancy is over 70%


Compaction unlikely

More than 13% of time is spent in GC


Compaction unlikely
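The three expansion triggers above can be summarised as a single predicate. This is a sketch of the conditions as stated on the slide, with the default 70% and 13% thresholds hard-coded; the class, method and parameter names are hypothetical and this is not the JVM's actual decision code.

```java
/** Sketch of the heap expansion conditions listed above. */
final class HeapExpansionRule {
    static final double MAX_OCCUPANCY = 0.70;     // default upper occupancy bound
    static final double MAX_GC_TIME_FRACTION = 0.13;

    /**
     * @param allocationStillFails true if there is not enough free space after GC
     * @param occupancy            heap occupancy after GC, as a fraction 0.0 to 1.0
     * @param gcTimeFraction       fraction of time spent in GC, 0.0 to 1.0
     */
    static boolean shouldExpand(boolean allocationStillFails,
                                double occupancy,
                                double gcTimeFraction) {
        return allocationStillFails
                || occupancy > MAX_OCCUPANCY
                || gcTimeFraction > MAX_GC_TIME_FRACTION;
    }

    public static void main(String[] args) {
        System.out.println("75% occupied, 5% in GC  -> expand? " + shouldExpand(false, 0.75, 0.05));
        System.out.println("50% occupied, 5% in GC  -> expand? " + shouldExpand(false, 0.50, 0.05));
        System.out.println("50% occupied, 20% in GC -> expand? " + shouldExpand(false, 0.50, 0.20));
    }
}
```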


Conditions for Heap Shrinkage


Heap occupancy is under 40%

And neither of the following is true:


Heap has been recently expanded (last 3 cycles)
GC is a result of a System.GC() call

Compaction occurs if:


An object exists in the area being shrunk
GC did not shrink on the previous cycle

Compaction is therefore likely to occur
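The shrinkage conditions can be sketched in the same way. The reading of "recently expanded" as "three or fewer cycles since the last expansion" is an assumption, as are the class, method and parameter names; the compaction conditions are left out because the slide does not say how they combine.

```java
/** Sketch of the heap shrinkage conditions listed above. */
final class HeapShrinkRule {
    static final double MIN_OCCUPANCY = 0.40;     // default lower occupancy bound
    static final int RECENT_EXPANSION_CYCLES = 3; // "last 3 cycles"

    /**
     * @param occupancy            heap occupancy after GC, as a fraction 0.0 to 1.0
     * @param cyclesSinceExpansion GC cycles completed since the heap last expanded
     * @param triggeredBySystemGc  true if this GC came from a System.gc() call
     */
    static boolean shouldShrink(double occupancy,
                                int cyclesSinceExpansion,
                                boolean triggeredBySystemGc) {
        // Assumption: "recently" means within the last three cycles, inclusive.
        boolean recentlyExpanded = cyclesSinceExpansion <= RECENT_EXPANSION_CYCLES;
        return occupancy < MIN_OCCUPANCY && !recentlyExpanded && !triggeredBySystemGc;
    }

    public static void main(String[] args) {
        System.out.println("30% occupied, stable heap       -> shrink? " + shouldShrink(0.30, 10, false));
        System.out.println("30% occupied, expanded recently -> shrink? " + shouldShrink(0.30, 2, false));
        System.out.println("30% occupied, System.gc() cycle -> shrink? " + shouldShrink(0.30, 10, true));
    }
}
```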


Introduction to -Xmaxf and -Xminf


The -Xmaxf and -Xminf settings control the 40% and 70% occupancy
bounds
-Xmaxf: the maximum heap space free before shrinkage (default is 0.6
for 40% occupancy)
-Xminf: the minimum heap space free before expansion (default is 0.3 for
70% occupancy)

Can be used to move optimum occupancy window if required by the application
eg. Lower heap utilization required for more infrequent GC cycles

Can be used to prevent shrinkage


-Xmaxf1.0 would mean shrinkage only when heap is 100% free
Would completely remove shrinkage capability
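Because both options are expressed as fractions of free heap, the occupancy bounds are their complements. A short sketch of that conversion, using the defaults quoted above; the class and method names are hypothetical.

```java
/** Converts -Xminf / -Xmaxf free-space fractions into occupancy bounds. */
final class OccupancyWindow {
    /** Occupancy below which shrinkage is considered: 1 - Xmaxf. */
    static double lowerOccupancy(double xmaxf) {
        return 1.0 - xmaxf;
    }

    /** Occupancy above which expansion is considered: 1 - Xminf. */
    static double upperOccupancy(double xminf) {
        return 1.0 - xminf;
    }

    public static void main(String[] args) {
        // Defaults from the slide: -Xmaxf0.6 and -Xminf0.3
        System.out.printf("Shrink below %.0f%% occupancy%n", 100 * lowerOccupancy(0.6));
        System.out.printf("Expand above %.0f%% occupancy%n", 100 * upperOccupancy(0.3));
        // -Xmaxf1.0 puts the lower bound at 0% occupancy, so shrinkage never triggers.
        System.out.printf("With -Xmaxf1.0: shrink below %.0f%% occupancy%n", 100 * lowerOccupancy(1.0));
    }
}
```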


Introduction to -Xmaxe and -Xmine


The -Xmaxe and -Xmine settings control the bounds of the size of
each expansion step
-Xmaxe: the maximum amount of memory to add to the heap
size in the case of expansion (default is unlimited)
-Xmine: the minimum amount of memory to add to the heap
size in the case of expansion (default is 1MB)

Can be used to reduce/prevent compaction due to expansion


Reduce expansions by setting a large -Xmine
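The effect of the two bounds on a single expansion step can be sketched as a clamp. How the collector arrives at the amount it wants to add is not covered here, so that value is simply a parameter; treating 0 as "unlimited" for the upper bound is a convention of this sketch, and all names are hypothetical.

```java
/** Sketch: bounding one heap expansion step with -Xmine and -Xmaxe. */
final class ExpansionStep {
    static final long MB = 1024L * 1024L;

    /**
     * @param requestedBytes amount the collector would like to add
     * @param xmineBytes     minimum expansion step (-Xmine)
     * @param xmaxeBytes     maximum expansion step (-Xmaxe); 0 means unlimited here
     */
    static long clamp(long requestedBytes, long xmineBytes, long xmaxeBytes) {
        long step = Math.max(requestedBytes, xmineBytes);
        if (xmaxeBytes > 0) {
            step = Math.min(step, xmaxeBytes);
        }
        return step;
    }

    public static void main(String[] args) {
        // A large -Xmine turns many small expansions into fewer, larger ones.
        System.out.println("Step with 1 MB minimum:  " + clamp(MB / 2, 1 * MB, 0) / MB + " MB");
        System.out.println("Step with 64 MB minimum: " + clamp(MB / 2, 64 * MB, 0) / MB + " MB");
    }
}
```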


GC Managed Heap Sizing

[Figure: heap occupancy plotted as memory against time, below the Heap Size line. Above the -Xminf bound: Too Frequent Garbage Collection, leading to Expansion (>= -Xmine). Below the -Xmaxf bound: Long Garbage Collection Cycles.]


Fixed or Variable??
Again, dependent on application

For flat memory usage, use fixed


For widely varying memory usage, consider variable

Variable provides more flexibility and ability to avoid OutOfMemoryErrors
Some of the disadvantages can be avoided:
-Xms set to lowest steady state memory usage prevents
expansion at startup
-Xmaxf1 will remove shrinkage
-Xminf can be used to prevent compaction before
expansion
-Xmine can be used to reduce expansions


Heap Sizing for Generational GC


Options Are:
Fix both nursery and tenured space
Allow them to expand/contract

General Advice:
Fix the new space size
Size the tenured space as you would for a flat heap


Sizing the Nursery


Copying from Allocate to Survivor or to Tenured space is expensive
Physical data is copied (similar to compaction, which is also expensive)
Ideally survival rates should be as low as possible
Less data needs to be copied
Fewer tenured/global collects will occur

The larger the nursery:


the greater the time between collects
the fewer objects that should survive
However, the longer a copy can potentially take

Recommendation is to have a nursery as large as possible


Whilst not being so large that nursery collect times affect the
application responsiveness


Summary
GC Policy should be chosen according to application scenario

Java heap should ideally be sized for between 40 and 70% occupancy

Min=Max heap size is right for some applications, but not for others


Additional WebSphere Product Resources


Discover the latest trends in WebSphere Technology and implementation, participate in
technically-focused briefings, webcasts and podcasts at:
http://www.ibm.com/developerworks/websphere/community/

Learn about other upcoming webcasts, conferences and events:


http://www.ibm.com/software/websphere/events_1.html
Join the Global WebSphere User Group Community: http://www.websphere.org
Access key product show-me demos and tutorials by visiting IBM Education Assistant:
http://www.ibm.com/software/info/education/assistant

View a Flash replay with step-by-step instructions for using the Electronic Service
Request (ESR) tool for submitting problems electronically:
http://www.ibm.com/software/websphere/support/d2w.html
Sign up to receive weekly technical My support emails:
http://www.ibm.com/software/support/einfo.html


Additional Java Product Resources


Obtain Java Documentation:
https://www.ibm.com/developerworks/java/jdk/docs.html

Download the IBM Java SDKs:


https://www.ibm.com/developerworks/java/jdk/index.html

Find and download Java tooling:


http://www.ibm.com/software/websphere/events_1.html
Troubleshoot Java with the IBM Guided Activity Assistant:
http://www-01.ibm.com/support/docview.wss?uid=swg27010135
Troubleshoot Java with the Guided Troubleshooting InfoCenter
http://publib.boulder.ibm.com/infocenter/javasdk/tools/topic/com.ibm.java.doc.tools.welcome/tools/welcome/welcome.html
Discuss IBM Java:
http://www.ibm.com/developerworks/forums/forum.jspa?forumID=367


Questions and Answers
