Understanding Java Garbage Collection


Posted 4 years ago in the Dev Platform category by Sangmin Lee

What are the benefits of knowing how garbage collection (GC) works in Java? Satisfying your
intellectual curiosity as a software engineer is a valid reason, but understanding how GC works
can also help you write much better Java applications.

This is a very personal and subjective opinion of mine, but I believe that a person well versed in
GC tends to be a better Java developer. If you are interested in the GC process, that means you have
experience in developing applications of a certain size. If you have thought carefully about choosing
the right GC algorithm, that means you completely understand the features of the application you
have developed. Of course, this may not be a common standard for a good developer. However, few
would object when I say that understanding GC is a requirement for being a great Java developer.

This is the first of a series of "Become a Java GC Expert" articles. I will cover the GC introduction this
time, and in the next article, I will talk about analyzing GC status and GC tuning examples from
NHN.

The purpose of this article is to introduce GC to you in an easy way. I hope this article proves to be
very helpful. Actually, my colleagues have already published a few great articles on Java
Internals which became quite popular on Twitter. You may refer to them as well.

Returning to garbage collection, there is a term that you should know before learning about
GC. The term is "stop-the-world." Stop-the-world will occur no matter which GC algorithm you choose.
Stop-the-world means that the JVM stops the application from running in order to execute a GC. When
stop-the-world occurs, every thread except for the threads needed for the GC stops its tasks.
The interrupted tasks resume only after the GC task has completed. GC tuning often means
reducing this stop-the-world time.
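
As a rough way to see how much time a JVM spends collecting, the standard java.lang.management API exposes one GarbageCollectorMXBean per collector. The sketch below is only an illustration: the class name GcPauseProbe and the allocation loop are made up for this example, and the reported collection time is an approximate accumulated figure that is not guaranteed to equal pure stop-the-world time for every collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not part of any GC API.
class GcPauseProbe {
    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcMillis();

        // Create short-lived garbage so that some collections are likely to run.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            junk[0] = (byte) i;
            checksum += junk[0];
        }

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("GC time during the loop (ms): " + (totalGcMillis() - before)
                + " (checksum " + checksum + ")");
    }
}
```

The collector names and numbers printed depend on the JVM version, the selected GC, and the heap size, so no particular output should be expected.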

Generational Garbage Collection

In Java, the program code does not explicitly allocate a memory area and then free it. Some people set
the relevant object to null or call the System.gc() method to try to free the memory explicitly. Setting
it to null is not a big deal, but calling the System.gc() method will affect system performance
drastically, and must not be done. (Thankfully, I have not yet seen any developer at NHN calling this method.)

In Java, as the developer does not explicitly free memory in the program code, the garbage
collector finds the unnecessary (garbage) objects and removes them. This garbage collector was
created based on the following two hypotheses. (It is more correct to call them suppositions or
preconditions, rather than hypotheses.)

Most objects soon become unreachable.

References from old objects to young objects only exist in small numbers.

These hypotheses are called the weak generational hypothesis. To take advantage of this
hypothesis, the heap in the HotSpot VM is physically divided into two areas: the young generation
and the old generation.

Young generation: Most newly created objects are located here. Since most objects soon
become unreachable, many objects are created in the young generation and then disappear. When
objects disappear from this area, we say a "minor GC" has occurred.

Old generation: The objects that did not become unreachable and survived the young
generation are copied here. It is generally larger than the young generation. As it is bigger in size,
GC occurs less frequently here than in the young generation. When objects disappear from the old
generation, we say a "major GC" (or a "full GC") has occurred.

Let's look at this in a chart.

Figure 1: GC Area & Data Flow.

The permanent generation in the chart above is also called the "method area," and it stores classes
or interned character strings. So, this area is definitely not a place where objects that survived the old
generation stay permanently. A GC may occur in this area. A GC that takes place here is still
counted as a major GC.

Some people may wonder:

What if an object in the old generation needs to reference an object in the young generation?

To handle these cases, the old generation has something called a "card table," which holds one entry
for each 512-byte chunk of the old generation. Whenever an object in the old generation references an
object in the young generation, it is recorded in this table. When a GC is executed for the young
generation, only this card table is searched to determine whether or not an object is subject to GC,
instead of checking the references of all the objects in the old generation. This card table is managed
with a write barrier. The write barrier is a device that allows faster minor GC performance. Though a
bit of overhead occurs because of this, the overall GC time is reduced.
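
The idea of a card table maintained by a write barrier can be sketched as a small simulation. This is a toy model under stated assumptions, not HotSpot's implementation: the class CardTableSketch and its methods are hypothetical, old-generation memory is modeled as plain byte offsets, and the barrier here marks a card on every reference store into the old generation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: one dirty flag per 512-byte "card" of a simulated old generation.
class CardTableSketch {
    static final int CARD_SIZE = 512;   // bytes of old generation covered by one card
    private final boolean[] dirty;      // one entry per card

    CardTableSketch(int oldGenBytes) {
        dirty = new boolean[(oldGenBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    /** Write barrier: runs when a reference field at this old-generation offset is updated. */
    void writeBarrier(int fieldOffset) {
        dirty[fieldOffset / CARD_SIZE] = true;
    }

    /** What a minor GC would do: visit only the dirty cards, then reset them. */
    List<Integer> dirtyCardsAndClear() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                cards.add(i);
                dirty[i] = false;
            }
        }
        return cards;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096); // 8 cards
        table.writeBarrier(600);   // card 1
        table.writeBarrier(700);   // card 1 again
        table.writeBarrier(3000);  // card 5
        System.out.println("Cards to scan: " + table.dirtyCardsAndClear());  // [1, 5]
        System.out.println("After the scan: " + table.dirtyCardsAndClear()); // []
    }
}
```

Only two of the eight cards need to be scanned in this run, which is the saving the card table is meant to provide.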

Figure 2: Card Table Structure.

Composition of the Young Generation

In order to understand GC, let's learn about the young generation, where objects are created for
the first time. The young generation is divided into 3 spaces.

One Eden space
Two Survivor spaces

There are 3 spaces in total, two of which are Survivor spaces. The order in which the spaces are
used is as follows:

1. The majority of newly created objects are located in the Eden space.
2. After one GC in the Eden space, the surviving objects are moved to one of the Survivor spaces.
3. After each later GC in the Eden space, the surviving objects pile up in the Survivor space where
other surviving objects already exist.
4. Once a Survivor space is full, the surviving objects are moved to the other Survivor space. Then, the
Survivor space that was full is changed to a state where it holds no data at all.
5. The objects that survive after these steps have been repeated a number of times are moved to
the old generation.

As you can see from these steps, one of the Survivor spaces must remain empty. If data exists
in both Survivor spaces, or the usage is 0 for both spaces, then take that as a sign that something is wrong
with your system.
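
The flow of objects from Eden through the Survivor spaces to the old generation can be sketched as a toy simulation. Everything here is an assumption made for illustration: the class YoungGenSketch, the promotion threshold of 3, and the simplification that every minor GC copies the live objects from Eden and the occupied Survivor space into the empty one. Reachability is passed in as a set rather than computed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of Eden, two Survivor spaces, and the old generation.
class YoungGenSketch {
    static final int TENURE_THRESHOLD = 3; // assumed value, for illustration only

    static final class Obj {
        final String name;
        int age; // number of minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // Survivor space that currently holds objects
    List<Obj> to = new ArrayList<>();   // Survivor space kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live objects out of Eden and "from", then swaps the Survivor roles. */
    void minorGc(Set<Obj> reachable) {
        copyLive(eden, reachable);
        copyLive(from, reachable);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copyLive(List<Obj> space, Set<Obj> reachable) {
        for (Obj o : space) {
            if (!reachable.contains(o)) {
                continue; // unreachable objects are simply not copied
            }
            o.age++;
            if (o.age >= TENURE_THRESHOLD) {
                old.add(o);   // survived long enough: promote
            } else {
                to.add(o);
            }
        }
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch();
        Obj a = heap.allocate("a");
        heap.allocate("b"); // never referenced again, so it dies in the first GC
        Set<Obj> live = new HashSet<>(Collections.singletonList(a));
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc(live);
            System.out.println("after GC " + gc + ": eden=" + heap.eden.size()
                    + " from=" + heap.from.size() + " to=" + heap.to.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

In this model one Survivor space is always empty after a minor GC, which matches the invariant described above.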

The process of data piling up in the old generation through minor GCs can be shown as in the
chart below:

Figure 3: Before & After a GC.

Note that in HotSpot VM, two techniques are used for faster memory allocations. One is called
"bump-the-pointer," and the other is called "TLABs (Thread-Local Allocation Buffers)."

The bump-the-pointer technique tracks the last object allocated to the Eden space. That object is
located on top of the Eden space. When a new object is created afterwards, the JVM only checks whether
the size of the object fits in the Eden space. If it does, the object is placed in
the Eden space, and the new object goes on top. So, when new objects are created, only the most recently
added object needs to be checked, which allows much faster memory allocations. However, it is a
different story if we consider a multithreaded environment. To allocate objects from multiple
threads in the Eden space in a thread-safe way, a lock is unavoidable, and performance will
drop due to lock contention. TLABs are the solution to this problem in HotSpot VM. They allow
each thread to have its own small portion of the Eden space. As each
thread can only access its own TLAB, even the bump-the-pointer technique allows memory
allocations without a lock.
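
A minimal sketch of both techniques, assuming a heap modeled as integer offsets: allocation only advances a "top" pointer, and a TLAB is a private sub-range carved out of the shared Eden. The names BumpPointerSketch, Region, and newTlab are hypothetical and chosen for this example; real TLAB sizing and refilling are more involved.

```java
// Toy model of bump-the-pointer allocation and thread-local allocation buffers.
class BumpPointerSketch {
    /** A contiguous range with a single "top" pointer; allocating just advances it. */
    static final class Region {
        private final int end;
        private int top;

        Region(int start, int end) {
            this.top = start;
            this.end = end;
        }

        /** Returns the start offset of the new object, or -1 if it does not fit. */
        int allocate(int size) {
            if (size <= 0 || size > end - top) {
                return -1;
            }
            int address = top;
            top += size; // the "bump"
            return address;
        }
    }

    private final Region eden;

    BumpPointerSketch(int edenSize) {
        eden = new Region(0, edenSize);
    }

    /** Handing out a TLAB touches shared Eden, so only this step is synchronized. */
    synchronized Region newTlab(int tlabSize) {
        int start = eden.allocate(tlabSize);
        return start < 0 ? null : new Region(start, start + tlabSize);
    }

    public static void main(String[] args) {
        BumpPointerSketch heap = new BumpPointerSketch(1024);
        Region tlabA = heap.newTlab(256); // would belong to thread A
        Region tlabB = heap.newTlab(256); // would belong to thread B

        // Inside its own TLAB a thread allocates without taking any lock.
        System.out.println(tlabA.allocate(100)); // 0
        System.out.println(tlabA.allocate(100)); // 100
        System.out.println(tlabA.allocate(100)); // -1, TLAB A is exhausted
        System.out.println(tlabB.allocate(16));  // 256
    }
}
```

The only synchronized method is newTlab, which reflects the point made above: contention is limited to obtaining a buffer, not to each object allocation.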

This has been a quick overview of the GC in the young generation. You do not necessarily have to
remember the two techniques that I have just mentioned. You will not go to jail for not knowing
them. But please remember that objects are first created in the Eden space, and that
long-surviving objects are moved to the old generation through the Survivor spaces.

GC for the Old Generation

The old generation basically performs a GC when it is full of data. The execution procedure varies by
GC type, so it is easier to understand if you know the different types of GC.

As of JDK 7, there are 5 GC types.

1. Serial GC
2. Parallel GC
3. Parallel Old GC (Parallel Compacting GC)
4. Concurrent Mark & Sweep GC (or "CMS")
5. Garbage First (G1) GC

Among these, the serial GC must not be used on a server in operation. This GC type was created when
desktop computers had only one CPU core. Using the serial GC will degrade application
performance significantly.
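
To check which collector flag a running JVM was started with, one option is to read the JVM input arguments through the standard RuntimeMXBean. The sketch below is only an illustration: GcFlagProbe and collectorFlags are hypothetical names, the filter relies on the naming pattern of flags such as -XX:+UseSerialGC and -XX:+UseG1GC, and a JVM started without any such flag will report an empty list even though a default collector is in use.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any GC API.
class GcFlagProbe {
    /** Keeps only arguments shaped like collector flags, e.g. -XX:+UseParallelGC. */
    static List<String> collectorFlags(List<String> jvmArgs) {
        List<String> flags = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:+Use") && arg.endsWith("GC")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Collector flags: " + collectorFlags(jvmArgs));
    }
}
```

Running it as, for example, java -XX:+UseParallelGC GcFlagProbe should list that flag.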

Now let's learn about each GC type.

Serial GC (-XX:+UseSerialGC)

The GC in the young generation uses the approach explained in the previous section. The GC in
the old generation uses an algorithm called "mark-sweep-compact."

1. The first step of this algorithm is to mark the surviving objects in the old generation.
2. Then, it checks the heap from the front and leaves only the surviving objects behind (sweep).
3. In the last step, it fills up the heap from the front with the objects so that the objects are piled
up consecutively, and divides the heap into two parts: one with objects and one without
objects (compact).

The serial GC is suitable for a small amount of memory and a small number of CPU cores.
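
The three steps of mark-sweep-compact can be sketched on a toy heap. This is a simplified model under stated assumptions: the heap is an array of slots, objects hold ordinary Java references to each other (so no address fix-up is needed after moving), and the names MarkSweepCompactSketch, Obj, and collect are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of a mark-sweep-compact cycle on an array of object slots.
class MarkSweepCompactSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Step 1 (mark): flag every object reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) {
                continue;
            }
            o.marked = true;
            for (Obj ref : o.refs) {
                pending.push(ref);
            }
        }
    }

    /** Step 2 (sweep): walk the heap from the front and free unmarked slots. */
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) {
                heap[i] = null;
            }
        }
    }

    /** Step 3 (compact): slide survivors to the front; returns the live count. */
    static int compact(Obj[] heap) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o == null) {
                continue;
            }
            heap[i] = null;
            o.marked = false; // reset for the next cycle
            heap[next++] = o;
        }
        return next;
    }

    static int collect(Obj[] heap, List<Obj> roots) {
        mark(roots);
        sweep(heap);
        return compact(heap);
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(c);                    // a -> c; b and d are garbage
        Obj[] heap = { a, b, null, c, d };
        int live = collect(heap, Arrays.asList(a));
        System.out.println("live objects: " + live); // 2
        for (int i = 0; i < live; i++) {
            System.out.println("slot " + i + ": " + heap[i].name); // a, then c
        }
    }
}
```

After the cycle the heap is split into an occupied front part and a free tail, which is the two-part layout described in step 3.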

Parallel GC (-XX:+UseParallelGC)

Figure 4: Difference between the Serial GC and Parallel GC.

From the picture, you can easily see the difference between the serial GC and the parallel GC. While
the serial GC uses only one thread to process a GC, the parallel GC uses several threads to process
a GC, and is therefore faster. This GC is useful when there is enough memory and a large number of
cores. It is also called the "throughput GC."

Parallel Old GC (-XX:+UseParallelOldGC)

Parallel Old GC has been supported since a JDK 5 update. Compared to the parallel GC, the only difference
is the GC algorithm for the old generation. It goes through three steps: mark, summary, and compaction.
The summary step identifies the surviving objects separately for each area on which the GC has
previously been performed, and is thus different from the sweep step of the mark-sweep-compact
algorithm. Its steps are a little more complicated.

CMS GC (-XX:+UseConcMarkSweepGC)

Figure 5: Serial GC & CMS GC.

As you can see from the picture, the Concurrent Mark-Sweep GC is much more complicated than any
other GC type that I have explained so far. The first step, initial mark, is simple. It only finds the
surviving objects among those closest to the class loader, so the pause time is very
short. In the concurrent mark step, the objects referenced by the surviving objects that have just been
confirmed are tracked and checked. What distinguishes this step is that it proceeds while other
threads are running at the same time. In the remark step, the objects that were newly added or
stopped being referenced during the concurrent mark step are checked. Lastly, in the concurrent sweep
step, the garbage collection procedure takes place. The garbage collection is carried out while
other threads are still running. Since this GC type works in this manner, the pause
time for GC is very short. The CMS GC is also called the low latency GC, and is used when the response
time of all applications is crucial.

While this GC type has the advantage of a short stop-the-world time, it also has the following
disadvantages.

It uses more memory and CPU than other GC types.

The compaction step is not provided by default.

You need to review this carefully before using this type. Also, if the compaction task needs to be carried
out because of heavy memory fragmentation, the stop-the-world time can be longer than with any other
GC type. You need to check how often and how long the compaction task is carried out.

G1 GC

Finally, let's learn about the garbage first (G1) GC.

Figure 6: Layout of G1 GC.

If you want to understand G1 GC, forget everything you know about the young generation and the
old generation. As you can see in the picture, objects are allocated to one area (cell) of the grid, and
then a GC is executed. Then, once one area is full, objects are allocated to another area, and then a GC is
executed. The steps where the data moves from the three spaces of the young generation to the old
generation cannot be found in this GC type. This type was created to replace the CMS GC, which has
caused a lot of issues and complaints in the long term.
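
The name "garbage first" suggests collecting the areas that hold the most garbage before the others. The sketch below illustrates only that selection idea and is an assumption-laden toy, not G1's actual policy: the class GarbageFirstSketch, the Region fields, and the fixed limit on how many areas to pick are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: pick the heap areas with the most reclaimable bytes first.
class GarbageFirstSketch {
    static final class Region {
        final int id;
        final int usedBytes;
        final int liveBytes;

        Region(int id, int usedBytes, int liveBytes) {
            this.id = id;
            this.usedBytes = usedBytes;
            this.liveBytes = liveBytes;
        }

        int garbageBytes() {
            return usedBytes - liveBytes;
        }
    }

    /** Returns up to maxRegions regions, ordered by garbage, most garbage first. */
    static List<Region> chooseCollectionSet(List<Region> regions, int maxRegions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> Integer.compare(b.garbageBytes(), a.garbageBytes()));
        return new ArrayList<>(sorted.subList(0, Math.min(maxRegions, sorted.size())));
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 100, 90)); // 10 bytes of garbage
        regions.add(new Region(1, 100, 10)); // 90 bytes of garbage
        regions.add(new Region(2, 100, 50)); // 50 bytes of garbage
        for (Region r : chooseCollectionSet(regions, 2)) {
            System.out.println("collect region " + r.id
                    + " (garbage " + r.garbageBytes() + ")"); // region 1, then region 2
        }
    }
}
```

Region 0 is left alone in this run because collecting it would reclaim little space for the copying work involved.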

The biggest advantage of the G1 GC is its performance. It is faster than any other GC type that we
have discussed so far. But in JDK 6, it is labeled early access and can be used only for testing. It is
officially included in JDK 7. In my personal opinion, we need to go through a long test period (at
least 1 year) before NHN can use JDK 7 in actual services, so you probably should wait a while. Also,
I have heard a few times that a JVM crash occurred after applying G1 in JDK 6. Please wait until it is
more stable.

I will talk about GC tuning in the next issue, but I would like to ask you one thing in advance. If
the size and the type of all objects created in the application were identical, all the GC options for
the WAS used in our company could be the same. But the size and the lifespan of the objects created by
a WAS vary depending on the service, and the type of equipment varies as well. In other words, just
because a certain service uses the GC option "A," it does not mean that the same option will bring
the best results for a different service. It is necessary to find the best values for the WAS threads,
the WAS instances for each piece of equipment, and each GC option through constant tuning and
monitoring. This did not come from my personal experience, but from a discussion by the engineers
who make the Oracle JVM at JavaOne 2010.

In this issue, we have only glanced at the GC for Java. Please look forward to our next issue, where
I will talk about how to monitor the Java GC status and tune GC.

I would like to note that I referred to a new book released in December 2011 called "Java
Performance" (available on Amazon; it can also be viewed on Safari Online, if your company provides an
account), as well as "Memory Management in the Java HotSpot™ Virtual Machine," a white paper
provided on the Oracle website. (The book is different from "Java Performance Tuning.")

By Sangmin Lee, Senior Engineer at Performance Engineering Lab, NHN Corporation.
