Java Performance Tuning

Atthakorn Chanthong

What is software tuning?

User Experience
The software has poor response time. I need it to run faster.

Software tuning means making an application run faster.

Many people think Java applications are slow. Why?

There are two major reasons.

The first is the bottleneck.

The Bottleneck
Increased memory use
Lots of casts
Automatic memory management by the garbage collector
All objects are allocated on the heap
A Java application is not native

The second is bad coding practice.

How do we make it run faster?

The bottleneck is unavoidable.

But a developer can follow good coding practice.

A good design.

A good coding practice.

A Java application normally runs fast enough.

So the tuning game comes into play.

Knowing the strategy.

Tuning Strategy
1 Identify the main causes
2 Choose the quickest and easiest one to fix
3 Fix it, and repeat for the other root causes

Inside the strategy.

Tuning Strategy
1 Profile and measure; set the problem priority
2 Identify the location of the bottleneck
3 Think of a hypothesis
4 Create a test scenario
5 Alter the code
6 Test and compare before/after the alteration
7 Still bad? Yes: the result isn't good enough, so repeat. No: it's better; if it needs to be faster still, repeat again
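The measure, alter, and compare steps of the strategy can be sketched with a tiny timing helper. This is only a rough illustration and not a substitute for a profiler: the names TimingSketch and timeMillis are made up for this example, and single-run wall-clock numbers like these are noisy.

```java
final class TimingSketch {
    /** Runs the task the given number of rounds and returns the elapsed milliseconds. */
    static long timeMillis(int rounds, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        final int rounds = 100_000;
        final String name = "Smith";

        // "Before" scenario: build the string with repeated concatenation.
        long before = timeMillis(rounds, () -> {
            String x = "Hello";
            x += " Mr.";
            x += name;
        });

        // "After" scenario: the altered code under test.
        long after = timeMillis(rounds, () -> {
            StringBuilder sb = new StringBuilder("Hello");
            sb.append(" Mr.").append(name);
        });

        System.out.println("before: " + before + " ms, after: " + after + " ms");
    }
}
```

Running both scenarios through the same helper keeps the comparison fair; if the "after" number is not clearly better, the hypothesis was probably wrong and the cycle starts again.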

How do we measure software performance?

We use a profiler.

A profiler is a programming tool that tracks the performance of another computer program.

The two common uses of a profiler when analyzing a software problem:
Profile application performance
Monitor application memory usage

How do we get a profiler?

Don't pay for it!

Open source profilers are all around.

Some interesting open source profilers.

Opensource Profiler

JConsole

http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html

Opensource Profiler

Eclipse TPTP

http://www.eclipse.org/tptp/index.php

Opensource Profiler
NetBeans Built-in Profiler

http://profiler.netbeans.org/

Opensource Profiler
VisualVM
This is pulled out from NetBeans to act as a standalone profiler

https://visualvm.dev.java.net/

And many more …

Opensource Profiler
JRat
Cougaar
DrMem
InfraRED
Profiler4j
Jmeasurement
TIJMP

We love open source.

Make the brain smart with good coding practice.

1st Rule
Avoid Object-Creation

Why does object creation cause problems?

Avoid Object-Creation
Lots of objects in memory means the GC does lots of work
The program slows down when the GC starts

Avoid Object-Creation
Creating an object costs the application time and CPU effort

Reuse objects where possible.

Pool Management
Most container objects (e.g. Vector) can be reused rather than created and thrown away.

Pool Management
VectorPoolManager keeps the pooled vectors V1 V2 V3 V4 V5, handed out by getVector() and given back by returnVector().

Pool Management

Without a pool:
    public void doSome() {
        for (int i = 0; i < 10; i++) {
            Vector v = new Vector();
            … do vector manipulation stuff
        }
    }

With a pool:
    public static VectorPoolManager vectorPoolManager = new VectorPoolManager(25);

    public void doSome() {
        for (int i = 0; i < 10; i++) {
            Vector v = vectorPoolManager.getVector();
            … do vector manipulation stuff
            vectorPoolManager.returnVector(v);
        }
    }
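A pool along these lines might look like the following minimal sketch. It is not the VectorPoolManager from the slides (whose body is not shown); the class ListPoolSketch and its methods getList, returnList, and available are hypothetical names, and the sketch is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ListPoolSketch {
    // Lists that are currently free to be handed out.
    private final Deque<List<String>> free = new ArrayDeque<>();

    ListPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            free.push(new ArrayList<>());
        }
    }

    /** Hands out a pooled list, or creates a new one if the pool is empty. */
    List<String> getList() {
        List<String> list = free.poll();
        return (list != null) ? list : new ArrayList<>();
    }

    /** Clears the list and puts it back so a later caller can reuse it. */
    void returnList(List<String> list) {
        list.clear();
        free.push(list);
    }

    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        ListPoolSketch pool = new ListPoolSketch(25);
        for (int i = 0; i < 10; i++) {
            List<String> v = pool.getList();
            v.add("item " + i);      // ... do list manipulation stuff
            pool.returnList(v);      // the same instance is reused next round
        }
        System.out.println("lists still pooled: " + pool.available());
    }
}
```

Clearing the list on return is a design choice: it costs a little on every return, but it keeps stale data from leaking to the next user.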

Canonicalizing Objects
Replace multiple objects with a single object, or just a few.

Canonicalizing Objects
    public class VectorPoolManager {
        // Created lazily in getVector(), so the field cannot be final
        private static VectorPoolManager poolManager;
        private Vector[] pool;
        ...

        private VectorPoolManager(int size) {
            ...
        }

        public static Vector getVector() {
            if (poolManager == null)
                poolManager = new VectorPoolManager(20);
            ...
            // Static method: reach the pool through the singleton instance
            return poolManager.pool[poolManager.pool.length-1];
        }
    }
Singleton Pattern

Canonicalizing Objects
    Boolean b1 = new Boolean(true);
    Boolean b2 = new Boolean(false);
    Boolean b3 = new Boolean(false);
    Boolean b4 = new Boolean(false);
4 objects in memory

    Boolean b1 = Boolean.TRUE;
    Boolean b2 = Boolean.FALSE;
    Boolean b3 = Boolean.FALSE;
    Boolean b4 = Boolean.FALSE;
2 objects in memory

Canonicalizing Objects
    String string = "55";
    Integer theInt = new Integer(string);
No Cache

    String string = "55";
    Integer theInt = Integer.valueOf(string);
Object Cached

Canonicalizing Objects
    private static class IntegerCache {
        private IntegerCache(){}
        static final Integer cache[] = new Integer[-(-128) + 127 + 1];
        static {
            for(int i = 0; i < cache.length; i++)
                cache[i] = new Integer(i - 128);
        }
    }

    public static Integer valueOf(int i) {
        final int offset = 128;
        if (i >= -128 && i <= 127) { // must cache
            return IntegerCache.cache[i + offset];
        }
        return new Integer(i);
    }
Caching inside Integer.valueOf(…)
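The effect of this cache can be observed with a reference comparison. The sketch below is illustrative only; CanonicalSketch and sameInstance are made-up names. It relies on Integer.valueOf(int) returning cached instances for values from -128 to 127 and on Boolean.valueOf(boolean) returning the Boolean.TRUE and Boolean.FALSE constants.

```java
final class CanonicalSketch {
    /** True when both references point to the very same object. */
    static boolean sameInstance(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String string = "55";

        // Parse first, then go through the caching factory method.
        Integer first = Integer.valueOf(Integer.parseInt(string));
        Integer second = Integer.valueOf(55);
        System.out.println(sameInstance(first, second));          // true: 55 is inside the cached range

        // Boolean has only two canonical instances.
        Boolean flag = Boolean.valueOf(false);
        System.out.println(sameInstance(flag, Boolean.FALSE));    // true
    }
}
```

Values outside the cached range may or may not be shared, so code should still compare boxed numbers with equals() rather than ==.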

Keyword, 'final'
Use the final modifier on a variable so that its reference cannot be re-assigned.

Keyword, 'final'
    public void doSome(Dimension width, Dimension height) {
        // Re-assignment allowed
        width = new Dimension(5, 5);
        ...
    }

    public void doSome(final Dimension width, final Dimension height) {
        // Re-assignment disallowed (compile error)
        width = new Dimension(5, 5);
        ...
    }
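One point worth illustrating: final protects the reference, not the object behind it. The sketch below uses a StringBuilder instead of Dimension to stay free of AWT; FinalSketch and describe are hypothetical names.

```java
final class FinalSketch {
    static String describe(final StringBuilder size) {
        // size = new StringBuilder("9x9");   // would not compile: final parameter
        size.append(" px");                    // allowed: the object itself is still mutable
        return size.toString();
    }

    public static void main(String[] args) {
        StringBuilder size = new StringBuilder("5x5");
        System.out.println(describe(size));    // 5x5 px
        System.out.println(size);              // 5x5 px (the caller sees the mutation)
    }
}
```

If true immutability is the goal, the class itself has to be designed without mutators; final on the variable alone is not enough.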

Auto-Boxing/Unboxing
Use auto-boxing only when needed, not all the time.

Auto-Boxing/UnBoxing
    Integer i = 0;
    //Counting to 100M
    while (i < 100000000) {
        i++;
    }
Takes 2313 ms

    int p = 0;
    //Counting to 100M
    while (p < 100000000) {
        p++;
    }
Takes 125 ms

Why does it take 2313/125 =~ 20 times longer?

Auto-Boxing/UnBoxing

An object is created every time we wrap a primitive by boxing.
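As a small illustration of where the hidden boxing happens, the two methods below compute the same sum; only the type of the accumulator differs. BoxingSketch, sumBoxed, and sumPrimitive are names invented for this sketch, and no timing is claimed here.

```java
final class BoxingSketch {
    /** Boxed accumulator: each += unboxes, adds, and boxes the result again. */
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    /** Primitive accumulator: plain arithmetic, no wrapper objects involved. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));      // 499500
        System.out.println(sumPrimitive(1000));  // 499500
    }
}
```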

2nd Rule
Knowing String Better

String is the most used object in an application.

Overlook the String and the software may perform poorly.

Compile-Time String Initialization
Use the string concatenation (+) operator to create Strings at compile-time.

Compile-Time Initialization
    for (int i = 0; i < loop; i++) {
        //Looping 10M rounds
        String x = "Hello" + "," + " " + "World";
    }
Takes 16 ms

    for (int i = 0; i < loop; i++) {
        //Looping 10M rounds
        String x = new String("Hello" + "," + " " + "World");
    }
Takes 672 ms

Runtime String Initialization
Use StringBuffer/StringBuilder to create Strings at runtime.

Runtime String Initialization
    String name = "Smith";
    for (int i = 0; i < loop; i++) {
        //Looping 1M rounds
        String x = "Hello";
        x += " Mr.";
        x += name;
        x += ".";
    }
Takes 10298 ms

    String name = "Smith";
    for (int i = 0; i < loop; i++) {
        //Looping 1M rounds
        String x = (new StringBuffer()).append("Hello")
                       .append(" ").append(name).append(".").toString();
    }
Takes 6187 ms
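For single-threaded code, the same idea can be written with StringBuilder, the unsynchronized sibling of StringBuffer. The sketch below only shows that both styles produce the same text; GreetingSketch and its two methods are hypothetical names and no timing is implied.

```java
final class GreetingSketch {
    /** Repeated += : every step creates an intermediate String. */
    static String greetWithConcat(String name) {
        String x = "Hello";
        x += " Mr.";
        x += name;
        x += ".";
        return x;
    }

    /** One builder, one final String. */
    static String greetWithBuilder(String name) {
        return new StringBuilder()
                .append("Hello")
                .append(" Mr.")
                .append(name)
                .append(".")
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(greetWithConcat("Smith"));   // Hello Mr.Smith.
        System.out.println(greetWithBuilder("Smith"));  // Hello Mr.Smith.
    }
}
```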

String comparison
Use the appropriate method to compare Strings.

To Test String is Empty
    for (int i = 0; i < loop; i++) {
        //10m loops
        if (a != null && a.equals("")) { }
    }
Takes 125 ms

    for (int i = 0; i < loop; i++) {
        //10m loops
        if (a != null && a.length() == 0) { }
    }
Takes 31 ms

If two strings have the same length
    String a = "abc";
    String b = "cdf";
    for (int i = 0; i < loop; i++) {
        if (a.equalsIgnoreCase(b)) { }
    }
Takes 750 ms

    String a = "abc";
    String b = "cdf";
    for (int i = 0; i < loop; i++) {
        if (a.equals(b)) { }
    }
Takes 125 ms

If two strings have different length
    String a = "abc";
    String b = "cdfg";
    for (int i = 0; i < loop; i++) {
        if (a.equalsIgnoreCase(b)) { }
    }
Takes 780 ms

    String a = "abc";
    String b = "cdfg";
    for (int i = 0; i < loop; i++) {
        if (a.equals(b)) { }
    }
Takes 858 ms

String.equalsIgnoreCase() needs only 2 steps when the lengths differ: it checks for identity and then for the Strings being the same size.

Intern String
To compare Strings by identity.

Intern String
Normally, a String can be created in two ways.

Intern String
By new String(…)
    String s = new String("This is a string literal.");
By String Literals
    String s = "This is a string literal.";

Intern String
Create Strings by new String(…)
The JVM always allocates a new memory address for each new String created, even if they have the same value.

Intern String
    String a = new String("This is a string literal.");
    String b = new String("This is a string literal.");
a and b each point to their own "This is a string literal." (different memory addresses)

Intern String
Create Strings by Literals
Strings will be stored in a pool
Strings created twice from the same literal share one unique instance

Intern String
    String a = "This is a string literal.";
    String b = "This is a string literal.";
a and b point to the same "This is a string literal." (same memory address)

Intern String
We can point two String variables to the same address if they have the same value, by using the String.intern() method.

Intern String
    String a = new String("This is a string literal.").intern();
    String b = new String("This is a string literal.").intern();
a and b point to the same "This is a string literal." (same memory address)

Intern String
The idea is …
Interned Strings can be compared by identity.

Intern String
What does "compare by identity" mean?

Intern String
    if (a == b)         Identity comparison (by reference)
    if (a.equals(b))    Value comparison

Intern String
Identity comparison is fast because it only compares references.

Intern String
Traditionally, Strings must be compared by equals() to avoid a false negative result.

Intern String
But with interned Strings…
If Strings have different values, they also have different addresses.
If Strings have the same value, they also have the same address.

Intern String
So for interned Strings we can say that (a == b) is equivalent to (a.equals(b)).

Intern String
For these String variables
    String a = "abc";
    String b = "abc";
    String c = new String("abc").intern();
They all point to the same address with the same value.
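The difference between the three ways of obtaining a String can be checked directly with ==. This is a minimal sketch; InternSketch and sameReference are names made up for the illustration.

```java
final class InternSketch {
    /** Identity comparison: true only if both variables refer to one object. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = "abc";                 // literal, taken from the string pool
        String b = "abc";                 // same literal, same pooled instance
        String c = new String("abc");     // always a fresh object
        String d = c.intern();            // pooled instance with the same value

        System.out.println(sameReference(a, b));  // true
        System.out.println(sameReference(a, c));  // false
        System.out.println(sameReference(a, d));  // true
        System.out.println(a.equals(c));          // true: the values are equal
    }
}
```

The second line of output is the trap: as soon as one String in the comparison has not been interned, == can report false for equal values.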

Intern String
    for (int i = 0; i < loop; i++) {
        if (a.equals(b)) { }
    }
Takes 312 ms

    for (int i = 0; i < loop; i++) {
        if (a == b) { }
    }
Takes 32 ms

Intern String
Wow, Intern String is good.
Unfortunately, it makes code hard to understand; use it carefully.

Intern String
String.intern() comes with overhead, as there is a step to cache the String.
Use Intern String if the Strings are planned to be compared two or more times.

char array instead of String
Avoid doing some work through the String object itself for optimal performance.

char array
    String x = "abcdefghijklmn";
    for (int i = 0; i < loop; i++) {
        if (x.charAt(5) == 'x') { }
    }
Takes 281 ms

    String x = "abcdefghijklmn";
    char y[] = x.toCharArray();
    for (int i = 0; i < loop; i++) {
        if ( (20 < y.length && 20 >= 0) && y[20] == 'x') { }
    }
Takes 156 ms

3rd Rule
Exception and Cast

Prevent exceptions from being thrown where possible
Exceptions are really expensive to execute

Avoid Exception
    Object obj = null;
    for (int i = 0; i < loop; i++) {
        try {
            obj.hashCode();
        } catch (Exception e) {}
    }
Takes 18563 ms

    Object obj = null;
    for (int i = 0; i < loop; i++) {
        if (obj != null) {
            obj.hashCode();
        }
    }
Takes 16 ms
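Wrapped in methods, the two styles look like this. Both return the same values; the guarded version simply never enters the exception path. GuardSketch and its method names are hypothetical, and returning 0 for null is an assumption made for the sketch.

```java
final class GuardSketch {
    /** Relies on the exception: a null argument triggers a throw and a catch. */
    static int hashWithCatch(Object obj) {
        try {
            return obj.hashCode();
        } catch (NullPointerException e) {
            return 0;
        }
    }

    /** Checks first: a null argument never causes an exception to be created. */
    static int hashWithCheck(Object obj) {
        return (obj != null) ? obj.hashCode() : 0;
    }

    public static void main(String[] args) {
        System.out.println(hashWithCatch(null));   // 0
        System.out.println(hashWithCheck(null));   // 0
        System.out.println(hashWithCatch("abc") == hashWithCheck("abc"));  // true
    }
}
```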

Cast as Less
We can reduce runtime cost by casting once an object that is used several times.

Cast as Less
    Integer io = new Integer(0);
    Object obj = (Object)io;

    for (int i = 0; i < loop; i++) {
        if (obj instanceof Integer) {
            byte x = ((Integer) obj).byteValue();
            double d = ((Integer) obj).doubleValue();
            float f = ((Integer) obj).floatValue();
        }
    }
Takes 31 ms

    for (int i = 0; i < loop; i++) {
        if (obj instanceof Integer) {
            Integer icast = (Integer)obj;
            byte x = icast.byteValue();
            double d = icast.doubleValue();
            float f = icast.floatValue();
        }
    }
Takes 16 ms

4th Rule
The Rhythm of Motion

Loop Optimization
There are several ways to make a loop faster.

Don't terminate a loop with method calls.

Eliminate Method Call
    byte x[] = new byte[loop];
    for (int i = 0; i < x.length; i++) {
        for (int j = 0; j < x.length; j++) { }
    }
Takes 109 ms

    byte x[] = new byte[loop];
    int length = x.length;
    for (int i = 0; i < length; i++) {
        for (int j = 0; j < length; j++) { }
    }
Takes 62 ms

A method call generates some overhead in the object-oriented paradigm.

Use int to iterate over a loop.

Iterate over loop by int
    for (int i = 0; i < length; i++) {
        for (int j = 0; j < length; j++) { }
    }
Takes 62 ms

    for (short i = 0; i < length; i++) {
        for (short j = 0; j < length; j++) { }
    }
Takes 125 ms

The VM is optimized to use int for loop iteration, not byte, short, or char.

Use System.arraycopy(…) for copying objects instead of running over a loop.

System.arraycopy(…)
    for (int i = 0; i < length; i++) {
        x[i] = y[i];
    }
Takes 62 ms

    // Copies y into x, the same direction as the loop above
    System.arraycopy(y, 0, x, 0, x.length);
Takes 16 ms
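A self-contained version of the two copies is sketched below, mainly to show the argument order of System.arraycopy (source, source position, destination, destination position, length). CopySketch and its method names are invented for the example.

```java
import java.util.Arrays;

final class CopySketch {
    /** Element-by-element copy in a loop. */
    static byte[] copyWithLoop(byte[] src) {
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    /** Bulk copy: src from index 0 into dst from index 0, src.length elements. */
    static byte[] copyWithArraycopy(byte[] src) {
        byte[] dst = new byte[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        byte[] y = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(copyWithLoop(y)));       // [1, 2, 3, 4, 5]
        System.out.println(Arrays.toString(copyWithArraycopy(y)));  // [1, 2, 3, 4, 5]
    }
}
```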

System.arraycopy() is a native function
It is efficient to use

Terminate a loop by comparing against a primitive constant, not a function result or a variable.

Terminate Loop by Primitive
    for(int i = 0; i < countArr.length; i++) {
        for(int j = 0; j < countArr.length; j++) { }
    }
Takes 424 ms

    for(int i = countArr.length-1; i >= 0; i--) {
        for(int j = countArr.length-1; j >= 0; j--) { }
    }
Takes 298 ms

Comparison against a primitive constant is more efficient than comparison against a function result or a variable.

The average time of switch vs. if-else is about equal in the random case.

Switch vs. If-else
    for(int i = 0; i < loop; i++) {
        if (i%10 == 0) { }
        else if (i%10 == 1) { }
        ...
        else if (i%10 == 8) { }
        else if (i%10 == 9) { }
    }

    for(int i = 0; i < loop; i++) {
        switch (i%10) {
            case 0: break;
            case 1: break;
            ...
            case 7: break;
            case 8: break;
            default: break;
        }
    }
The two loops take 2623 ms and 2608 ms.

Switch is quite fast if the case falls in the middle, but slower than if-else when the case falls at the beginning or in the default case.
** Tested against a contiguous range of case values, e.g. 1, 2, 3, 4, ...

Recursive Algorithm
A recursive function is easy to read, but each recursion has a cost.

Tail Recursion
A recursive function for which each recursive call to itself is a reduction of the original call.

Recursive vs. Tail-Recursive
    public static long factorial1(int n) {
        if (n < 2) return 1L;
        else return n*factorial1(n-1);
    }
Takes 172 ms

    public static long factorial1a(int n) {
        if (n < 2) return 1L;
        else return factorial1b(n, 1L);
    }

    public static long factorial1b(int n, long result) {
        if (n == 2) return 2L*result;
        else return factorial1b(n-1, result*n);
    }
Takes 125 ms

Dynamic Cached Recursive
Use a cache to gain more speed.

Dynamic-Cached Recursive
    public static long factorial1(int n) {
        if (n < 2) return 1L;
        else return n*factorial1(n-1);
    }
Takes 172 ms

    public static final int CACHE_SIZE = 15;
    public static final long[ ] factorial3Cache = new long[CACHE_SIZE];

    public static long factorial3(int n) {
        if (n < 2) return 1L;
        else if (n < CACHE_SIZE) {
            if (factorial3Cache[n] == 0)
                factorial3Cache[n] = n*factorial3(n-1);
            return factorial3Cache[n];
        }
        else return n*factorial3(n-1);
    }
Takes 94 ms

Recursion Summary
Dynamic-Cached is better than Tail Recursive, which is better than Recursive.

5th Rule
Use Appropriate Collection

Access
ArrayList vs. LinkedList

Random Access
    ArrayList al = new ArrayList();
    for (int i = 0; i < loop; i++) {
        al.get(i);
    }
Takes 281 ms

    LinkedList ll = new LinkedList();
    for (int i = 0; i < loop; i++) {
        ll.get(i);
    }
Takes 5828 ms

Sequential Access
    ArrayList al = new ArrayList();
    for (Iterator i = al.iterator(); i.hasNext();) {
        i.next();
    }
Takes 1375 ms

    LinkedList ll = new LinkedList();
    for (Iterator i = ll.iterator(); i.hasNext();) {
        i.next();
    }
Takes 1047 ms

ArrayList is good for random access
LinkedList is good for sequential access
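In practice this means matching the traversal style to the list type. The sketch below sums the same LinkedList both ways; the results are identical, but get(i) has to walk the chain of nodes on every call while the iterator just steps to the next node. TraversalSketch and its methods are hypothetical names.

```java
import java.util.LinkedList;
import java.util.List;

final class TraversalSketch {
    /** Random access: cheap on ArrayList, a walk through the nodes on LinkedList. */
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    /** Sequential access: the for-each loop uses the list's iterator. */
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> ll = new LinkedList<>();
        for (int i = 1; i <= 4; i++) {
            ll.add(Integer.valueOf(i));
        }
        System.out.println(sumByIndex(ll));     // 10
        System.out.println(sumByIterator(ll));  // 10
    }
}
```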

Random vs. Sequential Access
    ArrayList al = new ArrayList();
    for (int i = 0; i < loop; i++) {
        al.get(i);
    }
Takes 281 ms

    LinkedList ll = new LinkedList();
    for (Iterator i = ll.iterator(); i.hasNext();) {
        i.next();
    }
Takes 1047 ms

Random Access is better than Sequential Access

Insertion ArrayList vs. LinkedList

Insertion at zero index
    ArrayList al = new ArrayList();
    for (int i = 0; i < loop; i++) {
        al.add(0, Integer.valueOf(i));
    }
Takes 328 ms

    LinkedList ll = new LinkedList();
    for (int i = 0; i < loop; i++) {
        ll.add(0, Integer.valueOf(i));
    }
Takes 109 ms

LinkedList does insertion better than ArrayList.

Vector is similar to ArrayList, but it is the synchronized version.

Access and Insertion
Vector vs. ArrayList

Random Access
    ArrayList al = new ArrayList();
    for (int i = 0; i < loop; i++) {
        al.get(i);
    }
Takes 281 ms

    Vector vt = new Vector();
    for (int i = 0; i < loop; i++) {
        vt.get(i);
    }
Takes 422 ms

Sequential Access
    ArrayList al = new ArrayList();
    for (Iterator i = al.iterator(); i.hasNext();) {
        i.next();
    }
Takes 1375 ms

    Vector vt = new Vector();
    for (Iterator i = vt.iterator(); i.hasNext();) {
        i.next();
    }
Takes 1890 ms

Insertion
    ArrayList al = new ArrayList();
    for (int i = 0; i < loop; i++) {
        al.add(0, Integer.valueOf(i));
    }
Takes 328 ms

    Vector vt = new Vector();
    for (int i = 0; i < loop; i++) {
        vt.add(0, Integer.valueOf(i));
    }
Takes 360 ms

Vector is slower than ArrayList in every method
Use Vector only if synchronization is needed

Summary (times in ms)
Type         Random (get)   Sequential (Iterator)   Insertion
ArrayList    281            1375                    328
LinkedList   5828           1047                    109
Vector       422            1890                    360

Addition and Access
Hashtable vs HashMap

Addition
    Hashtable ht = new Hashtable();
    for (int i = 0; i < loop; i++) {
        ht.put(Integer.valueOf(i), Integer.valueOf(i));
    }
Takes 453 ms

    HashMap hm = new HashMap();
    for (int i = 0; i < loop; i++) {
        hm.put(Integer.valueOf(i), Integer.valueOf(i));
    }
Takes 328 ms

Access
    Hashtable ht = new Hashtable();
    for (int i = 0; i < loop; i++) {
        ht.get(Integer.valueOf(i));
    }
Takes 94 ms

    HashMap hm = new HashMap();
    for (int i = 0; i < loop; i++) {
        hm.get(Integer.valueOf(i));
    }
Takes 47 ms

Hashtable is synchronized, so it is slower than HashMap.

Q&A

Reference
O'Reilly Java Performance Tuning, 2nd edition
http://www.javaperformancetuning.com
http://www.glenmccl.com/jperf/

Future Topic
I/O, Logging, and Console Output
Sorting
Threading
Tweak JVM and GC Strategy

The End
