
International Journal of Research in Computer and

Communication Technology, Vol 2, Issue 10, October- 2013

ISSN (Online) 2278- 5841


ISSN (Print) 2320- 5156

Detection of Precise C/C++ Memory Leakage by Diagnosing Heap Dumps using Inter-procedural Flow Analysis Statistics
S. Poornima
SR Engineering College
Warangal, India
Poornima.spa@gmail.com

Dr. C.V. Guru Rao
Professor, Dept. of CSE
SR Engineering College
Warangal, India
guru_rao_cv@hotmail.com

S.P. Anandaraj
Sr. Asst. Prof., Dept. of CSE
SR Engineering College
Warangal, India

Abstract - Memory leaks are time-consuming bugs often
introduced by C++ developers, and detecting them is
tedious. Things get worse if the code was written by
someone else or if the code base is large. The most
difficult coding bugs, such as memory corruption,
reading uninitialized memory and using freed memory,
are hard to recognize and fix because of the delay and
non-determinism between the error and its observed
effect. Detecting memory leaks is challenging
because real-world applications are built on multiple
layers of software frameworks, making it difficult for
a developer to know whether observed references to
objects are legitimate or the cause of a leak. Our aim
is to build a fast and feature-rich C heap analyzer
that helps the user find memory leaks and reduce
memory consumption. With the heap analyzer,
production heap dumps with hundreds of millions of
objects can be analyzed and the retained sizes of
objects calculated quickly [2]. The analyzer also
shows what prevents the garbage collector from
collecting objects and runs a report to automatically
extract heap leak suspects. The heap analyzer helps
users find possible heap leak areas in various C/C++
applications through its context flow analysis and
heap dump analysis. Our approach identifies not just
leaking candidates and their structure, but also
provides aggregate information about the access path
to the leaks.
Keywords: Heap Dumps, Allocation/Deallocation,
Memory Leakage, Dynamic Linkage, Heap linking and
execution, Static and dynamic analysis

I. INTRODUCTION
Memory leaks are challenging to identify
and debug for several reasons. First, the observed
failure may be far removed from the error that
caused it, requiring the use of heap analysis tools
that examine the state of the reachability graph
when a failure occurred [4]. Second, real-world
applications usually make heavy use of several
layers of frameworks whose implementation details
are unknown to the developers debugging

www.ijrcct.org

anandsofttech@gmail.com

encountered memory leaks. Often, these developers


cannot distinguish whether an observed reference
chain is legitimate (such as when objects are kept
in a cache in anticipation of future uses), or
represents a leak. Third, the sheer size of the heap
(large-scale server applications can easily contain
tens of millions of objects) makes manual
inspection of even a small subset of objects
difficult or impossible.
Existing diagnosis tools are either online
or offline. Online tools monitor either the state of
the heap or accesses to objects in it, or both. They
analyze changes in the heap over time to detect
leak candidates, which are stale objects that have
not been accessed for some time. Online tools are
not widely used in production environments, in part
because their overhead can make them too
expensive, but also because the need to debug
memory leaks often occurs unexpectedly after an
upgrade or change to a framework component, and
often when developers believe their code has been
sufficiently tested. Offline tools use heap
snapshots, often obtained post-mortem when the
system runs out of memory. These tools find leak
candidates by analyzing the relationships, types,
and sizes of objects and reference chains. Most
existing heuristics, however, are based solely on
the amount of memory an object retains and ignore
structural information. Where structural
information is taken into account, it often relies on
prior knowledge of the application and its libraries
[10].
Languages with explicit memory
management require the programmer to manually
deallocate memory blocks that are no longer
needed by the program, and memory leaks are a
common problem in code written in such
languages. This paper mainly addresses the
detection of memory leaks via heap dumps and their
consequences. Heap areas are defined by objects,


arrays and classes [7]. During program
execution, the garbage collector allocates areas of
storage in the heap. An object continues to be
active while a reference to it exists somewhere in
the active state; the object is then reachable.
When an object ceases to be referenced from the
active state, it becomes garbage and can be
reclaimed for reuse [11]. When this reclamation
occurs, the garbage collector must process a
possible finalizer and also ensure that any internal
memory resources that are associated with the
object are returned to the pool of such resources.
A heap dump is a snapshot of the memory of a
process at a certain point in time.
The snapshot contains
information about the C++ objects and classes in the
allocated heap, together with their various types of
data, at the moment the dump is triggered [2] [6].
Because it captures a single moment, a heap dump does
not contain information about when and where (in
which method) an object was allocated in the program.

A. Necessity of Heap Dumps:
- If a system crashes sporadically with an
OutOfMemoryError, analyzing an automatically
written heap dump can be a very easy way to find
the root cause of the problem.
- A heap dump helps in analyzing the memory
footprint of a user application. It shows which
structures are the biggest, reveals redundant data
structures, finds space wasted in unused
collections, and more.
- Most leak detection techniques depend on an
analysis of object activity, such as allocation and
garbage collection, which is complex to implement
using heap dumps. Such an analysis helps the user
find the reason why too many garbage objects are
produced during a certain operation.
The criteria above define the requirements for
heap dumps in memory.

II. MEMORY ANALYZER
When a user program is scheduled for execution,
it is loaded into memory. While the program is
loaded, its memory is structured into three
types of segments: the text
segment, the stack segment, and the heap segment
[2]. The text segment holds the machine instructions
of the program that are executed, including all
user-defined and system-defined functions. The text
segment is also termed the code segment, because the
program's compiled code resides in it. The executable
program generated by the compiler (gcc) is laid out
in memory as shown in Fig.1.

High address
Stack
Heap
Uninitialized data (initialized to zero by exec)
Initialized data (read from program file by exec)
Text (read from program file by exec)
Low address

Fig.1 Memory Organization of a Typical Program
Fig.1 shows the memory organization of a
typical program, which consists of the following:
- The executable (binary) code, residing in the
code segment.
- The data segment, partitioned into the initialized
data segment, which holds all initialized global,
static and constant data, and the uninitialized
data segment (BSS), which holds uninitialized
variables.
- The heap, from which the calloc and malloc
functions allocate memory at run time. Whenever
the heap has to grow, the calloc and malloc
functions are used again.
- The stack, which stores the local variables
defined in the program and is used to pass
arguments to functions together with the return
address, that is, the address of the instruction
to be executed after the function call is over.


At runtime, the stack and the heap occupy
opposite ends of the process's virtual address space.
The stack grows automatically when it is accessed,
up to a size set by the kernel, which can be adjusted
with setrlimit(RLIMIT_STACK). The heap grows when
the allocator invokes the brk() or sbrk() system
calls, which map more pages of memory into the
process's virtual address space.
The implementation of both the stack and the heap is
usually left to the runtime or the operating system.
Games and other performance-critical applications,
for example, often implement their own memory
allocators: they obtain a large chunk of memory
from the heap and manage it internally to avoid
depending on the operating system for memory
allocation.
A. Criteria for Creating and Accessing Heap
Dumps
The heap contains a linked list of used and
free blocks. New allocations on the heap (by new
or malloc) are satisfied by creating a suitable block
from one of the free blocks [14]. This requires
updating the list of blocks on the heap. This
meta-information about the blocks on the heap is also
stored on the heap, often in a small area just in front
of every block [10].
In other words, the heap is an ADT built on a
linked list of used and free segments. When
malloc serves a new allocation, it constructs a
segment of the requested size from the available
memory, so the list must be updated on every
allocation and deallocation.
The metadata recorded in a heap dump
is stored in the allocated area of memory
before each block.
While structuring the heap, the following
considerations apply:
- The heap size is set when the application starts,
but it can grow as space is required: the
allocator then requests more memory from the OS.
- Heaps are data structures stored in the
computer's main memory.
- Heap variables must be destroyed manually,
because they never go out of scope. The data is
freed by using delete, delete[] or the free
function.
- Allocating a heap variable takes more time than
allocating a variable on the stack.
- Blocks of data required by user programs are
allocated on demand. The heap can become
fragmented when large numbers of allocations
and deallocations occur.
- The heap is highly recommended when the user
does not know the size of the data needed at
runtime.

For example, consider the following example code
on the heap ADT:
#include <stdio.h>
#include <stdlib.h>

int foo(int z);

int x; /* static storage */

int main(void)
{
    int y; /* automatic (stack) storage */
    char *str;
    str = malloc(100); /* allocates 100 bytes of dynamic heap storage */
    y = foo(23);
    free(str); /* deallocates the 100 bytes of dynamic heap storage */
    return 0;
}

int foo(int z)
{
    char ch[100]; /* ch is automatic stack storage */
    if (z == 40)
        foo(9);
    return 3; /* z and ch are deallocated from the stack and 3 is returned to the caller */
}


Fig.2 Representation of Compiling and Linking of heap dumps During Program Execution

B. Heap Dump Memory Analysis
In most C applications, system
performance depends considerably on memory
consumption. Memory leaks are one of the most
common memory problems and are responsible for
decreased performance. When a C/C++ application
runs with garbage collection (GC), memory leaks
should in principle not occur, because the GC cleans
up unused objects that are no longer referenced.
However, objects that are no longer used but are
still referenced are not removed by the GC, and this
is how memory leaks arise. Apart from memory leaks,
other memory problems are encountered in
fragmentation of memory, object invocation, and
tuning. Such problems can cause the application to
terminate with an OutOfMemory exception.
Consider the following example, in which the str
variable is allocated 540 bytes, a large chunk of
memory:

char *str = (char *) malloc(540);
return;

In the above code, the character pointer
str is declared and allocated, but it is never freed.
This kind of coding often leads to memory leaks, and
if it occurs frequently, the application runs out of
memory and terminates prematurely, which is called
a crash.

1) Mismatched Allocation/Deallocation
The allocation/deallocation error occurs
when memory is deallocated with a function that
does not match the function used to allocate it.

char *s = (char *) malloc(5);
delete s; /* mismatch: s was allocated with malloc */

This error can be avoided by using the
matching deallocation function. Memory obtained
from malloc is released with free. In C++, memory
allocated with new[] is freed with delete[].
2) Missing allocation
In a program, freeing memory that has
already been freed is called a missing allocation
error, also named a repeated free error or double
free error.
char *pstr = (char *) malloc(20);
free(pstr);
free(pstr); /* results in missing allocation */
3) Uninitialized Memory Access
Whenever a variable or memory block is read
in the program before it has been initialized, the
error is called uninitialized memory access.
char *str = (char *) malloc(512);
char d = str[0]; /* uninitialized read of heap memory */
void val()
{
int p;
int q = p * 4; /* uninitialized read of variable p */
}
This error can be avoided by always initializing
variables before they are read.
4) Cross Stack Access
This error occurs when a thread accesses the
stack memory of a different thread.
void main()
{
int *x;
CreateThread(., thread#1,.);
CreateThread(.,thread#2,.);
}
Thread #1
{
int y[1024];
x = y; /* x now points into the stack of thread #1 */
y[0] = 1;
}
Thread #2
{
*x = 2; /* stack crossed: writes to the stack of thread #1 */
}
Cross stack accesses can be avoided by keeping shared
data in global variables instead of on a thread's stack.
III. MEMORY LEAKS
In computer science, a memory leak
(commonly called leakage) occurs when a program
consumes memory during execution and is unable
to release it. A memory leak can normally be
detected and analyzed only by programmers with
access to the source code.
A memory leak reduces the performance of the
computer by increasing memory consumption.
In the worst case, too much of the available memory
becomes allocated, so that parts of the system or
its resources stop working properly, the application
fails, or the system slows down due to thrashing [15].
A memory leak occurs when dynamically allocated
memory becomes unreachable. Garbage collection,
which can be integrated into a programming
language as a built-in feature, provides a solution
to this problem.


The following example C code demonstrates a
memory leak caused by losing the pointer to the
allocated memory.
#include <stdlib.h>
int main(void)
{
/* this is an infinite loop calling the malloc function, which
 * allocates the memory but without saving the address of the
 * allocated place */
while (malloc(50)); /* malloc will return NULL sooner or later, due to lack of memory */
return 0; /* the allocated memory is freed by the operating system itself after the program exits */
}
The memory allocation function, malloc(), is called
repeatedly inside the loop without its return value
being saved. Because the address of each allocation
is not stored anywhere, the previously allocated
blocks can never be freed. The loop ends only when
malloc() fails because no more memory is available
to the user program. Note that the operating system
may delay the actual allocation of memory until
the memory is used.
IV. EXPERIMENTATION
The goal is to thoroughly evaluate the
behaviour of user programs in the
programming environment. To obtain meaningful
results, the garbage collector is compiled with its
internal debugging mechanisms enabled, and evaluation
flags are set to instruct the collector to
generate statistics about the program. With this
procedure, the results are realistic and the
required statistics are produced.
The compiler (GCC version 4.1.0) is configured
for compilation with the following options:
> home/gcc-repo/configure --prefix=home/gccprefix --enable-languages=c,c++,java
Searching for memory leaks with tcmalloc is very
simple: the program is linked with this
library and run as in the following example:
# HEAPCHECK=normal ./your-program
To compile and execute programs with a version of
GCJ that uses the collector as the sole garbage
collector, we run the following command.


# LD_PRELOAD=/usr/lib/libtcmalloc.so.0.0.0 HEAPCHECK=normal ./your-program
When the program is executed with the library loaded,
it generates a report about the memory leaks identified.
LD_PRELOAD is an environment variable that makes the
dynamic linker load the named library before all
others when the program starts, so the user program
does not have to be relinked. It is set on the command
line that runs the user program. For the purpose of
experimentation, the program is linked and
executed under various settings.
During program execution, the linked library uses
the following environment variables, which control
the checks that are performed:
- HEAP_CHECK_REPORT: true or false, by default
true; defines whether the report is printed.
- HEAP_CHECK_STRICT_CHECK: true or false, by
default true; selects the function used for
checking, SameHeap or NoLeaks.
- HEAP_CHECK_IDENTIFY_LEAKS: true or false, by
default false; obtains the addresses of leaked
objects.
- HEAP_CHECK_TEST_POINTER_ALIGNMENT: true or
false, by default false; identifies memory leaks
due to non-aligned pointers.
- PPROF_PATH: specifies the path to the pprof
utility.
- HEAP_CHECK_DUMP_DIRECTORY: specifies the
directory in which temporary files are created.

The sample output of a user program is shown in
Fig.3, which lists the heap memory leaks that
occurred in the given user program, together with
the data objects related to them.

Fig.3 Sample Generated Report for Identification of Heap memory Leak in user Program

The HEAPCHECK environment variable
sets the level of checks that will be applied during
execution. This variable can have one of four values,
minimal, normal, strict and draconian, ordered from
the simplest to the strictest; the strictest levels
can slow down execution of the program. Besides
these, there are also
two additional modes: as-is, in which the user can
specify which checks should be executed, and local,
in which checks are performed only for code that is
explicitly marked for checking (this is done
by adding calls to GPT's functions to the source code).
After a memory leak is found (as in
our example above), the library terminates the program
and prints the call stack of the functions that led to
this memory leak. In our example, the memory leak is in
the main function, at line 106 of the file testhashes.cpp.
The heap checker automatically prints
basic leak information with stack traces of the leaked
objects' allocation sites, as well as a pprof command
line that can be used to visualize the call graph
involved in these allocations. The latter can be much
more useful for a human to see where and why the leaks
happened, especially if the leaks are numerous.
V. CONCLUSION
A novel memory leak detection algorithm
is presented, based on solving Boolean flag
criteria. Performance and scalability reach
the expected levels by setting Boolean flag
definitions for each function in order to generate
precise statistics. The results show that the system
generates realistic statistical information about
the heap memory leaks identified. The results also
show that the approach can be used for on-the-fly
detection of memory leaks and memory corruption
during production runs.
Moreover, a new
methodology is presented that uses memory
usage behaviour analysis for heap memory leakage
detection.
This work can be extended by
comparing it with other tools and by investigating
more debugging problems in real-world
applications that contain well-documented
bugs. Future work will focus on optimizations to
reduce the run-time overhead.

ACKNOWLEDGEMENT
The authors would like to express sincere thanks to the
Department of Science and Technology (New
Delhi) for its financial support to carry out this
work under project grant No. SR/WOS-A/ET24/2012.
Further, they express their sincere gratitude to the
Management and Principal of SR
Engineering College for their support and
encouragement to carry out the research work.

REFERENCES
[1]. D. L. Heine and M. S. Lam. A practical flow-sensitive and context-sensitive C and C++
memory leak detector. In Proceedings of the
ACM SIGPLAN 2003 Conference on
Programming Language Design and
Implementation (PLDI), pages 168-181, Jun
2003.
[2]. P. Zhou, F. Qin, W. Liu, Y. Zhou, and J.
Torrellas. iWatcher: Efficient architecture
support for software debugging. In
Proceedings of the 31st International
Symposium on Computer Architecture (ISCA),
pages 224-237, Jun 2004.
[3]. C++0x/C++11 Support in GCC.
http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt04ch11.html
(ref. 19th May 2012).
[4]. David Drysdale. High-Quality Software
Engineering. Lulu.com, 2007. Mel Gorman.
Understanding the Linux Virtual Memory
Manager. Prentice Hall, 2004.
[5]. D.R. Chase, M. Wegman, and F. Zadeck.
Analysis of pointers and structures. In
SIGPLAN Conf. on Prog. Lang. Design and
Impl., pages 296-310, 1990.
[6]. N.D. Jones and S.S. Muchnick. Flow analysis
and optimization of Lisp-like structures.
In S.S. Muchnick and N.D. Jones, editors,
Program Flow Analysis: Theory and
Applications, chapter 4. Prentice-Hall,
Englewood Cliffs, NJ, 1981.
[7]. Abraham Silberschatz, Peter B. Galvin, and
Greg Gagne. Operating System Concepts.
Wiley, 2008.
[8]. Y. Xie and A. Chou. Path sensitive analysis
using boolean satisfiability. Technical report,
Stanford University, Nov. 2002.
[9]. D. Evans. Static detection of dynamic memory
errors. In Proceedings of the ACM SIGPLAN
1996 Conference on Programming Language
Design and Implementation, 1996.
[10]. Y. Xie and A. Aiken. Scalable error detection
using boolean satisfiability. In Proceedings of
the 32nd Annual Symposium on Principles of
Programming Languages, Jan. 2005.
[11]. D. Liang and M. Harrold. Efficient computation
of parameterized pointer information for
interprocedural analysis. In Proceedings of the
8th Static Analysis Symposium, 2001.
[12]. T. Xie, S. Thummalapenta, D. Lo, and C. Liu.
Data Mining for Software Engineering. IEEE
Computer, Vol. 42(8):35-42, Aug 2009.
[13]. J. Engelfriet and G. Rozenberg. Graph
grammars based on node rewriting: an
introduction to NLC graph grammars. In Graph
grammars and their application to computer
science: 4th Intl. Workshop, pages 12-23, 1991.
[14]. N. Nethercote and J. Seward. Valgrind: a
framework for heavyweight dynamic binary
instrumentation. In PLDI 07, pages 89-100,
2007.
[15]. M. Jump and K. McKinley. Cork: dynamic
memory leak detection for garbage-collected
languages. In POPL 07, pages 31-38, 2007.
[16]. S. Cherem, L. Princehouse, and R. Rugina.
Practical memory leak detection using guarded
value-flow analysis. In PLDI 07, pages
480-491, 2007.
