gperftools
Overview
gperftools is a collection of tools for CPU profiling, heap profiling, and heap checking,
along with a high-performance multi-threaded malloc implementation (thread-caching
malloc).
These tools can be used to detect and locate memory leaks in C/C++ programs, inspect
the state of the program heap at any given time, find places that do a lot of allocation,
analyze CPU profiles, etc. Profiling output is available in both textual and graphical
(directed graph) form. Usage of these tools is described later in this documentation.
Downloads
Compatibility
Installation
The compressed package can be downloaded from the official Downloads page on
GitHub as a .zip or .tar.gz file and extracted once the download completes.
On Linux, installation from the terminal can be done by executing the following
commands (on Debian).
NOTE: For a 64-bit system, it is recommended to install the latest version of
libunwind (if not already present) before trying to configure or install gperftools.
The following command can be used to install libunwind directly from the terminal.
CPU Profiling
A 3-step process is involved, from compiling the source file with proper options to
analyzing the profile. Considering a C source file named test.c whose CPU profile is to
be dumped to /tmp/test.prof, the following commands describe the steps. Read the
official documentation here.
Adding the -lprofiler option at link time installs the CPU profiler into the executable.
In the second step, the file name for the dumped profile is given, followed by the binary.
In the third step, google-pprof is given the binary followed by the file name of the
dumped profile, and opens in interactive mode.
For text output, enter text; for graphical output, enter gv. For a detailed list of
commands, use help. Enter quit to exit.
NOTE:
Heap Profiling
Heap profiling helps in figuring out what the program heap is at any given time and
finding places that do a lot of allocation. Similar to CPU profiling, a 3-step process is
involved, from compiling the source file with proper options to analyzing the profile.
Considering a C source file named test.c whose heap profile is to be dumped to
/tmp/test.hprof, the following commands describe the steps. Read the official
documentation here.
Adding the -ltcmalloc option at link time installs the heap profiler into the executable.
In the second step, the file name for the dumped profile is given, followed by the binary.
Note that after the second step, profiles are dumped periodically with file names such
as: /tmp/test.hprof.0001.heap, /tmp/test.hprof.0002.heap, ...
Any of these can be analyzed in the third step.
In the third step, google-pprof is given the binary followed by the file name of the
dumped profile. In the above example, 0001.heap is analyzed. google-pprof opens in
interactive mode.
For text output, enter text; for graphical output, enter gv. For a detailed list of
commands, use help. Enter quit to exit.
NOTE:
1. Here as well, recompilation is not necessary to dump profiles repeatedly.
2. With the default settings, at least 100 MB must be allocated before a heap profile
is generated.
Heap Checking
Heap checking is useful for detecting memory leaks. The first step is to install the heap
checker into the executable (similar to the above two cases). Considering a C source file
named test.c, the following commands describe the steps. Read the official
documentation here.
Adding the -ltcmalloc option at link time installs the heap checker into the executable.
Note that no profile is dumped in this case. In the second step, the heap checking mode
is given, followed by the executable.
The supported modes in the order of increasing strictness of memory leak checking are:
minimal
normal
strict
draconian
NOTE:
1. The heap checker records a stack trace for each allocation, which increases
memory usage and slows down the program.
2. It uses the heap profiler internally, and hence the heap checker and the heap
profiler cannot be run simultaneously.
Thread-Caching Malloc
This is a faster replacement for the default dynamic memory allocation and
deallocation routines in C/C++ (malloc, free, new, etc.). In a multi-threaded
program, each thread is assigned a thread-local cache from which small allocations
are satisfied.
Objects are moved from central data structures to the local caches as needed.
To use TCMalloc in any C/C++ code, it needs to be linked into the application
via the -ltcmalloc flag. As seen above, TCMalloc includes the heap checker and heap
profiler as well.
Testing
To get to know how these tools work, we ran small tests on some small C/C++ code
bases.
A simple program was written to find the transpose of a matrix. Memory was allocated
for two square matrices (input and result). The matrices need to be at least
3621 X 3621 for profiling results to be generated, because the minimum required
size is 100 MB (= 104,857,600 B), and two matrices of that size containing
4-byte integers occupy 3621 X 3621 X 4 X 2 bytes = 104,893,128 B.
For CPU Profiling, the text and GV output are given below.
There are two ways in which the time spent inside a function is calculated from the
number of profiling samples. One counts the time spent exclusively inside that
particular function (excluding the time spent in the functions called from it). The
other also includes the time spent in the functions called from within it.
The directed graph shows not only the number of profiling samples per function
but also the direction of function calls.
For heap profiling, any of the periodically dumped profiles can be analysed in text or gv
mode. For this test program, eight profiles are generated as mentioned below. The last
profile has zero bytes in use, as deallocation is done at the end before control leaves
main().
Since almost all of the allocation is done in main(), the following outputs are obtained
on analysing the first profile (0001).
For testing heap checking, a call to a function is made which just allocates some
memory for an integer and then returns without freeing it. This is caught by the heap
checker and the following information is displayed.
Note that along with the detection of the leak, a command is provided to get the stack
trace to further investigate the origin of the leak(s).
Criticisms
Although gperftools provides great tools for debugging and profiling, the following
points are worth noting.