
A Brief Review of Code Profiling Techniques

Hashmat Ali, Graduate Student
MSCS Student at Iqra University, Islamabad, Pakistan
Email: hashmat32726@iqraisb.edu.pk

Saadat Abbasi, Graduate Student
MSCS Student at Iqra University, Islamabad, Pakistan
Email: saadat31750@iqraisb.edu.pk

“This work was supported by Dr. Bilal Bashir (Senior Faculty Member at Iqra University).”

ABSTRACT In this paper, we present different profiling techniques that are used to overcome
service delays in large applications. As big data and cloud computing grow and data becomes
available in huge amounts, systems need high processing power to handle this data, which leads
to delay and latency in services. To tackle these issues, a system needs to keep track of memory
allocation and code optimization, and for this purpose profilers are used by developers and
systems to enhance the performance of large applications. They help developers concentrate on
the code regions that consume the most processor time. As far as interactive applications are
concerned, such profilers are convenient because their output does not depend on the response
time of a particular program. They are also meant to keep track of certain user-related events.
In this way they enable developers to highlight problems related to service latency.

KEYWORDS Profiler, Latency, Garbage Collector, Memory allocation, Object, Instrumentation

I. INTRODUCTION

High-level languages are more human readable and easier to code in, but their performance is
hard to predict: programmers may use the language improperly, which makes their programs run
slower than expected and puts more load on processors. [3] Also, profile-guided optimization
has huge potential to save cost in datacenters. Hardware-based monitoring units enable
profiling with a small amount of overhead and have proved effective, but these hardware
features are inflexible and also buggy, with limitations. Profiling based on instrumentation
can replace hardware through more flexible information gathering.
[4] Services such as credit-card fraud detection and targeted website advertisement rely on big
data. These services are latency sensitive and run on memory-managed runtimes such as the Java
Virtual Machine (JVM). However, these platforms suffer from long pauses and high latency due to
poor memory management.

II. Surveyed Techniques

[5] A latency profiling approach, with the help of a concrete implementation called LILA, is
presented and validated on programs ranging from small benchmarks to complex real-world
applications. Research conducted by various groups has shown that there is a strong connection
between delayed responses of interactive systems and the perceptions of their users. Results
showed that latency in system interaction causes a decline in the quality of people's work, as
their energy, concentration, and zeal for work are adversely affected by such delays, leaving
them angry and frustrated. So the main objective of this profiler is to make developers aware
of such complications in order to serve the user more effectively.
[2] A technique called feature-specific profiling (FSP) reports the cost of code with respect
to language constructs. It helps programmers find the costly features of a language. The
authors describe prototypes for two languages that work as performance debugging tools.
If a program runs slowly, programmers check the time cost of function calls and statements and
rank all program costs to identify the statements that take too long to execute, then replace
them with better statements to optimize the code. We call this actionable because it points to
redundant code that can be optimized with changes to the program. Programmers already have a
wide range of tools to improve the performance of a program. Profilers are the most used tool
to diagnose performance issues. Most profilers report execution time, space, and I/O requests,
and estimate cost on the basis of a static program unit, such as a function definition, or a
dynamic one, such as an HTTP request. Other profilers include the Vertical profiler, which
attempts to see through the use of high-level language features; Singer and Kirkham's profiler,
which assigns costs to programmer-annotated code regions; listener latency profiling, which
reports high-latency operations; and many more.
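The function-level cost ranking that most profilers report can be sketched with Python's standard `cProfile` and `pstats` modules. This is only an illustration of the general idea, not a tool from any of the surveyed papers; the functions `slow_sum`, `fast_sum`, `workload`, and `rank_functions` are hypothetical names introduced for the example.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical workload: builds an intermediate list before summing it.
    return sum([i * i for i in range(n)])


def fast_sum(n):
    # Same result, computed without the intermediate list.
    return sum(i * i for i in range(n))


def workload():
    # Hypothetical program whose function costs we want to rank.
    slow_sum(200_000)
    fast_sum(200_000)


def rank_functions(func, top=5):
    """Profile func and return a text report ranked by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort so that the most expensive call paths appear first.
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(rank_functions(workload))
```

The report lists each function with its call count, own time, and cumulative time, which is the per-function-definition cost view described above.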

VOLUME 1, 2020 1
FSP is a new technique that supplements cost reporting with language-specific features. This
gives a new perspective on program performance, which gives developers the ability to optimize
their programs. It is useful for programs whose features have non-local costs, and also for
languages that allow the creation of new features, such as C++ and R. Feature-specific
profiling can be implemented in any language that supports stack annotations and stack
inspection. To implement it in a language that does not support stack annotations, the language
must first be extended with them.

Implementing a feature-specific profiler takes a little more effort and more steps than
implementing a traditional profiler. Library authors or developers must add support for their
code. The implementation is simple and needs just a few lines of code. It cannot measure
language features that do not show up in stack annotations.

[4] Latency-sensitive platforms allocate objects with very different lifetimes next to each
other, which leads to severe memory fragmentation. This problem was identified previously, but
current memory management methods are not suited to large applications that hold massive
amounts of memory. Existing solutions can reduce pauses by different methods, but all of them
require programmer effort and knowledge, access to source code, or offline profiling to reduce
latency. Pauses in long-running applications are mainly caused by garbage collection
algorithms, which copy objects for promotion and demotion. To reduce these copies, objects with
different lifespans should be placed in different regions of memory, which also reduces
fragmentation. To overcome these issues, a new technique called the Runtime Object Lifetime
Profiler (ROLP) is introduced. It profiles the code of an application at runtime and helps
garbage collection algorithms place objects that have similar lifespans next to each other, to
reduce overall fragmentation and application latency.
[3] A technique called instant profiling was introduced which uses dynamic binary translation.
Instead of instrumenting the entire execution, instant profiling periodically interleaves
native execution and instrumented execution according to configurable length and frequency
parameters. It further reduces the latency of the initial profiling phases by pre-populating
the software code cache.
A powerful technique for runtime program examination and for collecting profile data is
Dynamic Binary Instrumentation. There are several dynamic binary instrumentation systems, which
share similar internal mechanisms. They intercept the execution of target programs, instrument
points of interest, put the instrumented code in their software code cache, and execute it from
the software code cache. Where and what to instrument is specified by users through custom
APIs.
The method operates by running instrumented profiling code from the software code cache for
only a limited profiling duration. The original binary runs natively, without any
instrumentation, during normal execution phases. Instant profiling also avoids possible latency
degradation during the initial profiling phases by pre-populating the code cache. Instant
profiling balances the tradeoff between information and overhead. This balance can be
controlled by two parameters. The first parameter, the length of profiling, determines how long
one step of profiling lasts. A longer profiling length provides more detail, but also higher
overhead. In addition, end users are likely to experience transient latency degradation while
profiling is in progress. Therefore, the length of profiling is restricted to a maximum of a
few milliseconds. The second parameter that affects profiling overhead is the frequency of
profiling.
[6] A profiling technique was introduced to understand the memory access behavior of programs
that make heavy use of recursive data structures. This technique is known as recursive data
structure (RDS) profiling. It aims at reducing memory access latencies in order to meet the
need for aggressive data structure optimization as the processor-memory performance gap
increases. The technique gives a more in-depth view of a program's memory accesses because it
captures aggregated information about an entire data structure instance, unlike other memory
profiling techniques, which only aggregate the behavior of individual loads and stores. It does
not require any high-level program representation for collecting an RDS profile, and does so
with some time overhead on intensive and other benchmark suites. In order to quantify the
potential of the RDS profiler, a metric has also been introduced that shows the stability of an
RDS instance. A stable RDS instance is attractive for data structure optimization because it
does not undergo notable changes from the beginning to the end. The technique proposed in that
paper is quite new and is also termed shape profiling. It can find the disjoint recursive data
structure instances in an execution without requiring any kind of information from the
programmer. It also helped to identify several characteristics of the RDSs in the benchmarks,
which other profiling techniques could not do.
[1] Most profiling tools quantify the average performance of a program. Such tools usually
depend on a program's control flow graph for organizing and reporting their results. On the
other hand, performance predictability is also considered a significant measure in interactive
server applications. In addition to that measure, it is important to keep in mind that the end
user is concerned with the performance of a semantically defined interval of execution,
especially in high-level applications that are based on event-based programming. With a
traditional profiler it is difficult to identify the functions that lie on the critical path of
a semantic interval. Moreover, conventional profilers are not able to report results for a
semantic interval. They also lack the ability to attribute performance variance to individual
functions.
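The gap between average performance and predictability can be made concrete with a small sketch. The latency values below are invented for illustration only and do not come from any of the surveyed papers; `summarize`, `steady`, and `bursty` are hypothetical names.

```python
import statistics

# Hypothetical per-request latencies in milliseconds (illustrative data only).
steady = [10, 10, 11, 9, 10, 10, 9, 11]
bursty = [2, 2, 3, 2, 2, 3, 2, 64]


def summarize(samples):
    """Return the mean, population standard deviation, and worst case."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.pstdev(samples),
        "worst": max(samples),
    }


if __name__ == "__main__":
    for name, samples in (("steady", steady), ("bursty", bursty)):
        print(name, summarize(samples))
```

Both series have the same mean, so a tool that reports only averages treats them as equivalent, while the spread and the worst case differ widely. That difference is what a variance-oriented view of latency is meant to expose.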

To address all these issues, a novel profiler termed VProfiler is proposed to point out the
main sources of latency variance in a semantic interval of any software system. It takes as
input the source code of a software system and programmer annotations that indicate the start
and end of the semantic intervals of interest. A new abstraction called the “variance tree” is
used by VProfiler to account for thread interleaving and to break down the variances and
covariances of the functions in the source code. From the end of the interval to its start, the
latency variance is accumulated along a backward path of dependence relationships among
threads, with minimum programming effort. In this way, considerable performance variance can be
reduced.

III. Discussion & Analysis

With great advancements in cloud computing and big data, systems have to process large amounts
of data and handle service requests. With these advancements an issue is also arising: latency
in code or delay in services due to bad code or bad memory management.
In Table 1 we show a basic comparison of the different profiling techniques that are available.

Column headings: References, Stack annotation, Language dependent, Dynamic, Runtime,
Performance overhead (Low / Medium / High)

[1] X X X ✓ ●
[2] X ✓ X ✓ ●
[3] ✓ X X ✓ ●
[4] ✓ X ✓ ✓ ●
[5] X ✓ ✓ X ●
[6] ✓ X ✓ ✓ ●
Table 1
(Synthesis Matrix of Discussed Techniques)

There are two major classes of profiling:
1. Software based profiling
2. Hardware based profiling

1. Software based profiling
Software-based profiling is the most widely used and common profiling technique, and it is
implemented in a programming language. There are three types of software-based profiling:
I. Insertion of Instrumentation Code, II. Sampling, III. Cycle-accurate simulation.
1.1 Insertion of Instrumentation Code: Instrumentation is done at the source code level, the
assembly code level, or the binary level. This method modifies the original code by adding code
relevant to the profiling. The modified code is then run on the target device or on the host
computer. The instrumentation code helps to collect profiling data. The execution time of the
various functions is recorded by software counters running on the host machine, with the
profiler sampling the target processor's PC at regular intervals during program execution.
Compared to simulation, this approach is considerably faster, since the executable runs on real
hardware. The best-known example of such a profiler is the GNU profiler (gprof). However, the
accuracy of this method is poor because the added instrumentation code introduces overhead.
1.2 Sampling: Software-based sampling is a technique that potentially reduces runtime overhead
compared to instrumentation-code-based software profiling. In this technique an interrupt is
raised at regular intervals, or a function is written that samples the contents of the program
counter and other essential registers of the processor. These samples are used to dynamically
evaluate the execution behavior of the program.
1.3 Simulation: Simulators provide accurate profiling information and are non-intrusive
compared to other software-based techniques. The main advantage of simulation is that the
designer can track data from the internal registers of the processor. Simulation is performed
on the host machine using an instruction set simulation (ISS) model of the target architecture.

2. Hardware based profiling
To overcome the limitations of software-based profiling techniques, researchers have explored
many hardware-based profiling methods, e.g. the JTAG interface, the Trace/Debug interface, and
the logic analyzer.
A logic analyzer does not suit current complex SoCs, which prohibit direct access to the
instruction bus of a processor. JTAG is useful for testing, authentication and debugging;
however, it is expensive for profiling an application because it incurs substantial runtime
overhead and affects the application's execution behavior. To profile the program code, the
traced data is processed on the host machine. However, processing data in real time requires a
host CPU that is more powerful than the system being profiled.

There are several methods and techniques used in profiling or used to create profilers.
I. Instrumentation: Instrumentation means instrumenting the code related to the dispatching of
events and the calls of listeners. For this purpose, all call sites are instrumented, and the
time consumed in handling such events outside the Java listeners is estimated. The inserted
instrumentation calls a static method of a class that is responsible for performing all
actions. Manual instrumentation of event dispatch, automatic instrumentation of listener calls,
automatic instrumentation of paint calls, and automatic instrumentation of native calls are the
activities carried out during instrumentation.
II. Execution: After the application has been instrumented and is run, it has to be ensured
that the profiler is thread safe so that it can keep track of information between the profile
notifications supported by Java. Listener

latency profiling basically aims at identifying three types of calls and at flagging the
listeners with high latency, measured in ms.
III. Dynamic Binary Instrumentation: This is a very powerful technique for introspection of a
running program and for collecting profile data for profile-guided optimization. There are many
dynamic binary instrumentation systems available. They intercept application execution, place
instrumented code in a software code cache, and execute this code from the cache. The main
benefit is that a complete picture of the program, including plugin libraries and dynamically
generated code, is available at runtime.
IV. Trace Analysis: Here a trace is converted into a cumulative latency distribution and a
listener latency profile. The latencies of all related invocations are shown here. The trace is
analyzed simply by subtracting the time stamps of the listener. This is a straightforward and
simple adaptation of traditional call profilers, showing the exclusive and inclusive time
spent.
Trace analysis runs in four perspectives.
1. Model phase injection
2. Cumulative latency distribution
3. Listener latency distribution
4. Listener latency profile
V. Java GUI toolkits: Swing and SWT are two powerful Java GUI toolkits, and developers often
face a dilemma about which one to prefer. In order to resolve this dilemma, and to make these
two GUI toolkits more useful, the work focuses on making a powerful latency profiling approach
available for both. So, this section aims at finding the differences between Swing and SWT that
may affect the said profiling approach in different ways. It is important to identify the
common aspects that the two may share. Apart from the commonalities, there are certain
differences that are important to consider.

Most of the discussed techniques are beneficial and have a great performance impact. They also
reduce programming effort, save system resources, and can handle dynamic workloads. However,
each of these techniques has some performance overhead, which needs to be reduced because
latency-intolerant systems cannot bear that overhead.

IV. Conclusion and Future work

We have discussed various profiling techniques in this paper. Latency and delay in services
are the biggest challenge in big data and cloud computing. All the techniques discussed in this
paper can help overcome this issue, but they also have limitations and need some improvement.
This paper can be helpful to scientists and researchers who want to find a suitable profiling
technique for their research applications. This survey may also open new directions for
research, as it discusses the profiling techniques in detail. Through this survey, an attempt
is made to discuss the profiling techniques addressed by different authors in the domain of
code profiling in recent years.
Special attention should be paid to designing profilers for embedded OS-based or multitasking
computer software. Another area that needs to be addressed is the creation of
neural-network-based libraries that can be trained to help increase a profiler's capacity.

V. References

[1] B. M. a. T. F. W. Jiamin Huang, "Statistical Analysis of Latency Through Semantic
Profiling," the Twelfth European Conference on Computer Systems, 2017.
[2] V. S.-A. J. V. a. M. F. Leif Andersen, "Feature-Specific Profiling," ACM Trans, 2019.
[3] T. M. R. H. D. B. a. S. M. H. K. Cho, "Instant Profiling: Instrumentation Sampling for
Profiling Datacenter Applications," IEEE/ACM International Symposium on Code Generation and
Optimization, 2013.
[4] D. P. J. S. L. V. a. P. F. Rodrigo Bruno, "Runtime Object Lifetime Profiler for Latency
Sensitive Big Data Applications," the Fourteenth EuroSys Conference.
[5] M. H. Milan Jovic, "Listener latency profiling: Measuring the perceptible performance of
interactive Java applications," Science of Computer Programming, 2009.
[6] E. R. a. D. I, "Recursive Data Structure Profiling," workshop on Memory system
performance.

