
Sample 1

The existing performance-optimization models used in compilers are limited in their ability to identify profitable parallelism. Effective model-based heuristics and profitability estimates are needed to distinguish among candidate optimizations. The main focus of one automatic framework is empirical search over the set of valid code-motion possibilities, combined with model-based mechanisms to perform tiling, vectorization and parallelization on the transformed program [1].
There are also approaches to the automatic parallelization of programs, written in Java, that use pointer-based dynamic data structures. The approach exploits parallelism among methods by creating an asynchronous thread of execution for each method invocation in a program [2].
A comparative study of prevailing tools showed that PLUTO is more efficient than the other
tools. Even though CETUS was efficient in terms of dependence analysis and parallel-loop detection, it
produced erroneous results when detecting and parallelizing nested loops. GASPARD showed the limits of
model-to-source parallelization compared to source-to-source parallelizers, and is thus not flexible or
applicable to all scenarios, although it gave tolerable results for the MM workload. One common limitation
of these auto-parallelization tools is that they generate parallel OpenMP code, which depends on the OpenMP
API, compiler and OS run-time support to realize task partitioning. However, such support is rarely available
in an embedded context, where an OS is not always present [3]. For future work, an automatic accelerator
generation flow that integrates PLUTO and adapts an application targeting a general-purpose processor
to an embedded environment seems much more favorable [4].

Parallel computing has developed steadily to achieve and improve the benefits of high-performance
computing. On the hardware side, different multiprocessor designs have been introduced to advance
parallel computing. The future of parallel computing is hard to predict, since many research areas
remain active.
Since machine architectures are complex, efficient programming has become difficult.
Some decisions are difficult or impossible to make at compile time. For example, to determine data
dependences exactly, the values of certain variables must be known. To decide which of two
nested parallel loops is better to move to the outermost position, the number of iterations of each loop is
usually needed [5].
The effectiveness of traditional compilers is documented in papers that describe many
traditional techniques such as common subexpression elimination, code motion and dead code
elimination [6].

References
[1] Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, Jagannathan Ramanujam and
Ponnuswamy Sadayappan, "Combined Iterative and Model-driven Optimization in an Automatic
Parallelization Framework," Conference on Supercomputing (SC'10), New Orleans, LA, United States, 2010.
[2] Bryan Chan, "Run-Time Support for the Automatic Parallelization of Java Programs," M.S. thesis,
Dept. of Electrical and Computer Engineering, University of Toronto, 2002.
[3] G. Tian and O. Hammami, "Performance measurements of synchronization mechanisms on 16PE
NOC based multi-core with dedicated synchronization and data NOC," International Conference
on Electronics, Circuits, and Systems (ICECS'09), 2009, pp. 988-991.
[4] Emna Kallel, Yassine Aoudni and Mohamed Abid, "OpenMP automatic parallelization tools:
An empirical comparative evaluation," IJCSI International Journal of Computer Science Issues, 2013.
[5] Rudolf Eigenmann and David Padua, "On the Automatic Program Parallelization," 1993.
[6] N. Jones and S. Muchnick, "Flow analysis and optimization of lisp-like structures," in Program Flow
Analysis: Theory and Applications, chapter 4, pp. 102-131, Prentice-Hall, Englewood Cliffs, N.J., 1981.
