Arwa Alrawais
College of Computer Engineering and Sciences
Prince Sattam Bin Abdulaziz University
Al-Kharj 11942, Saudi Arabia.
Email: a.alrawais@psau.edu.sa
Abstract—The demand for processing power has grown over the years, and this demand led to the parallel approach: linking a set of computers together to jointly increase both speed and efficiency. The parallel approach plays a significant role in new generations of applications by moving the technology from expensive, specialized parallel supercomputers to linked sets of ordinary computers. Over the years, the parallel approach gave rise to parallel programming models, which exist above the hardware and memory architecture: collections of software technologies that connect parallel algorithms and applications with the underlying system. This paper describes the essential concepts of parallel programming and gives a brief overview of the different areas of parallel programming models and paradigms. Furthermore, it implements and evaluates OpenMP parallel programming and illustrates its effectiveness.

Keywords—Parallel programming models; Parallel programming paradigms; Parallel algorithm; Parallel applications

I. INTRODUCTION

In the 1980s, computer performance was improved by making the individual computer faster and adding more efficient processors. Over the years, the shift to parallel processing, which attempts to solve a computational problem with several processors at once, changed how performance is obtained. In the early 1990s, the transition from expensive, massively parallel supercomputers toward networks of computers grew sharply. As a consequence, high-performance computing became available on commodity hardware such as PCs, networks and workstations. Building a cluster, or network of computers, additionally became attractive to many people because of the cost-effectiveness of parallel processing. Alongside these hardware developments, there was progress in providing programming resources attuned to a diversity of computing environments. This development led to numerous parallel programming models which permit parallelism to be expressed and executed [1].

Parallel programming models can generally be assessed by the range of problems they can express and by how they can be executed on different architectures. When choosing a parallel programming model, how much parallelism can be exploited is an important consideration. This paper presents parallel programming models from various aspects. Furthermore, it explains the different phases of designing a parallel algorithm as well as the different parallel programming paradigms. Finally, some parallel programming applications, such as OpenMP, are presented.

There is a misconception among most programmers that writing a parallel program is hard; because of that, most programmers prefer writing a sequential program over writing a parallel one. "The parallel programming model exists as abstraction above hardware and memory architecture" [2]. A model attempts to express a parallel program and to exploit parallelism to solve a problem. The models differ from each other: each can express a different range of problems, and each can run on different architectures. There are two main approaches to parallel programming:

• Implicit parallelism: the programmer does not specify the parallelism and therefore cannot control computation scheduling or data placement.

• Explicit parallelism: the parallelism is explicitly specified in the program code by the programmer using some tools, such as library calls. This approach permits the user to assess how much parallelism can be exploited. The efficiency obtained with explicit parallelism is better than that obtained with implicit parallelism [1].

The rest of this paper is organized as follows. Section II summarizes related work on parallel models and paradigms. Section III introduces several programming models and tools. Section IV discusses parallel algorithm design, Section V the parallel programming paradigms, and Section VI compares parallel programming systems. An implementation and evaluation of OpenMP is conducted in Section VII. At last, a conclusion is drawn in Section VIII.

II. RELATED WORK

Over the past few decades, several parallel programming models and paradigms have been researched. In 1993, Giloi [3] divided the parallel programming domain into memory-sharing models and message-passing paradigms. He indicated the advantages and disadvantages of each parallel programming model as well as its appropriate architecture. At that time, the attention of many researchers and developers turned toward parallel programming models, and as a result many conferences and workshops were held: for example, the Third Working Conference on Massively Parallel Programming Models and the Third International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS'98) took place in 1997 and 1998. Consequently, the steady improvement over the years in the parallel programming research area led
to the development and appearance of numerous parallel programming models and paradigms.

Over the years, many new parallel programming models have appeared. In 2009, MSI (Multi-thread Schedule Interface), a new parallel programming model, was proposed to utilize the power of multi-core processors and to conquer the load-balancing problem [4]. Furthermore, parallel programming research has attempted to optimize the level of parallelism. For instance, optimistic parallelism based on speculative asynchronous message passing, introduced in 2010, exploited more parallelism and reduced the runtime overhead of asynchrony by speculatively executing message passing in object-oriented programs [5]. The expansion of parallel programming models has also allowed them to be applied in several domains, such as design patterns: [6] presents a design-pattern-based parallel programming model, DPBPPM, together with a system implementation on an SMP platform, which has been shown to be flexible and efficient. There is also a sustained and appreciated effort in high-performance computing research on advanced parallel programming models. For instance, the Center for Programming Models for Scalable Parallel Computing, led by Argonne National Laboratory together with several universities, concentrates on programming models for scalable parallel computing and works on developing the current and future models.

III. PARALLEL PROGRAMMING MODELS AND TOOLS

This section briefly describes the different parallel programming models and approaches, including parallelizing compilers, parallel languages, High Performance FORTRAN, message passing, virtual shared memory, data parallel, programming skeletons and the Partitioned Global Address Space.

A. Parallelizing Compilers

A parallelizing compiler is a type of source-to-source compiler whose input and output are both in a high-level language. It transforms the program into a parallel version after analyzing the sequential program and detecting which parts of it can be parallelized. The output language is usually the same as the input language. The most common language used with parallelizing compilers is FORTRAN, for two significant reasons. First, many scientific computing algorithms are written in FORTRAN. Second, analyzing and transforming the program is easier because FORTRAN has properties, such as its static memory model, that make parallelizing procedures easy [7]. Parallelizing compilers work well for some applications on shared-memory multiprocessors. They have limitations, however, for applications on distributed-memory machines, due to the non-uniform memory access time [1].

B. Parallel Languages

Most people are not willing to use languages they are not familiar with; they would rather keep using their traditional languages, such as C and FORTRAN. Furthermore, the perception that parallel programming languages are hard makes programmers prefer sequential languages over parallel ones.

C. High Performance FORTRAN

In 1993, the first High Performance FORTRAN version was published to enable effective use of the language on different parallel systems. High Performance FORTRAN is an extension of Fortran 90; it supports distributing the computation on a single array over multiple processors using the data parallel model. It performs well on applications that follow SIMD (single instruction stream, multiple data streams) and MIMD (multiple instruction streams, multiple data streams) architectures. On the other hand, High Performance FORTRAN has difficulties with applications that follow an asynchronous structure.

D. Message Passing

Message passing libraries are designed to be a standard for distributed-memory systems; they perform sending, receiving and the other operations of message passing. Message passing offers natural synchronization among processes. One of its benefits appears in debugging, because message passing does not permit overwriting memory, and even when overwriting happens it can be detected more easily than in shared memory [8]. Message passing works efficiently and naturally on distributed-memory systems. The most common message passing systems are the Message Passing Interface and the Parallel Virtual Machine.

1) Message Passing Interface: In the Message Passing Interface (MPI), each processor has access through its CPU to its local memory, and processes communicate with each other by sending and receiving messages [9]. Transferring data between processors requires cooperative operations that match each sending side with its receiving side. MPI provides communication among concurrent processes, including point-to-point and collective communication. Moreover, MPI implementations target parallel machines, shared-memory machines, distributed-memory multiprocessors, workstation clusters and heterogeneous networks. MPI is used by developers and users when the parallel system relies on message passing, because of its standardization, portability, availability and performance; additionally, it is supported by many HPC platforms.

2) Parallel Virtual Machine: The Parallel Virtual Machine (PVM) was developed and PVM3 was completed in 1993 [10]. Essentially, it is software that allows a heterogeneous collection of UNIX and/or Windows computers to be used as a single distributed parallel processor. The PVM system consists of two main parts: the PVM3 daemon, which resides with the virtual machine on each computer, and a library of PVM interface routines that includes a collection of tools to help the coordination among tasks. The PVM model provides the user with a heterogeneous environment with the following features [10]:

• Explicit message-passing model: each task in a multitask set performs its own part of the computation and communicates by explicit messages.
• Heterogeneity support: heterogeneity is supported by PVM at the level of networks, applications, and machines.

• Multiprocessor support: PVM attempts to exploit the underlying hardware by using multiprocessor-specific methods.

• Process-based communication: the unit of parallelism in PVM is a task that alternates between computation and communication, and each task is identified by a unique integer across the whole system.

• User-configured host pool: the set of machines, which may include single-processor and multiprocessor computers, can be adjusted, with hosts added or deleted during operation.

The PVM computing model is based on the tasks that make up an application, where each task performs part of the computation. The functionality of these tasks varies: some need to run in parallel and others need to synchronize, and any task can be started or stopped, or added or deleted, during the execution phase.

E. Virtual Shared Memory

Virtual shared memory differs from the traditional shared-memory concept, in which the memory is physically shared among the processors. In virtual shared memory, all the processors of a distributed-memory machine behave as if they all shared a single memory. The objective of designing virtual shared memory is to reduce the overhead required for communication and coherent access. Virtual shared memory can be implemented at any level of the computer hierarchy, as shown in Figure 1 [11].

F. Data Parallel

Most of the work in the data parallel model concentrates on applying operations to structured data. A group of tasks operates on each structure, and each task operates on a distinct partition of the data structure. The data parallel model can be used on either SPMD multicomputers or SIMD computers. On SIMD computers, all the processors execute in lockstep, and the model's synchronization can be performed at compile time. Using the data parallel model on a SIMD computer makes writing and debugging the code simple, because the parallelism is handled explicitly by the flow control and the synchronization hardware. In general, the behavior of a data parallel program is easy to visualize, and it has a natural load balance [12]. Examples of data parallel applications are image processing, the N-body problem, and matrix operations.

G. Programming Skeletons

A programming skeleton is a collection of high-level abstractions by which most of the parallel paradigms can be supported. Programs built on the same skeleton share the same control structure while solving different problems. A skeleton captures useful data and communication patterns as an abstraction of the form of the program. A specific parallel programming paradigm corresponds to a skeleton whose single abstraction encapsulates both the communication patterns and the control, and thereby identifies the paradigm. Skeletons are implemented on top of shared memory or distributed memory, message passing, and object-oriented mechanisms. As a result, a skeleton can be considered a general program structure that supports parallelism [1].
H. Partitioned Global Address Space

In the Partitioned Global Address Space (PGAS) model, each process has both a private and a shared memory. For instance, if a processor has local data, the private memory is used for it; similarly, if the processor has global data, the shared memory is used for it. With a single address space, a process can directly access any global data. Three PGAS programming languages are widely used: Co-Array FORTRAN (CAF), Unified Parallel C (UPC), and Titanium [13].

IV. PARALLEL ALGORITHM DESIGN

There are various ways of designing and building a parallel program. The design methodology illustrated here was proposed by Ian Foster; it permits the programmer to concentrate on machine-independent issues, such as concurrency, in the early stages, while machine-specific aspects of the design are left until the end of the design process [1]. The methodology consists of four main stages: partitioning, communication, agglomeration and mapping. The partitioning and communication stages aim at developing scalable and concurrent algorithms, while the agglomeration and mapping stages concentrate more on locality and other performance issues, as shown in Figure 2.

Fig. 2: Parallel Algorithm Design Stages.

A. Partitioning

The computation and the data are decomposed into small tasks. Decomposing the data is referred to as domain (or data) decomposition, and decomposing the computation into tasks is functional decomposition. In the partitioning step, practical issues such as the number of processors of the target computer are ignored.

B. Communication

Communication specifies the coordination required between the tasks obtained from the partitioning phase, and the communication pattern is determined in this phase. Communication patterns vary along four dimensions: static or dynamic, structured or unstructured, local or global, and synchronous or asynchronous [1], [14].

C. Agglomeration

The tasks and the communication structure produced during the previous phases are evaluated, and in order to improve performance or reduce cost, tasks are grouped into larger tasks.

D. Mapping

Mapping assigns each task to a processor, in order to maximize processor utilization and minimize communication cost. Mapping can be performed at compile time (statically) or at run time (dynamically) using load-balancing methods.

V. PARALLEL PROGRAMMING PARADIGMS

When different parallel applications are classified into programming paradigms, a few paradigms recur across many applications. The parallel programming paradigms comprise the different algorithms that are built on them. Selecting the appropriate paradigm for developing a parallel application depends on the algorithms involved, on the availability of resources for parallel computing, and on the kind of parallelism inherent in the problem; together these determine which paradigm is required.

A. Choice of Paradigms

The authors who have classified parallel programs into different classes do not all take precisely the same approach. Nevertheless, the following paradigms are the most popular ones used in parallel programming: Task Farming, Single Program Multiple Data, Data Pipelining, Divide and Conquer, Speculative Parallelism, and Hybrid models [15].

B. Task Farming

The Task Farming paradigm involves two kinds of processes. First, the master process decomposes the problem into several tasks, and then distributes these tasks into
a farm of slave processes; after the slave processes finish, the master assembles the partial results in order to compute the final result. Second, each of the multiple slave processes executes a small cycle: receiving a message with a task, processing the task, and sending the result back to the master. Communication in Task Farming usually occurs only between the master process and the slave processes [15].

This paradigm can use either static or dynamic load balancing. With static load balancing, the distribution of tasks happens in the first stage; having distributed the tasks, the master can itself contribute to the computation after reserving a slave's share. Dynamic load balancing is more appropriate when the number of tasks is larger than the number of processors; its significant feature is that the application can adapt to changes in the system and to the resources the system makes available. In general, the Task Farming paradigm can deliver high scalability and a high level of speedup [1].

C. Single Program Multiple Data

The Single Program Multiple Data (SPMD) paradigm divides the data among the processors, and each processor runs the same basic algorithm on its part of the data; the partial results are gathered at the end. This kind of parallelism is also called geometric parallelism, domain decomposition, or data parallelism. SPMD is the most widespread paradigm in use. It performs efficiently if the data is distributed equally among the processes and the system is homogeneous [1], [15].

D. Data Pipelining

Data pipelining is one of the most common decomposition-based processing paradigms, owing to its simplicity and robustness. It identifies the parallel tasks of the algorithm, and each processor runs one part of the algorithm. The data flows through the pipeline from one processor, which corresponds to one stage of the pipeline, to the next, until the last processor provides the output. Each process executes a fraction of the algorithm, and the processes communicate with each other through the flowing data. The data pipeline paradigm is typically used in image processing and data reduction applications [1], [15].

E. Divide and Conquer

Divide and conquer is a top-down approach. The solution of the main problem, which sits at the top level, is obtained by combining the solutions of the subproblems at the lower levels. The method divides the main problem into simpler subproblems, and sometimes decomposes these subproblems into still simpler ones where possible, until each subproblem has a simple solution. Then it gathers all the subproblems' results to obtain the solution of the main problem. Given sufficient parallelism, the subproblems can be solved concurrently.

The structure of a divide and conquer algorithm can be seen as a tree in which the main problem is the root and the subproblems are the nodes. Dividing the problem and combining the subproblems to form the main solution exposes some parallelism, and the processors require only a few communication operations: no communication is needed between the subprocesses, because the subproblems are independent [1]. Basically, divide and conquer is performed in three main steps: split, compute and join.

F. Speculative Parallelism

Speculative Parallelism is used when parallelism is hard to obtain through any of the previous paradigms. Some problems cannot otherwise achieve parallelism because of their complex data dependencies. Speculative parallelism uses optimistic operations to obtain parallelism, executing the problem in small fractions. Cases where Speculative Parallelism is used include discrete-event simulation (an asynchronous problem) and running different algorithms for the same problem, where the first one to finish provides the final solution [1], [15].

G. Hybrid Models

Hybrid models, also called mixed-mode programming, consist of more than one paradigm; they are used when a mix of elements of different paradigms is required. An example of a hybrid model is OpenMP with MPI, where message passing with MPI is used for communication between nodes, and OpenMP is used to control the threads within each single node.

VI. PARALLEL PROGRAMMING SYSTEMS

There are many existing parallel programming systems. This section discusses and compares widely known systems, including OpenMP and Threading Building Blocks, in terms of several aspects.

A. OpenMP

OpenMP, which stands for Open Multi-Processing, was developed through the cooperative efforts of several software developers and manufacturing companies, such as Oracle, IBM, Intel, and Hewlett-Packard. It is an Application Program Interface (API) that supports multi-platform shared-memory parallel programming in C, C++ and FORTRAN [16]. OpenMP implementations exist for most architectures supporting those languages, including UNIX and Windows NT platforms. OpenMP provides shared-memory parallel programmers a simple interface for developing parallel applications, owing to its portability and scalability. Portability is a significant characteristic of OpenMP: any compiler that supports OpenMP can be used with any parallel application developed with OpenMP. To attain parallel performance, the compiled binary must be executed on a parallel hardware platform [17].

The following OpenMP example is written in C++ and FORTRAN [17]. In the example, the loop that adds the Y array to the X array is executed in parallel.

Another OpenMP example: this example demonstrates the use of a Wordcount program with OpenMP. Wordcount is a program that counts the number of occurrences of each unique word in the files it reads. After the program has read all of the files, it shows the total number of unique words and the most frequent word in the files. In
the file, each line can be processed independently, so the lines can be split among several processors; this provides sufficient parallelism to speed up the word counting. The processes' results are then united to deliver the final result.

The input to the program is a file which could be quite large; because of that, the file is parsed in parallel. To do so, the program needs to create an object of each of the following classes:
VII. IMPLEMENTATION AND EVALUATION

The evaluated algorithm computes the product of two input matrices. The algorithm is carried out by three main nested loops, where two loops perform the initialization and the other one the multiplication. The matrix multiplication algorithm was tested on a dual-core Intel Core(R) N4000 CPU at 1.10 GHz with 4 GB of RAM. We evaluate the OpenMP parallelism model in terms of speedup and efficiency. As shown in Figure 5, the execution time decreases as the number of threads grows. Similarly, the efficiency of the model rises with an increasing number of threads, as shown in Figure 6.

Fig. 6: OpenMP Efficiency.

The closest model to OpenMP is TBB, so a comparison between the two models has been made. To compare OpenMP and TBB on the criteria of interest, we summarize the most significant differences between the two models. As illustrated in Table I, the offloading feature is the ability to divide parallel work between the host and a device, which mainly supports accelerator-based systems. OpenMP supports offloading between the host and a target device, while TBB supports execution only on the host side. Another feature is static scheduling, by which the execution order of threads can be controlled; it is supported by OpenMP but missing in TBB. OpenMP does not support nested and complex parallel patterns as TBB does. In addition, OpenMP provides memory-hierarchy constructs that let programmers specify memory locations. Mutual exclusion for protecting data accesses in a parallel program is supported by OpenMP, while TBB provides similar data-access locks called mutexes. Finally, TBB does not require any special language or compiler, unlike OpenMP, which uses program directives interpreted by the compiler [18], [19].

VIII. CONCLUSIONS

In summary, by using different parallel programming models, programmers are able to express the parallelism in their programs while concurrently exploiting the capabilities of the underlying hardware architecture. Each parallel programming model has a different way of exploiting parallelism, and the parallel programming models research area continues to show advances. We believe that within a few years the performance of parallel programming, including the current and new models, will be ten times what it is today. This paper briefly describes diverse parallel programming models, including parallelizing compilers, parallel languages, High Performance FORTRAN, message passing, virtual shared memory, data parallel, programming skeletons and the Partitioned Global Address Space. Furthermore, it explains the most common paradigms: Task Farming, Single Program Multiple Data, Data Pipelining, Divide and Conquer, Speculative Parallelism, and Hybrid models. Finally, it discusses parallel algorithm design and illustrates some examples of parallel programming models.

REFERENCES

[1] R. Buyya et al., "High performance cluster computing: Architectures and systems (volume 1)," Prentice Hall, Upper Saddle River, NJ, USA, vol. 1, no. 999, p. 29, 1999.
[2] B. Barney et al., "Introduction to parallel computing," Lawrence Livermore National Laboratory, vol. 6, no. 13, p. 10, 2010.
[3] W. Giloi, "Parallel programming models and their interdependence with parallel architectures," in Proceedings of Workshop on Programming Models for Massively Parallel Computers. IEEE, 1993, pp. 2–11.
[4] J. Peng, C. Hu, and J. Xi, "MSI: a new parallel programming model," in 2009 WRI World Congress on Software Engineering, vol. 1. IEEE, 2009, pp. 56–60.
[5] Y. Du, Y. Zhao, B. Han, and Y. Li, "Optimistic parallelism based on speculative asynchronous messages passing," in International Symposium on Parallel and Distributed Processing with Applications. IEEE, 2010, pp. 382–391.
[6] H. Wu, "Design-pattern based parallel programming model and system implementation," in 2008 4th International Conference on Wireless Communications, Networking and Mobile Computing. IEEE, 2008, pp. 1–5.
[7] F. Plavec, "Dependence testing for parallelizing compilers," 2003.
[8] Intel, "oneAPI Threading Building Blocks," https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html.
[9] M. H. P. C. C. SP Parallel Programming Workshop, "Message passing interface," http://www.mhpcc.edu/training/workshop20/mpi/MAIN.html.
[10] S. H. Roosta, Parallel Processing and Parallel Algorithms: Theory and Computation. Springer Science & Business Media, 2012.
[11] A. Chalmers, E. Reinhard, and T. Davis, Practical Parallel Rendering. CRC Press, 2002.
[12] M. Parmar, "Data parallel model and object oriented model," http://www.gazhoo.com/doc/201006110254229749/DATA-PARALLEL+MODEL, 2009.
[13] R. Galal, "Partitioned global address space (PGAS)," https://www.mohamedfahmed.wordpress.com/2010/05/06/partitioned-global-address-space-pgas/, 2010.
[14] I. Foster, "Designing parallel algorithms," https://www.mcs.anl.gov/~itf/dbpp/text/node14.html, 1995.
[15] K. Pedersen, "Scientific applications in distributed systems," http://www.idi.ntnu.no/grupper/su/sif8094-reports/2001/p7.pdf, 2001.
[16] "The OpenMP API specification for parallel programming," https://www.openmp.org/.
[17] Oracle, "Developing parallel programs - a discussion of popular models," https://www.oracle.com/technetwork/server-storage/solarisstudio/documentation/oss-parallel-programs-170709.pdf, 2016.
[18] E. Ajkunic, H. Fatkic, E. Omerovic, K. Talic, and N. Nosovic, "A comparison of five parallel programming models for C++," in 2012 Proceedings of the 35th International Convention MIPRO. IEEE, 2012, pp. 1780–1784.
[19] S. Salehian, J. Liu, and Y. Yan, "Comparison of threading programming models," in 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2017, pp. 766–774.