3 authors:
Sheeraz Memon
Mehran University of Engineering and Technology
All content following this page was uploaded by Dileep Kumar on 17 March 2019.
Abstract
High Performance Computing (HPC) evolved in response to a growing demand for computational power and fast computation, and it is now used in many fields. In engineering and science, solving large scientific problems requires enormous execution time, while viable hardware-based supercomputers are very costly to build and maintain. An alternative technique has emerged: building parallel systems that provide supercomputing functionality from inexpensive old PCs, fast local area networking, and the free and open-source Linux operating system and software. These parallel systems are called High Performance Computing clusters or Beowulf clusters. Multiple PCs are interconnected to combine the computational power of several machines (nodes) and reduce execution time. Managing such clusters is a difficult task; a variety of tools is available, but proper selection and integration is always challenging. This paper presents an HPC cluster approach on a Linux platform (Ubuntu), describes the steps necessary to create a cluster, and provides an implementation of an MPI (Message Passing Interface) based HPC cluster. Finally, the performance of the cluster environment is tested by comparing execution times and speedup on the cluster with different numbers of processes.
I. Introduction
High Performance Computing is a field of computer science in which supercomputers are used to solve challenging scientific problems. These problems are highly compute-intensive: they are first identified and then modeled as mathematical expressions, normally differential and integral equations. Such expressions cannot be executed directly on computers; they must first be converted into parallel programs and then executed on a large number of high-performance computers to reduce the computational time [A]. The solution can be a huge amount of numerical data, or a representation of that data as an image or animation produced with visualization techniques, which helps in understanding these scientific problems.
Academic Journal of Management Sciences 6(1) 3 – 12
The problem is that a great amount of money and effort must be spent to build very fast, specialized high-performance supercomputers. These supercomputers are often optimized for specific problems in a particular field, are expensive, and have unique operating systems and application software that require trained staff.
A solution to the problem of obtaining faster computers is to create a low-cost High Performance Computing (HPC) cluster from old commodity hardware combined with a free, open-source operating system and software. Such a cluster provides functionality similar to a commercial high-performance supercomputer and can be applied in several fields to solve advanced computational problems. Free and open-source operating systems and software such as Linux and MPI have driven the marginal cost down to nearly zero. HPC clusters provide an inexpensive parallel computing system that aggregates the computing power of multiple computers and dramatically reduces the time needed to solve problems that require the analysis of large amounts of data.
IV. Objectives
V. Related Work
In our review we found several projects under the terms High Performance Computing and Cluster Computing that are related to ours.
In [H], the authors used commodity hardware to implement a low-cost Beowulf cluster that can be used for teaching at undergraduate and postgraduate levels.
In [I], the authors showed how to install and configure an HPC cluster on a Linux distribution and described the structure of the cluster.
In [J], the authors showed how to assemble a diskless cluster but did not include any experiments with parallel programming.
In [K], the authors proposed an HPC approach on Ubuntu Linux using a parallel programming environment with multiple nodes for larger computations. They described a method for installing the cluster environment using PXE (Preboot Execution Environment), which requires the installation and configuration of the DHCP and TFTP protocols.
In [L], the authors implemented a clustering environment to solve large mathematical operations, such as matrix multiplication and pi calculation, more quickly. Users can access any node of the cluster and use it separately as a local personal computer.
To design the Linux HPC cluster we need a set of cluster nodes networked together to share computing resources. The proposed HPC cluster is deployed in a star network topology and consists of:
A master node
Two slave nodes
Fast Ethernet 100 Mbps switch
PTCL DSL modem as a Gateway Device
Cables, and other networking hardware
B. Cluster Working
The user interacts with the master node and submits a task (job) to it. The master node is the only controlling node; it keeps track of the number of nodes in the cluster, including itself. After the user submits a task, the master node divides it among the nodes, including itself, so the master node also takes part in the computation. After each node finishes computing its part of the task, the results are unified and returned.
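This divide/compute/unify flow can be sketched in plain Python (an illustrative stand-in for the MPI message passing, not the paper's actual code; all names here are hypothetical):

```python
def split(job, n_nodes):
    """Divide a list of work items into n_nodes round-robin chunks."""
    return [job[i::n_nodes] for i in range(n_nodes)]

def node_compute(chunk):
    """Each node's partial result: here, simply the sum of its items."""
    return sum(chunk)

job = list(range(1, 101))                     # example task: sum 1..100
chunks = split(job, 3)                        # master node + two slave nodes
partials = [node_compute(c) for c in chunks]  # on the cluster these run in parallel
result = sum(partials)                        # master unifies the partial results
assert result == 5050
```

In the real cluster the list comprehension over `chunks` is replaced by message passing: the master scatters the chunks to the nodes and gathers their partial results back.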
The assigned task is divided into a number of processes, and each process is associated with a unique process ID within the communicator. A set of communicating processes is called a communicator; MPI_COMM_WORLD is the default communicator. The number of processes is called the "size" of the communicator, and the process ID is called the "rank". Processes run on nodes and communicate by exchanging messages. The proposed HPC cluster uses 3 nodes, each with a Core 2 Duo processor (2 CPU cores). A single process runs on a single core, so 2 processes are assigned to each node and the cluster executes 6 processes at a time. If there are more than 6 processes, the remaining ones are assigned to the cores that finish their previous tasks first, on a First Come First Serve (FCFS) basis.
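One common way to map work items to ranks (cyclic decomposition, a standard choice in MPI programs, though the paper does not state its exact scheme) can be illustrated in plain Python:

```python
def items_for_rank(rank, size, n_items):
    """Cyclic decomposition: rank r of `size` ranks handles items r, r+size, r+2*size, ..."""
    return list(range(rank, n_items, size))

# 6 processes (3 dual-core nodes) dividing 10 work items:
parts = [items_for_rank(r, 6, 10) for r in range(6)]
# ranks 0-3 each get two items (e.g. rank 0 -> [0, 6]), ranks 4 and 5 get one.
covered = sorted(i for p in parts for i in p)
assert covered == list(range(10))  # every item handled exactly once, no overlap
```

Each rank would obtain its `rank` and `size` from MPI_Comm_rank and MPI_Comm_size on MPI_COMM_WORLD and then process only its own items.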
D. Cluster Implementation
The main steps of implementing the HPC cluster are shown in "Figure 7".
E. Hardware Configuration
The nodes are of the same architecture, with slightly different CPU and memory specifications.
Table 1: Hardware configuration
We evaluated the performance of the cluster by implementing a parallel program that calculates the value of "pi". Performance is measured in terms of execution time and speedup.
The execution time of a code is the time the system spends executing that code. The speedup of a parallelized code is defined as:

Speedup = serial execution time (Ts) / parallel execution time (Tp)
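For example, applying this definition to hypothetical timings (illustrative numbers, not the paper's measurements):

```python
def speedup(t_serial, t_parallel):
    """Speedup = Ts / Tp."""
    return t_serial / t_parallel

# a job taking 12.0 s serially and 2.4 s on 6 processes
s = speedup(12.0, 2.4)
assert abs(s - 5.0) < 1e-9   # speedup of 5x on 6 cores (efficiency about 83%)
```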
A. Pi Calculation
The problem is to calculate the value of "pi". To solve it, a mathematical model is needed, which is then converted into computer code. The mathematical constant "pi" is the ratio of a circle's circumference to its diameter, with an approximate value of 3.14159. Its value can be calculated from the integral:

∫₀¹ 4 / (1 + x²) dx = π

We compare the calculated value of "pi" with the true value to determine the accuracy of the output; the time taken by the program is also displayed.
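A serial Python sketch of this computation (midpoint-rule numerical integration, a standard kernel for MPI pi programs; the function name and step count are illustrative, not taken from the paper's code):

```python
import math

def compute_pi(n_steps):
    """Approximate pi as the integral from 0 to 1 of 4/(1+x^2) dx, midpoint rule."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h            # midpoint of the i-th subinterval
        total += 4.0 / (1.0 + x * x)
    return h * total

pi_est = compute_pi(100_000)
assert abs(pi_est - math.pi) < 1e-8  # accurate to well under 1e-8 at this step count
```

In the MPI version, rank r would sum only the subintervals i = r, r + size, r + 2·size, ..., and the partial sums would be combined (e.g. with MPI_Reduce) on the master; the loop above is the same kernel run serially.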
VIII. Conclusion
The results show that the HPC cluster functions as expected and that we are able to combine the computational power of the different nodes. We observed that when the task is executed serially, with only one process, it takes much longer than in the other configurations. When we divide the task into a total of 6 processes, as shown in "Figure 8" and "Table 2", the cluster shows its best performance in terms of execution time and speedup. This is because we have a total of 3 nodes, each with a dual-core Core 2 Duo processor, giving 6 cores in the cluster. The cluster assigns 6 processes at a time to the 6 cores; if we create 10 processes, the remaining 4 are assigned to cores only after they finish their previous tasks, on a First Come First Serve (FCFS) basis. We observed variations in the results when more than 6 processes are generated. For best results, the task should be divided into a number of processes equal to the number of cores available in the cluster.
We also observed that when any machine stops working for any reason, the whole cluster fails and cannot perform any job submitted to it. Administering a large HPC cluster is a challenging job; the failure of one or a few machines should not cause the whole cluster to fail. We intend to resolve this issue so that the cluster can continue its job even if a machine within it stops working, which will significantly improve the cluster's reliability. We used Fast Ethernet (100 Mbps) for back-end communication, which can be upgraded to Gigabit Ethernet.
X. References
[A] Vecchiola, C., Pandey, S. and Buyya, R. (2009), "High-Performance Cloud Computing: A View of Scientific Applications", ISPAN '09: Proceedings of the 2009 10th International Symposium on Pervasive Systems, Algorithms and Networks, held on December 14, 2009, pp. 4-16.
[B] Gupta, A. and Milojicic, D. (2011), "Evaluation of HPC Applications on Cloud", OCS '11: Proceedings of the 2011 Sixth Open Cirrus Summit, held on October 12, 2011, pp. 22-26.
[C] Kaur E.R. (2015), “A Review of Computing Technologies: Distributed, Utility, Cluster, Grid and
Cloud Computing”, International Journal of Advanced Research in Computer Science and Software
Engineering (IJARCSSE) , ISSN: 2277 128X, Vol. 5 , Issue 2, pp. 144-148.
[D] Kahanwal, B. and Singh, T.P. (2012), "The Distributed Computing Paradigms: P2P, Grid, Cluster, Cloud, and Jungle", International Journal of Latest Research in Science and Technology, ISSN: 2278-5299, Vol. 1, Issue 2, pp. 183-187.
[E] Ngxande, M. and Moorosi, N. (2014), “Development of Beowulf Cluster to Perform Large Datasets
Simulations in Educational Institutions”, International Journal of Computer Applications, Vol. 99 (15),
pp. 29-35.
[F] Adams, J. and Vos, D. (2002), “Small College Supercomputing: Building a Beowulf Cluster at a
Comprehensive College”, 33rd SIGCSE Technical Symposium on Computer Science Education,
Covington, KY, held on February 27 - March 03, 2002, pp. 411-415.
[G] Rajput, V. and Katiyar, A. (2013), “Proactive Bottleneck Performance Analysis in Parallel
Computing Using OpenMP”, International Journal of advanced studies in Computer Science and
Engineering (IJASCSE) , Vol. 2, Issue 5, pp. 46-53.
[H] Datti, A.A., Umar, H.A. and Galadanci, J. (2015), "A Beowulf Cluster for Teaching and Learning", 4th International Conference on Eco-friendly Computing and Communication Systems, Vol. 70, pp. 62-68.
[I] HaiTao, W. and ChunQin, C. (2009) “A High Performance Computing Method of Linux Cluster’s”,
Proceedings of the 2009 International Symposium on Information Processing (ISIP’09), Huangshan,
P. R. China, held on August 21-23, 2009, pp. 083-086.
[J] Brightwell, R., Riesen, R. and Underwood, K. (2003), "A Performance Comparison of Linux and a Lightweight Kernel", Proceedings of the IEEE International Conference on Cluster Computing, Hong Kong, China, held on December 1-4, 2003, pp. 251-258.
[K] Chowdhury, S.S., Jannat, M.-E. and Shoeb, A.A.Md. (2012), “Performance Analysis of MPI
(mpi4py) on Diskless Cluster Environment in Ubuntu”, International Journal of Computer Applications
(IJCA), 0975 – 8887, Vol. 60(14), pp. 40 - 46.
[L] Al-Khazraji, S.H.A.A., Al-Sa'ati, M.A.Y. and Abdullah, N.M. (2014), “Building High Performance
Computing Using Beowulf Linux Cluster”, International Journal of Computer Science and Information
Security (IJCSIS), ISSN: 1947-5500, Vol. 12(4), pp. 1 - 7.
[M] Rahman, A. (2015), “High Performance Computing Clusters Design and Analysis Using Red Hat
Enterprise Linux”, TELKOMNIKA Indonesian Journal of Electrical Engineering, ISSN: 2302-4046, Vol.
14(3), pp. 534-542.
[N] Ruan, X., Yang, Q., Alghamdi, M.I., Yin, S., Ding, Z., Xie, J., Lewis, J. and Qin, X. (2010), “ES-
MPICH2: A Message Passing Interface with Enhanced Security”, 2010 IEEE 29th International
Conference on Performance Computing and Communications, Vol. 9(3), pp. 161-168.