A Cluster-Based Implementation of A Fault Tolerant Parallel Reduction Algorithm Using Swarm-Array Computing

Chapter-I
INTRODUCTION
Scope of the project:

In this project, the intelligent agent based approach is considered: a task to be
executed on a parallel computing system is decomposed into sub-tasks and mapped
onto agents that traverse an abstracted hardware layer.
About the project:

The work reported in this project is aimed at simulating and
implementing a ‘Swarm-Array Computing’ approach based on intelligent agents
on an FPGA (Field Programmable Gate Array) and on a computer cluster
respectively. A task to be executed on a parallel computing system is decomposed
into sub-tasks and mapped onto agents that traverse an abstracted hardware layer.
The agents intercommunicate across processors to share information in the
event of a predicted core/processor failure and to complete the task
successfully. The agents hence contribute towards fault tolerance and building
reliable systems. The swarm-array computing framework comprises four
constituents and three approaches. The four constituents are the computing
platform, the problem/task, the swarm and the landscape. The computing platform
is synonymous with the hardware layer, and in the context of swarm-array
computing, parallel computing platforms such as FPGAs, clusters, grids,
supercomputers and general purpose graphics processing units (GPGPUs) are
relevant. Executing a problem/task efficiently on a parallel computing platform
requires breaking the problem/task into sub-problems/sub-tasks and mapping these
onto the computing platform. In the framework of swarm-array computing,
sub-problems are mapped onto agents that contribute towards proactive fault
tolerance of the framework. A swarm within the framework is a collection of agents
that communicate with each other and can shift from one core/processor to another.
A landscape is an abstraction of the hardware layer, over which the swarm of agents
traverses during fault tolerant execution. More details of the fundamental concepts
related to swarm-array computing are reported elsewhere. The three approaches in
swarm-array computing are intelligent core based, intelligent agent based, and
combined intelligent core and agent based. In this project, only one of the three
approaches, namely the intelligent agent based approach, is considered.
In the intelligent agent based approach, a task to be executed on a parallel
computing system is decomposed into sub-tasks and mapped onto agents. The
agent and the sub problem are independent of each other; in other words, the
agents only carry the sub-tasks or act as a wrapper around the sub-task.
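The wrapper relationship described above can be sketched in Java. This is a minimal illustration and not the project's actual implementation: the names `SubTask`, `SumSubTask`, `Agent` and `moveTo` are hypothetical, and the core is represented by a plain integer identifier.

```java
// Hypothetical sketch: an agent carries a sub-task but is independent of what it computes.
interface SubTask {
    long execute();
}

// Example sub-task (assumed for illustration): sums one slice of an array.
final class SumSubTask implements SubTask {
    private final int[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumSubTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    public long execute() {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += data[i];
        }
        return sum;
    }
}

// The agent is only a wrapper: it knows which core it sits on and which sub-task it carries.
final class Agent {
    private final SubTask payload;
    private int coreId;

    Agent(SubTask payload, int coreId) {
        this.payload = payload;
        this.coreId = coreId;
    }

    int coreId() {
        return coreId;
    }

    // Shifting the agent changes its location, never the sub-task itself.
    void moveTo(int newCoreId) {
        this.coreId = newCoreId;
    }

    long run() {
        return payload.execute();
    }
}
```

Because the agent never inspects its payload, the same `Agent` class could carry any other `SubTask` implementation, and moving it to another core leaves the result of `run()` unchanged.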
Chapter-II
SYSTEM STUDY
Literature Study
For the literature study, we went through various research papers and
related sources, which are listed below:
• G. Vallee, C. Engelmann, A. Tikotekar, T. Naughton, K.
Charoenpornwattana, C. Leangsuksun and S. L. Scott, “A Framework for
Proactive Fault Tolerance,” in the Proceedings of the 3rd International
Conference on Availability, Reliability and Security, 2008, pp. 659 - 664.
• Y. Li, P. Gujarati, Z. Lan and X.-H. Sun, “Fault-Driven Rescheduling for
Improving System-level Fault Resilience,” in the Proceedings of the
International Conference on Parallel Processing, 2007.
• K. A. Hummel and G. Jelleschitz, “A Robust Decentralized Job Scheduling
Approach for Mobile Peers in Ad-hoc Grids,” in the Proceedings of the 7th
IEEE International Symposium on Cluster Computing and Grid, 2007, pp.
461 - 470.
• C. Engelmann, G. R. Vallee, T. Naughton and S. L. Scott, “Proactive Fault
Tolerance using Preemptive Migration,” in the Proceedings of the 17th
Euromicro International Conference on Parallel, Distributed and
Network-based Processing, 2009, pp. 252 - 257.
• B. Eckart, X. Chen, X. He and S. L. Scott, “Failure Prediction Models for
Proactive Fault Tolerance within Storage Systems,” in the Proceedings of
the IEEE International Symposium on Modelling, Analysis and Simulation
of Computers and Telecommunication Systems, 2008, pp. 1 - 8.
• F. Iskander and A. A. Younis, “A Proactive Fault Tolerance Management
Algorithm for Mobile Ad Hoc Networks,” in the Proceedings of the 4th
IEEE Consumer Communications and Networking Conference, 2007, pp.
571 - 575.
• P. Tichy, P. Slechta, R. J. Staron, F. P. Maturana and K. H. Hall,
“Multi-agent Technology for Fault Tolerance and Flexible Control,” in the IEEE
Transactions on Systems, Man and Cybernetics, Part C: Applications and
Reviews, 2006, pp. 700 - 704.
• SeSAm website: http://www.simsesam.de
• M. J. Quinn, “Parallel Computing Theory and Practice,” McGraw-Hill, Inc.
1994.
• J. D. Sloan, “High Performance Linux Clusters with OSCAR, Rocks,
openMosix & MPI,” O’Reilly, 2005.
• W. Gropp, E. Lusk and A. Skjellum, “Using MPI-2: Advanced Features of
the Message Passing Interface,” MIT Press, 1999.
• A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek and V. Sunderam,
“PVM: Parallel Virtual Machine A Users’ Guide and Tutorial for
Networked Parallel Computing,” MIT Press, 1994.
• Center for Advanced Computing and Emerging Technologies (ACET)
website: www.acet.reading.ac.uk
• High Performance Computing at ACET website: http://hpc.acet.rdg.ac.uk/
From these sources, we analyzed the issues raised in the papers listed above. Our
proposed system provides solutions that overcome some of those issues.
Chapter-III
Problem Definition
Existing System
The existing routing protocols that perform global rerouting need to trade
off forwarding continuity against routing stability. They do not
suppress the failure notification.
The major drawbacks of the existing system are as follows:
• Time consumption
• Low reliability
• Error prone
• Low speed communication
Proposed System
We propose a swarm-array computing approach that addresses this issue by
employing interface-specific forwarding, and by performing local rerouting using
a backwarding table upon a failure while suppressing the failure notification.
The proposed system avoids the pitfalls of the existing system and offers the following:

• Fast and efficient work
• Ease of access to system
• Manual effort is reduced
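The local rerouting step of the proposed system can be illustrated with a simplified Java sketch. It is not the project's implementation: the class `LocalRerouter`, its method names and the node names used below are hypothetical, and the tables are assumed to be filled in by some other routine that is outside the scope of this sketch. Each incoming interface has its own forwarding table, and the backwarding table is consulted only when the usual next hop is unreachable.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical sketch of local rerouting at a single node.
final class LocalRerouter {
    // forwarding.get(incomingInterface).get(destination) -> usual next hop
    private final Map<String, Map<String, String>> forwarding =
            new HashMap<String, Map<String, String>>();
    // backwarding.get(failedNextHop).get(destination) -> alternate next hop
    private final Map<String, Map<String, String>> backwarding =
            new HashMap<String, Map<String, String>>();
    private final Set<String> failedNeighbours = new HashSet<String>();

    void addForwardingEntry(String incomingInterface, String destination, String nextHop) {
        put(forwarding, incomingInterface, destination, nextHop);
    }

    void addBackwardingEntry(String failedNextHop, String destination, String alternateNextHop) {
        put(backwarding, failedNextHop, destination, alternateNextHop);
    }

    // The failure is recorded locally only; no notification is sent to other nodes.
    void markLinkDown(String neighbour) {
        failedNeighbours.add(neighbour);
    }

    // Returns the next hop for a packet, or null if no route is known.
    String nextHop(String incomingInterface, String destination) {
        Map<String, String> table = forwarding.get(incomingInterface);
        String usual = (table == null) ? null : table.get(destination);
        if (usual == null || !failedNeighbours.contains(usual)) {
            return usual;
        }
        Map<String, String> alternates = backwarding.get(usual);
        return (alternates == null) ? null : alternates.get(destination);
    }

    private static void put(Map<String, Map<String, String>> tables,
                            String key, String destination, String hop) {
        Map<String, String> table = tables.get(key);
        if (table == null) {
            table = new HashMap<String, String>();
            tables.put(key, table);
        }
        table.put(destination, hop);
    }
}
```

With an entry that forwards packets arriving from a hypothetical neighbour `A` towards destination `D` via `B`, and a backwarding entry that names `C` as the alternate when `B` is down, `nextHop("A", "D")` returns `B` before `markLinkDown("B")` is called and `C` afterwards.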
Feasibility Study
In our system, we use Java to develop the application efficiently.
The language is discussed briefly below.
Java
Introduction:
Java quickly became a hot buzzword of the computing industry. People
wanted to know Java - it was said to be great for creating dynamic interactive
content for web pages. Yet the true power of Java lies not in applets, but in its
many other uses. Java is used for developing standalone applications, and for
server-side programming. The face of Java has changed, but the core language
remains the same. In this tutorial series, I'll teach you the basics of Java
programming. You'll still need a good book as a companion to this tutorial series,
but for those who are dabbling in Java, this should be enough to get your feet wet.
Java is an object-orientated language, and may not be suitable for first time
programmers. Learning a new language takes some time, but learning your first
object-orientated language can be exceedingly difficult. Nonetheless, if you've
done some C programming before, the shift into Java shouldn't be unreachable,
providing you obtain a good reference book. There are also those that believe
programmers should start with an object-orientated language first, and many
universities have adopted this practice. Still, you've been fairly warned.
This tutorial will presume that you have some basic programming
knowledge, particularly in C, as it will not be covering such principles as
sequence, selection and repetition. If you are unsure on ' for ' loops, or complex ' if
' statements, I'd suggest coming back here at a later point.
Application or Applet?
Java software comes in several flavors - the most common being the stand-
alone application, and the applet. Web developers may have come across the term
applet before, and perhaps even used one. An applet is a piece of software code
that runs under the control of a web browser, as distinct from the application
which requires an interpreter.
Applets are commonly used to enhance the interactivity of a web page, and
deliver client-side content. Applets run in their own frame, and can display
graphics, accept input from GUI components, and even open network connections.
Due to the potential security risks associated with running applets from external and
potentially malicious sources, most web browsers limit file access, and impose
additional restrictions on applets (such as only being able to connect to the
hostname from which the applet was downloaded). Fortunately, stand-alone
applications have no such restrictions, and a full range of functionality is provided
for in the way of pre-written Java classes.
Stand-alone applications can run as a console application (writing text to
the screen or terminal window), or they can have a graphical user-interface, by
opening a new window or dialog box. You've used applications before, such as
word processors, text editors, and games.
The Java language is capable of all these things. Since stand-alone
applications offer more freedom to the programmer, and applets running under a
browser often demonstrate a certain degree of instability depending on the
platform under which it is run, this tutorial series will concentrate primarily upon
the stand-alone application.
The first thing required for writing stand-alone Java applications is a Java
compiler/interpreter. While there are commercial offerings available, such as
Visual J++ and Borland JBuilder, a freely available SDK is available from Sun,
the original creators of the Java language. It contains a compiler, interpreter,
debugger, and more.
How Java works?

For those new to object-orientated programming, the concept of a class
will be new to you. We defined a new class, called my first java program.
Simplistically, a class is the definition for a segment of code that can contain both
data (called attributes) and functions (called methods).
When the interpreter executes a class, it looks for a particular method
by the name of main, which will sound familiar to C programmers. The main
method is passed as a parameter an array of strings (similar to the argv[] of C), and
is declared as a static method (more on this in a later tutorial).
To output text from the program, we execute the 'println' method of
System.out, which is Java’s output stream. UNIX users will appreciate the theory
behind such a stream, as it is actually standard output. For those who are instead
used to the Wintel platform, it will write the string passed to it to the user's screen.
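The class, main method and println call described above can be seen together in a minimal program. The original listing is not reproduced in this report, so the class name `MyFirstJavaProgram`, the helper method `message` and the text it prints are assumptions made for illustration only.

```java
// Hypothetical class name; the report's original listing is not available.
class MyFirstJavaProgram {
    // Kept in its own method so the text can be checked without capturing standard output.
    static String message() {
        return "Hello, world!";
    }

    // The interpreter looks for this static method; args plays the role of C's argv[].
    public static void main(String[] args) {
        // System.out is the standard output stream; println writes the string and a line break.
        System.out.println(message());
    }
}
```

Saved as MyFirstJavaProgram.java, the program is compiled with the javac tool and then started by passing the class name to the java launcher.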
That wraps it up for this first part of the introduction to Java tutorial series.
In the next tutorial, we'll cover some more object-orientated principles, and extend
your knowledge of the Java language and syntax.
Java is both a programming language and a platform.
The Java Programming Language

Java is a high-level programming language that is all of the following:
• Simple
• Object-oriented
• Distributed
• Interpreted
• Robust
• Secure
• Architecture-neutral
• Portable
• High-performance
• Multithreaded
• Dynamic
Java is also unusual in that each Java program is both compiled and
interpreted. With a compiler, you translate a Java program into an intermediate
language called Java bytecodes, the platform-independent codes interpreted by
the Java interpreter. With an interpreter, each Java bytecode instruction is parsed
and run on the computer. Compilation happens just once; interpretation occurs
each time the program is executed. This figure illustrates how this works.
Fig 7.3.1.a Interpretation

You can think of Java bytecodes as the machine code instructions for the Java
Virtual Machine (Java VM). Every Java interpreter, whether it's a Java
development tool or a Web browser that can run Java applets, is an
implementation of the Java VM. The Java VM can also be implemented in
hardware.
Java bytecodes help make "write once, run anywhere" possible. You can
compile your Java program into bytecodes on any platform that has a Java
compiler. The bytecodes can then be run on any implementation of the Java VM.
For example, the same Java program can run on Windows NT, Solaris, and
Macintosh.
Fig .b Java Bytecode
The Java Platform
A platform is the hardware or software environment in which a program
runs. The Java platform differs from most other platforms in that it's a software-
only platform that runs on top of other, hardware-based platforms. Most other
platforms are described as a combination of hardware and operating system.
The Java platform has two components:
• The Java Virtual Machine (Java VM)
• The Java Application Programming Interface (Java API)
You've already been introduced to the Java VM. It's the base for the Java
platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that
provide many useful capabilities, such as graphical user interface (GUI) widgets.
The Java API is grouped into libraries (packages) of related components. The next
section, What Can Java Do?, highlights each area of functionality provided by the
packages in the Java API.
The following figure depicts a Java program, such as an application or
applet, that's running on the Java platform. As the figure shows, the Java API and
Virtual Machine insulate the Java program from hardware dependencies.
Java Platform
As a platform-independent environment, Java can be a bit slower than
native code. However, smart compilers, well-tuned interpreters, and just-in-time
bytecode compilers can bring Java's performance close to that of native code
without threatening portability.
The Life Cycle of an Object:

Typically, a Java program creates many objects from a variety of classes.
These objects interact with one another by sending each other messages. Through
these object interactions, a Java program can implement a GUI, run an animation,
or send and receive information over a network. Once an object has completed the
work for which it was created, it is garbage-collected and its resources are
recycled for use by other objects. The typical phases of the life of an object
are:
• Creating Objects
• Using Objects
• Cleaning Up Unused Objects
The Garbage Collector
The Java platform has a garbage collector that periodically frees the
memory used by objects that are no longer needed. The Java garbage collector is a
mark-sweep garbage collector. A mark-sweep garbage collector scans dynamic
memory areas for objects and marks those that are referenced.
After all possible paths to objects are investigated, unmarked objects
(unreferenced objects) are known to be garbage and are collected. (A more
complete description of Java's garbage collection algorithm might be "a
compacting, mark-sweep collector with some conservative scanning.")
The garbage collector runs in a low-priority thread and runs either
synchronously or asynchronously depending on the situation and the system on
which Java is running. It runs synchronously when the system runs out of memory
or in response to a request from a Java program.
The Java garbage collector runs asynchronously when the system is
idle, but it does so only on systems, such as Windows 95/NT, that allow the Java
runtime environment to note when a thread has begun and to interrupt another
thread. As soon as another thread becomes active, the garbage collector is asked to
get to a consistent state and terminate.
Chapter-IV
Requirement Analysis
Functional Requirements
A functional requirement defines a function of a software system or its
component. A function is described as a set of inputs, the behavior, and outputs.
Modules:
Module Description:
• Packet Flowing
• Path Detection
• Detecting link Failure Nodes
• Rerouting Data
Packet Flowing:
This module creates the links over which packets flow. It executes
at the source node. We define network availability as the fraction of time the
network is able to forward packets between all source-destination pairs. Since a
forwarding loop is possible during the network transient period, we consider all
the network transient periods as unavailable time for both OSPF and FIR. Besides,
under OSPF, when a router suppresses a failed link, forwarding between some
source-destination pairs could be disrupted. We, therefore, count suppression
periods too as unavailable time under OSPF. FIR, on the other hand, guarantees
forwarding correctness when at most one link failure is suppressed.
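The availability measure defined above can be written as a small calculation. The sketch below is only an illustration under stated assumptions: the class `Availability`, its parameter names and the example durations are hypothetical, and all durations are assumed to be expressed in the same time unit.

```java
// Hypothetical helper: availability as the fraction of time forwarding works for all pairs.
final class Availability {
    // countSuppression is true when suppression periods count as unavailable time
    // (as the text states for OSPF) and false otherwise.
    static double of(double totalTime, double transientTime,
                     double suppressionTime, boolean countSuppression) {
        if (totalTime <= 0) {
            throw new IllegalArgumentException("totalTime must be positive");
        }
        double unavailable = transientTime + (countSuppression ? suppressionTime : 0.0);
        return 1.0 - unavailable / totalTime;
    }
}
```

For an assumed observation window of 100 time units with 5 units of transient time and 10 units of suppression time, the method gives 0.85 when suppression periods are counted and 0.95 when they are not.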
Path Detection:
Under FIR, only the node adjacent to a failed link is aware of the failure;
all other nodes are not. So, a packet takes the usual shortest path up to the point
of failure and then gets rerouted along the alternate path. Consequently, in the
presence of link failures, FIR may forward packets along longer paths compared to
the globally recomputed optimal paths based on the link state updates. For
example, in the topology, when the link 2–5 is down, packets from 1 to 6 are
forwarded along the path. Had node 1 been made aware of the link failure, packets
would be forwarded along the shorter path. However, we show that the extent of
this elongation is not significant. Let the stretch of a path between a pair of nodes
be the ratio of the length of the path under FIR to the length of the optimal
shortest path.
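The stretch defined above is a simple ratio, shown here as a short Java sketch. The class name `Stretch` and the hop counts in the usage note are assumptions for illustration; they are not taken from the report's topology.

```java
// Hypothetical helper: stretch = length of the rerouted path / length of the optimal shortest path.
final class Stretch {
    static double of(int reroutedPathLength, int optimalPathLength) {
        if (optimalPathLength <= 0) {
            throw new IllegalArgumentException("optimalPathLength must be positive");
        }
        return (double) reroutedPathLength / optimalPathLength;
    }
}
```

A rerouted path of 4 hops against an optimal path of 3 hops gives a stretch of 4/3, and a stretch of 1.0 means the rerouted path is as short as the optimal one.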
Detecting link failure node:
This module implements a parallel summation algorithm that incorporates fault
tolerant concepts using swarm-array computing. The fault tolerant concepts
incorporated in the parallel summation algorithm correspond to the approaches and
constituents of the swarm-array computing framework. When the temperature of a node
rises beyond a threshold, the process executing on that node predicts a failure and
hence spawns a process on an adjacent core in the abstracted layer. The agent on the
abstracted core expected to fail shifts to the adjacent core on which the new
process was spawned. The dependency information carried by the agent that was
shifted to the new core is employed to reinstate the state of execution of the
algorithm.
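The failure prediction and agent shift described above can be sketched as follows. This is an illustrative sketch rather than the cluster implementation: the classes `MigratingAgent` and `CoreMonitor`, the threshold value and the ring-shaped adjacency between cores are all assumptions, and spawning the new process is reduced to updating the agent's core identifier.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical agent: carries the dependency information needed to resume on another core.
final class MigratingAgent {
    private final List<String> inputDependencies;
    private final String outputDependency;
    private int coreId;

    MigratingAgent(int coreId, List<String> inputDependencies, String outputDependency) {
        this.coreId = coreId;
        this.inputDependencies = Collections.unmodifiableList(inputDependencies);
        this.outputDependency = outputDependency;
    }

    int coreId() { return coreId; }

    List<String> inputDependencies() { return inputDependencies; }

    String outputDependency() { return outputDependency; }

    void shiftTo(int newCoreId) { this.coreId = newCoreId; }
}

// Hypothetical monitor: predicts a failure from temperature and shifts the agent.
final class CoreMonitor {
    private final double thresholdCelsius; // assumed threshold, not taken from the report
    private final int coreCount;

    CoreMonitor(double thresholdCelsius, int coreCount) {
        this.thresholdCelsius = thresholdCelsius;
        this.coreCount = coreCount;
    }

    boolean predictsFailure(double temperatureCelsius) {
        return temperatureCelsius > thresholdCelsius;
    }

    // Assumption: cores form a ring in the abstracted layer, so the neighbour is the next index.
    int adjacentCore(int coreId) {
        return (coreId + 1) % coreCount;
    }

    // Shifts the agent to the adjacent core when a failure is predicted; returns true if it moved.
    boolean react(MigratingAgent agent, double temperatureCelsius) {
        if (!predictsFailure(temperatureCelsius)) {
            return false;
        }
        agent.shiftTo(adjacentCore(agent.coreId()));
        return true;
    }
}
```

The dependency lists travel with the agent unchanged, which is what allows the state of execution to be reinstated on the new core.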
Rerouting Data
Each node is characterized by input dependencies, output dependencies and
data contained in the node. The first level nodes have one input dependency and
one output dependency. For instance, node N1 has one input dependency I1 and
node N9 as its output dependency. However, the second, third and fourth levels
have two input dependencies and one output dependency. The data contained in a
node is either the input data for the first level nodes or a calculated value stored
within a node.
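The node structure described above can be sketched as a small reduction tree in Java. The classes `ReductionNode` and `ReductionTree` are hypothetical, and the numbering scheme (leaves first, then each higher level) is an assumption chosen so that, with eight inputs, node N1 has N9 as its output dependency as in the text. The raw input of a first level node is stored directly as its data instead of being modelled as a separate object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node: input dependencies, one output dependency, and the data it holds.
final class ReductionNode {
    final String name;
    final List<ReductionNode> inputs = new ArrayList<ReductionNode>(); // empty for first level nodes
    ReductionNode output; // null for the final node
    long data;            // input value (first level) or calculated sum (higher levels)

    ReductionNode(String name) {
        this.name = name;
    }
}

final class ReductionTree {
    // Builds the tree bottom-up and returns the final node holding the total.
    static ReductionNode build(long[] inputValues) {
        int n = inputValues.length;
        if (n == 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("number of inputs must be a power of two");
        }
        List<ReductionNode> level = new ArrayList<ReductionNode>();
        int counter = 1;
        for (long value : inputValues) {
            ReductionNode leaf = new ReductionNode("N" + counter++);
            leaf.data = value;
            level.add(leaf);
        }
        // Each pass pairs neighbouring nodes and stores their sum in a new node one level up.
        while (level.size() > 1) {
            List<ReductionNode> next = new ArrayList<ReductionNode>();
            for (int i = 0; i < level.size(); i += 2) {
                ReductionNode left = level.get(i);
                ReductionNode right = level.get(i + 1);
                ReductionNode parent = new ReductionNode("N" + counter++);
                parent.inputs.add(left);
                parent.inputs.add(right);
                left.output = parent;
                right.output = parent;
                parent.data = left.data + right.data;
                next.add(parent);
            }
            level = next;
        }
        return level.get(0);
    }
}
```

The additions within one level do not depend on each other, so they are the part that could run in parallel on separate cores; this sketch performs them sequentially for clarity.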
Software Requirements:
Front End : JDK 1.5 and Above
Operating System : Windows XP
Documentation : Ms-Office 2007
Hardware Requirements:
Processor : Intel Pentium 4
RAM : 1 GB
Hard Disk : 30 GB
Non Functional Requirements
Non-functional requirements are often called qualities of a system. Other
terms for non-functional requirements are "constraints", "quality attributes",
"quality goals", "quality of service requirements" and "non-behavioral
requirements". Qualities, that is, non-functional requirements, can be divided into
two main categories:
1. Execution qualities, such as security and usability, which are observable at
run time.
2. Evolution qualities, such as testability, maintainability, extensibility and
scalability, which are embodied in the static structure of the software
system.
Some non-functional requirements:
• Security
• Reliability
• Performance
• Response Time
• Robustness
Chapter-V
SYSTEM DESIGN
System Architecture
[Figure: system architecture. The source sends the packets to the destination through an adjacent node; if a link fails, the adjacent node executes rerouting.]
UML Diagram
Use case Diagram
[Figure: use case diagram. Actors: Source, Inter, Destination. Use cases: Root request send, Path received, Re-route, Data send, Request received, Reply, Received data.]
Sequence Diagram
[Figure: sequence diagram. Lifelines: Source, Router, Destination. Messages in order: Root Request, Path Received, Data Send, Received Data.]
