Performance Improvement in Distributed Systems Through Replication and Checkpointing
International Journal of Computer Applications (0975 – 8887)
Volume 42– No.19, March 2012
Checkpointing and replication are among the fault tolerance techniques used in distributed systems. Among these we have focused on two techniques: replication and checkpointing.

2.1 Replication
Replication means a redundant copy of the data, or simply multiple copies of the same data. It is used to increase the availability of data. The replication-based approach [13] is one of the best-known and most efficient approaches for improving the fault tolerance of a distributed system. If a fault occurs in the system and a node goes out of order, the data can be retrieved from a node where it has been replicated. Replication is the process of maintaining different copies of a data item or object. In these techniques, all nodes have the same set of resources and data is replicated to these nodes, so a client can forward requests to any replica among the set of replicas. Replication adds redundancy to the system [13]. If any one node fails, the data of that node is still available from its replicas, so the failure of one node does not result in the failure of the whole system. In this way fault tolerance is achieved [13], as shown in Fig. 1.

... result could have been produced by executing that program on a single-processor system.

In order to have consistency, an efficient strategy is required. There are two main strategies: active replication and passive replication. This classification is based on the way in which modifications are carried out by the servers. In passive replication, only one primary server executes the client requests and propagates the changes to all remaining replicas. This scheme avoids computing the same request multiple times. In active replication, the client request is sent to all replicas and the computation is carried out at each node; all replicas execute the client request separately. An advantage of active replication is that it takes less network bandwidth and fewer resources than sending updates, in contrast to passive replication, where the primary server sends updates to the other replicated servers. Active replication also responds to a fault faster than passive replication. Various fault tolerance techniques use both active and passive strategies to implement an optimistic replication protocol [11]. Despite these protocols, there is still a need for a simpler and more effective replication protocol that can ensure consistency.
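The difference between the two strategies can be illustrated with a minimal in-memory sketch. It is not taken from any of the cited protocols: the class `ReplicationSketch`, the nested `Replica` type, and the "add delta to key" request are hypothetical names chosen only for illustration, and network transfer is replaced by direct method calls.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReplicationSketch {

    /** Hypothetical replica: a key/value store plus a count of the requests it executed itself. */
    static final class Replica {
        final Map<String, Integer> store = new HashMap<>();
        int executedRequests = 0;

        /** Executes the client request "add delta to key" and returns the new value. */
        int execute(String key, int delta) {
            executedRequests++;
            return store.merge(key, delta, Integer::sum);
        }

        /** Installs a value computed elsewhere, without re-executing the request. */
        void applyUpdate(String key, int value) {
            store.put(key, value);
        }
    }

    /** Passive replication: only the primary executes; backups receive the resulting state. */
    static int passive(Replica primary, List<Replica> backups, String key, int delta) {
        int result = primary.execute(key, delta);
        for (Replica backup : backups) {
            backup.applyUpdate(key, result);
        }
        return result;
    }

    /** Active replication: the request is sent to every replica and each one executes it. */
    static int active(List<Replica> replicas, String key, int delta) {
        int result = 0;
        for (Replica replica : replicas) {
            result = replica.execute(key, delta);
        }
        return result;
    }

    public static void main(String[] args) {
        Replica primary = new Replica();
        List<Replica> backups = List.of(new Replica(), new Replica());
        passive(primary, backups, "x", 5);
        System.out.println("passive: backup value = " + backups.get(0).store.get("x")
                + ", requests executed by backup = " + backups.get(0).executedRequests);

        List<Replica> group = List.of(new Replica(), new Replica(), new Replica());
        active(group, "x", 5);
        System.out.println("active: replica value = " + group.get(2).store.get("x")
                + ", requests executed by replica = " + group.get(2).executedRequests);
    }
}
```

In this sketch every replica ends up with the same value under both strategies; what differs is where the computation happens, which is the basis of the classification above.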
Basically, the main aim of a recovery operation is to bring the system back into a consistent state, as in normal operation. The two most important types of rollback recovery are checkpoint-based rollback recovery and log-based rollback recovery [13]. Checkpoint-based rollback recovery depends only on checkpoints [13], whereas log-based rollback recovery combines checkpointing with logging of nondeterministic events [20]. Two types of checkpointing techniques, coordinated checkpointing and uncoordinated checkpointing associated with message logging, are used for saving checkpoints and recovering when a failure occurs [21]. In coordinated checkpointing, as the name suggests, the processes coordinate with each other. All processes share their checkpoints and save a complete snapshot of the whole system [2]. Put simply, the processes coordinate their checkpoints, which therefore form a consistent set. Because the set of checkpoints is consistent, consistency is higher with coordinated checkpoints [2]. The drawback of coordinated checkpointing is that it rolls back all processes to the last saved state when any failure occurs, even if only one process has a problem. It therefore takes a long time to recover from a failure, which makes it inappropriate for various applications. This technique cannot be used in a system that suffers from frequent failures, whereas performance can be improved by decreasing the recovery time. It takes a long time to recover because all the processes restart from the initial state. This time can be reduced if the recovery operation brings the whole system to the latest correct state instead of the initial state. These issues can be handled by the second type of checkpointing technique, uncoordinated checkpointing. As the name suggests, there is no coordination among the processes as in the previous technique; instead, all the processes save their checkpoints independently. This strategy suffers from the lack of coordination, so message logging is combined with uncoordinated checkpointing to overcome the problem. This helps the system reach the latest correct state after a failure. With message logging, the system stores all the messages and resends them in the same order after the system recovers from the failure.

When applying fault tolerance techniques like replication and checkpointing, the major problem is generally their overhead. An asynchronous replication strategy was developed in [22] that distributes the checkpoint or replication overhead over all nodes participating in the computation. Two important factors are addressed in this strategy: first, the resiliency of checkpoints to node failure; second, the bottleneck incurred by transferring data to dedicated servers. Walters and Chaudhary proposed in [22] the use of a uniform distribution for storing checkpoints, because the success of any such strategy relies on the availability of data. Each checkpoint has an equal query probability, so a uniform distribution is justified for this specific case.

Walters and Chaudhary give an algorithm in [22] known as the random node selection algorithm. The main aim of this algorithm is to output the locations where checkpoints are to be replicated. Older replication approaches tend to store replicas at the farthest node, but this may cause low performance in the presence of inadequate network bisection bandwidth. The algorithm is therefore useful in such cases [22].

The authors implemented it in a LAM/MPI environment. The aim of the algorithm is to assign the same number of replicas to each node. The algorithm is arranged so that no node holds its own replica. The number of replicas of a node n is the same as the number of replicas of n spread onto the network, and all replicas are unique.

The algorithm provides an initial state of the system that satisfies the above-mentioned constraints. After that, a circular shift operation is used to provide the first stage of randomness. A swapping operation is then applied between nodes several times to maintain randomness. Each swap operation is checked so that all the imposed constraints remain satisfied. Finally, the actual replica swap is performed after many confirmations and the final result is generated.
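The steps just described (a constraint-satisfying initial state, a circular shift, then constraint-checked swaps) can be sketched as follows. This is only an illustration under stated assumptions, not the published algorithm of [22]: the `placement[i][k]` representation, the shift formula, the swap acceptance rule, and all names (`RandomNodeSelectionSketch`, `place`, `canHold`, `isValid`) are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class RandomNodeSelectionSketch {

    /** Returns placement[i][k]: the node that stores the k-th replica of node i's checkpoint. */
    static int[][] place(int nodes, int replicas, int swapAttempts, Random rng) {
        if (replicas < 1 || replicas >= nodes) {
            throw new IllegalArgumentException("need 1 <= replicas < nodes");
        }
        // Initial state with a random circular shift (assumed form): node i uses the
        // 'replicas' consecutive nodes starting at i + shift + 1, modulo the node count.
        // Offsets stay in 1..nodes-1, so no node is assigned to itself.
        int shift = rng.nextInt(nodes - replicas);
        int[][] placement = new int[nodes][replicas];
        for (int i = 0; i < nodes; i++) {
            for (int k = 0; k < replicas; k++) {
                placement[i][k] = (i + shift + 1 + k) % nodes;
            }
        }
        // Randomizing swaps: exchange one target of node a with one target of node b,
        // but only if both rows still satisfy the constraints afterwards. A swap never
        // changes how many replicas each node stores, so the load stays uniform.
        for (int t = 0; t < swapAttempts; t++) {
            int a = rng.nextInt(nodes);
            int b = rng.nextInt(nodes);
            int ka = rng.nextInt(replicas);
            int kb = rng.nextInt(replicas);
            if (a == b) {
                continue;
            }
            int targetA = placement[a][ka];
            int targetB = placement[b][kb];
            if (canHold(placement[a], ka, a, targetB) && canHold(placement[b], kb, b, targetA)) {
                placement[a][ka] = targetB;
                placement[b][kb] = targetA;
            }
        }
        return placement;
    }

    /** True if 'target' may occupy 'slot' in the row of 'owner': not the owner, not a duplicate. */
    static boolean canHold(int[] row, int slot, int owner, int target) {
        if (target == owner) {
            return false;
        }
        for (int k = 0; k < row.length; k++) {
            if (k != slot && row[k] == target) {
                return false;
            }
        }
        return true;
    }

    /** Checks the three constraints: no self-replica, unique targets, equal load per node. */
    static boolean isValid(int[][] placement) {
        int[] load = new int[placement.length];
        for (int i = 0; i < placement.length; i++) {
            for (int k = 0; k < placement[i].length; k++) {
                if (!canHold(placement[i], k, i, placement[i][k])) {
                    return false;
                }
                load[placement[i][k]]++;
            }
        }
        for (int stored : load) {
            if (stored != placement[0].length) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] placement = place(6, 2, 100, new Random());
        for (int i = 0; i < placement.length; i++) {
            System.out.println("node " + i + " -> " + Arrays.toString(placement[i]));
        }
        System.out.println("constraints satisfied: " + isValid(placement));
    }
}
```

Rejecting a swap rather than repairing it keeps the sketch short; the constraints hold after every step, so the final placement can be used directly.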
2.3 Combined Approach (Replicated Checkpointing)
There are several issues with the above-mentioned techniques. The major issue is cost: a checkpointing-based fault tolerance mechanism is quite expensive. Researchers therefore proposed a new technique in which replication and checkpointing are used jointly to improve performance [21]. It is known as replicated checkpointing. Some issues still have to be handled when using it, primarily when checkpointing occurs, the level of replication, the storage of checkpoints, and the location and size of checkpoints. Researchers have also proposed deciding the degree of replication and the frequency of checkpointing at run time. This technique is known as adaptive task checkpointing and replication [23]. The two main requirements for these techniques are efficient recovery and recovery in a short time. A distributed system is a collection of different types of processors or systems; such a system is called heterogeneous. It is therefore essential for all recovery techniques to recover the system from failure efficiently in this type of environment. Apart from efficiency, the time limit is also an important factor. Various research efforts are under way to improve the recovery time. The old recovery strategy is simply to redo the computation of the failed processes since the last checkpoint [2].

3. RMI
RMI, i.e. Remote Method Invocation, is a construct of Java [6] that helps applications communicate with other applications running on a remote system. Several other methods exist for remote communication, but they have many limitations; for example, only certain predefined structures or simple data types can be passed to and from methods. An important feature of RMI is that it permits new object types to be loaded dynamically even if the client and server don't know about them.

3.1 Overview:
RMI allows you to call procedures [27] on objects that reside on another virtual machine and treat them as if they were on the local machine. The concept of RMI is very simple: there is a client and a server, and the server has the method that is to be called remotely by a client application. There is also a record known as the RMI registry, which is used to keep track of all remote methods. The RMI service gives a name to the remote method and enters it in the RMI registry by using the bind method. Once the client has a reference to the remote object, it can make remote method calls with parameters and return values as if the object (service) were on the local host. Java RMI comprises three layers [5] that support the interface.

There are three layers in the architecture of RMI. The first layer is the Stub/Skeleton Layer. This layer provides an interface to both client and server. Both client and server communicate
with each other by using the remote object. This layer manages the remote object interface. The second layer is the Remote Reference Layer (RRL). This layer acts as an interface between the first and third layers. Its role is to manage communication between the client/server and the virtual machines (e.g., threading, garbage collection, etc.) for remote objects [5]. The third layer is the transport layer. This layer is like the data link/physical layer of the TCP/IP suite. It is also the lowest layer of the protocol suite, and it is used to send the information between the client and server over the wire.

4. PROPOSED WORK
The random node selection algorithm [22] is very efficient for replication, and its authors implemented it in a LAM/MPI environment. We have taken the same algorithm and developed it in Java RMI. Many technologies exist for creating a distributed environment, but we employed RMI because of its familiar environment and because it is presented as an efficient technology for distributed systems by Jerzy Brzezinski and Cezary Sobaniec in [6].
The random node selection algorithm deals only with replication, by providing the addresses of the nodes where replicated data is to be stored. It does not say what to do if the fault tolerance capability has to be increased by increasing the number of nodes or the number of replicas. We therefore modified and implemented this algorithm to handle these important issues. The issues are as follows:

Case 1: When a new node needs to be inserted in an already developed distributed system.
Case 2: When the number of replicas needs to be increased in an already developed distributed system to improve its fault tolerance capability.

We developed the new algorithm to handle these issues and implemented it in RMI. Our implementation contains the following modules:

1) A Computational Coordinator that holds the algorithm implementation.
2) The Client Systems, which need to communicate with the coordinator to learn the replica placement.
3) All the systems in the network, which need to communicate with each other to find or place their replicas over the network.
This new program can meet the need for increased fault tolerance. It allows users to increase the number of nodes or replicas after the program has started. After the final execution of the algorithm at the coordinator side, all clients receive a file containing the IP addresses of the nodes selected for the replication of their data. One program has also been developed for communication between the clients. In it, clients finally replicate their data to the addresses received in the file from the coordinator. With the use of this algorithm, checkpoints are replicated and fault tolerance is achieved.

4.1 Comparative study
The basis of this work is the random node selection algorithm given by Walters and Chaudhary. Their work is quite advantageous to the field of distributed systems. We tried to work on the platform created by them. The following points compare our work with theirs.
1) The algorithm given in [22] is implemented in a LAM/MPI environment, whereas we implemented it in Java RMI, which is quite easy to understand.
2) We also provide the flexibility of increasing fault tolerance at run time. If the number of nodes or the number of replicas needs to be increased at run time, our proposed algorithm can accommodate it.
3) Consistency is a major issue in replication. We have used a coordinator algorithm to ensure consistency.
4) We have also implemented the idea of saving checkpoints on the local hard disk instead of on a dedicated server.

5. CONCLUSION
We deeply analyzed the Random Node Selection Algorithm, found some shortcomings in it, and then redesigned the algorithm, incorporating various points, in such a way that the overall complexity of the algorithm is lower than before. We have proposed an improved method to ensure consistency by simulating the distributed environment using Java RMI. This algorithm is very simple and ensures consistency in a very simple manner. Checkpointing overhead is reduced by saving the checkpoints on the local hard disk instead of a SAN (Storage Area Network) or DFS (Distributed File System), following the assumptions given by John Paul Walters. This work can serve as a reference for researchers and practitioners designing and developing high-performance multiple fault tolerance.

6. ACKNOWLEDGEMENT
We would like to thank all the faculty members of the institute who helped us a lot in calculating the facts and figures related to our paper. We would also like to thank the anonymous reviewers who provided helpful feedback on our manuscript.

7. REFERENCES
[1] Daniel Oelke, “Overview of Distributed Computing”, Mithral Communications & Design Inc., 1995-2012. [Online]. Available: http://www.mithral.com/projects/cosm/ch-02.html
[2] Sanjay Bansal and Sanjeev Sharma, “Identification of Critical Factors in Checkpointing Based Multiple Fault Tolerance for Distributed System”, Journal of Emerging Trends in Computing and Information Sciences, Volume 2 No. 1, 2010.
[3] Halpern, J. and Y. Moses, “Knowledge and Common Knowledge in a Distributed Environment,” Proc. of the 3rd ACM Symposium on Principles of Distributed Systems, 1984, pp. 50-61 and Lamport, L., R. Shostak, and M. Pease, “The Byzantine Generals Problem,” ACM Transactions on Programming Languages and Systems, Vol. 4 No. 3, July 1982, pp. 382-401.
[4] Jalote, P., Fault Tolerance in Distributed Systems, Prentice Hall, 1994.
[5] Chris Matthews, “Introduction to Java Remote Method Invocation (RMI)”, The Electronic Developer Magazine. [Online]. Available: http://www.edm2.com/0601/rmi1.html
[6] Jerzy Brzezinski and Cezary Sobaniec, “A Concept of Replicated Remote Method Invocation”, Institute of Computing Science, Poznan University of Technology,
[8] M. Herlihy and J. Wing, “Linearizability: a correctness condition for concurrent objects,” ACM Trans. on Progr. Languages and Syst., 12(3):463-492, 1990.
[9] M. Ahamad, P.W. Hutto, G. Neiger, J.E. Burns, and P. Kohli, “Causal Memory: Definitions, implementations and Programming,” TR GIT-CC-93/55, Georgia Institute of Technology, July 94.
[10] H.P. Reiser, M.J. Danel, and F.J. Hauck, “A flexible replication framework for scalable and reliable .net services,” In Proc. of the IADIS Int. Conf. on Applied Computing, volume 1, pages 161–169, 2005.
[11] A. Kale, U. Bharambe, “Highly available fault tolerant distributed computing using reflection and replication,” Proceedings of the International Conference on Advances in Computing, Communication and Control, Mumbai, India, pages 251-256, 2009.
[12] X. China, “Token-Based Sequential Consistency in Asynchronous Distributed System,” 17th International Conference on Advanced Information Networking and Applications (AINA'03), March 27-29, ISBN: 0-7695-1906-7.
[13] Sanjay Bansal, Sanjeev Sharma, Ishita Trivedi, “A Detailed Review of Fault-Tolerance Techniques in Distributed System”, International Journal on Internet and Distributed Computing Systems, Vol. 1 No. 1, 2011.
[14] D.K. Gifford, “Weighted voting for replicated data,” In SOSP '79: Proc. of the seventh ACM symposium on Operating systems principles, pages 150–162, 1979.
[15] J. Osrael, L. Froihofer, K.M. Goeschka, S. Beyer, P. Galdámez, and F. Muñoz. “A system architecture for
[17] Cao Huaihu, Zhu Jianming, “An Adaptive Replicas Creation Algorithm with Fault Tolerance in the Distributed Storage Network”, IEEE, 2008.
[18] N. Budhiraja, K. Marzullo, F.B. Schneider, and S. Toueg. The Primary-Backup Approach. In Sape Mullender, editor, Distributed Systems, pages 199-216. ACM Press, 1993.
[19] V. Agarwal, Fault Tolerance in Distributed Systems, Institute of Technology Kanpur, www.cse.iitk.ac.in/report-repository, 2004.
[20] H. Jung, D. Shin, H. Kim, and Heon Y. Lee, “Design and Implementation of Multiple Fault Tolerant MPI over Myrinet (M3),” SC|05, Nov 12-18, 2005, Seattle, Washington, USA. Copyright 2005 ACM.
[21] M. Elnozahy, L. Alvisi, Y. M. Wang, and D. B. Johnson. A survey of rollback-recovery protocols in message passing systems. Technical Report CMU-CS-96-81, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA, October 1996.
[22] J. Walters and V. Chaudhary, “Replication-Based Fault Tolerance for MPI Applications,” IEEE Transactions on Parallel and Distributed Systems, Vol. 20, No. 7, July 2009.
[23] M. Chtepen, F. Claeys, B. Dhoedt, and P. Vanrolleghem, “Adaptive Task Checkpointing and Replication: Toward Efficient Fault-Tolerant Grids”, IEEE Transactions on Parallel and Distributed Systems, Vol. 20, No. 2, Feb 2009.
[24] S. Jafar, A. Krings, and T. Gautier, “Flexible Rollback Recovery in Dynamic Heterogeneous Grid Computing”, IEEE Transactions on Dependable and Secure Computing, Vol. 6, No. 1, Jan-Mar 2009.