Design of a Transaction Recovery Instance Based on

Bi-directional Ring Election Algorithm for Crashed Coordinator in Distributed
Database Systems
Dharavath Ramesh*, Member, IEEE, K Chiranjeev Kumar^a, Member, IEEE and B Ramji^b

Department of CSE, Indian School of Mines^(*, a), Dhanbad; S V E C^b, Suryapet, India.
e-mail: rameshd.ism@gmail.com, k_chiranjeev@yahoo.co.uk, ramji143@gmail.com


Abstract-In a distributed database environment, when the
coordinator site (root node or process) stops working, the
environment needs to elect a new one in order to perform
the transactional tasks. The elected coordinator takes the
lead and keeps the system functioning. If the previous
(crashed) site recovers from its failure, it leads the system
again by taking back the responsibility. In this paper, a
recovery instance based on the bi-directional ring election
algorithm for the crashed coordinator is proposed. The new
algorithm quickly restores the state of the recovered site
by sending messages in parallel. This work shows how the
algorithm speeds up the recovery of the site and reduces the
time the system needs to resume normal transaction handling.
Keywords-Distributed database systems; coordinator; site
recovery; bi-directional ring; transaction.
I. INTRODUCTION
Currently, many distributed database commit
protocols require one of the sites (node or process) to act
as coordinator to perform transaction-related activities.
The implementation of the commit primitive is the most
difficult and expensive [1]. The difficulty originates from
the fact that the correct commitment of a distributed
transaction requires that all of its sub-transactions commit
locally even in the case of failures.
In order to perform functions at different sites, a distributed
application has to execute several processes at these sites.
We call these processes the participants or agents of the
application. There exists a coordinator or root agent which
starts the whole transaction, so that when the user requests
the execution of an application, the coordinator is started;
the site of the coordinator is called the site of origin of the
transaction. Several traditional leader election algorithms
have been proposed, such as the spanning tree algorithm [8],
bully algorithm [3], ring algorithm [4], modified bully
algorithm [5], synchronous election algorithms [6], [10], [11]
and so on. All of these aim to elect a new coordinator when
the present coordinator crashes. But none of them addresses
the scenario in which the crashed coordinator recovers from
its crash and resumes leading the system. In this paper, we
first focus on the distributed site failure scenario with
respect to transactions; second, on how the bi-directional
ring algorithm selects the new coordinator or leader,
together with the methodology by which a recovered
coordinator regains its functionalities; and third, on how
the failed site recovery instance is managed. Election of a
new coordinator is described by Nakano [7], Park [8], and
Pen [9], but they do not address the recovered coordinator.
In this paper, we show how bi-directional ring based
election allows the recovered coordinator to regain its
position as well.
In order to cooperate in the execution of the global
operation required by the application, the participating
nodes or agents have to communicate. As they reside at
different sites, the communication between participating
nodes is performed through messaging. The following is the
basic scheme of the election algorithm based on ring
topology [2].
Generally, the election of a coordinator happens in two
phases:
(1) Election phase: select the process or node with the
highest process ID as the coordinator;
(2) Deciding phase: inform the other processes which process
was elected as the new coordinator in the above phase.
Election phase:
When a process (say Pi) sends a request message to
the current coordinator and does not receive a reply
within a fixed period of time, it assumes that the
coordinator has crashed. Therefore, it initiates an
ELECTION message containing its ID and sends it to
its successor (actually to the first successor that
is currently active).
If the successor is down, the message simply bypasses
it and moves on until a running process is found. On
receiving the ELECTION message, the successor appends
its own ID to the message and passes it on to the
next active member in the ring.
In this manner, the ELECTION message circulates
over the ring from one active process to another
and eventually returns to process Pi.
When process Pi recognizes the message as its own
election message, it elects the process having the
highest ID number as the new coordinator.
721 978-1-4673-4805-8/12/$31.00 c 2012 IEEE
Decide Phase:

Process Pi circulates a coordinator message over the ring
to inform all the other active processes who the new
coordinator is. When the coordinator message comes back to
process Pi after completing its round along the ring, it is
removed by process Pi. At this point all the active
processes know who the current coordinator is. Figure 1
depicts the idea of the basic election algorithm based on
ring topology.


Fig 1. Normal functioning of ring algorithm

A. Methodology of Ring Based Election

Process 3 discovers that the current coordinator (process
7) has crashed. It takes the opportunity to become the
coordinator. To this end it prepares the ELECTION message
and sends it to the other processes that are active in the
ring. After the message completes its round, process 3 finds
that the higher priority (ID) processes 4, 5 and 6 are
active. Process 3 therefore does not become coordinator
itself; it simply sends the new coordinator status to the
remaining processes in the ring. When the coordinator
message has gone around the ring again, it is discarded and
every process goes back to its normal work.
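
The walk-through above can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions, not the authors' implementation: the function name `ring_election`, the list-of-IDs ring layout and the hop counter are hypothetical, and the initiator is assumed to be an active process.

```python
def ring_election(ring, alive, initiator):
    """Simulate one round of the ELECTION message on a one-way ring.

    ring      -- process IDs in ring order (hypothetical layout)
    alive     -- set of IDs that are currently running
    initiator -- ID of the active process that noticed the crash
    Returns (elected_id, hops), where hops counts ELECTION sends.
    """
    assert initiator in alive, "the initiator must be an active process"
    n = len(ring)
    pos = ring.index(initiator)
    collected = [initiator]          # IDs appended to the ELECTION message
    hops = 0
    while True:
        pos = (pos + 1) % n          # move to the next successor
        if ring[pos] not in alive:
            continue                 # crashed successor is bypassed
        hops += 1                    # one send between two active processes
        if ring[pos] == initiator:
            break                    # message is back at its originator
        collected.append(ring[pos])
    return max(collected), hops


# Processes 0..7 on the ring; coordinator 7 has crashed and 3 starts the election.
elected, hops = ring_election(list(range(8)), set(range(7)), 3)
print(elected, hops)
```

With process 7 down, the highest ID collected during the round is 6, so the coordinator message that process 3 circulates afterwards would announce process 6.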

Consider that there are n processes or sites in the ring. The
number of messages sent by the n processes in both phases
is
n + (n-1) = 2n - 1 (1)

II. FAILURE SCENARIO
A. Distributed Site Failure With Respect to Transaction
Consider a fund transfer made for a reservation, which
requires two sites to communicate.
Scenario (1): A site X tries to book the ticket after
receiving the acknowledgement from site Y (where Y is the
bank site and performs the debit transaction).
Scenario (2): The debit transaction reads and writes the
data item at site Y on behalf of X and is committed at Y.
Result scenario:
When a message is sent from site X to site Y, we require
the following behavior from the communication network.
Site X receives a positive acknowledgement after a
delay which is less than some maximum delay.
The message is delivered at Y in proper sequence
with respect to the other X-Y messages.
Site Y may execute the debit transaction while site X
does not receive the acknowledgement, even though
the message was delivered.
Here we consider two events, sender and deliver, as the
primitives between sites X and Y. The sender primitive
manages message delivery from one site to the other, whereas
the deliver primitive manages the transaction status at its
own site (as in scenario (2)).

Transaction as follows:

Sender MESSAGE SITE
Deliver SITE MESSAGE
Completed transSITE

Summary of the above transaction is depicted in Fig 2 with
respect to sender and deliver primitives.

Send (S ∈ SITE, M ∈ MESSAGE, T ∈ TRANSACTION)
When message delivery of message (sender)
∧ T ∈ Trans
∧ Sitetransstatus (T) (coordinator (T)) = pending
∧ S = coordinator (T)
∧ T ∉ do (transdebit) ∧ do (transeffect (T)) = {∅}
Then sender := sender ∪ {M ↦ S}
|| debit := debit ∪ {M}
|| transdebit := transdebit ∪ {M ↦ T}
End;

Deliver (S ∈ SITE, M ∈ MESSAGE)
When message delivery of message (sender)
∧ (S ↦ M) ∉ deliver
Then deliver := deliver ∪ {S ↦ M}
End;

Fig 2. Primitives between site X and Y

As shown above, the coordinator site X of the debit
transaction T issues a debit message M after submission of
the transaction T at site Y. The term T ∈ Trans
indicates that a transaction T has started. Similarly, the
term T ∉ do (transdebit) indicates that a debit message
corresponding to the transaction T has not been sent.
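
A minimal Python sketch of the two primitives follows. It is an assumption-laden illustration rather than the formal model of Fig 2: the class name `Channel`, the dictionary and set layouts, and the exact guard conditions (a message may be sent only once, and may be delivered only after it has been sent) are hypothetical choices made for the example.

```python
class Channel:
    """Toy model of the sender and deliver primitives between two sites."""

    def __init__(self):
        self.sender = {}        # message -> site that sent it
        self.deliver = set()    # (site, message) pairs already delivered

    def send(self, site, message):
        # Assumed guard: a message identifier is sent at most once.
        if message in self.sender:
            return False
        self.sender[message] = site
        return True

    def deliver_to(self, site, message):
        # Assumed guards: the message was sent, and this site has not
        # already received it.
        if message not in self.sender or (site, message) in self.deliver:
            return False
        self.deliver.add((site, message))
        return True


channel = Channel()
channel.send("X", "debit-1")                # site X issues the debit message
print(channel.deliver_to("Y", "debit-1"))   # first delivery at site Y
print(channel.deliver_to("Y", "debit-1"))   # duplicate delivery is rejected
```

Keeping `deliver` as a set of pairs mirrors the way the delivery guard refuses to record the same (site, message) pair twice.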

722 2012 World Congress on Information and Communication Technologies
B. Strategy of a Transaction
Fund Transfer (Debit operation at site Y):

Read (SITE Y, $AMOUNT, $FROM_ACC, $TO_ACC);
Begin_T
  Select AMOUNT into $FROM_AMOUNT
  From ACCOUNT
  Where ACCOUNT_NUMBER = $FROM_ACC;
  If $FROM_AMOUNT - $AMOUNT < 0 then abort
  Else begin
    Debit ACCOUNT
    Set AMOUNT = AMOUNT - $AMOUNT
    Where ACCOUNT = $FROM_ACC;
    Commit
  End
(a) Transaction at the global level

COORDINATOR (ROOT NODE):

Read (SITE, $AMOUNT, $FROM_ACC, $TO_ACC);
Begin_T
  Select AMOUNT into $FROM_AMOUNT
  From ACCOUNT
  Where ACCOUNT_NUMBER = $FROM_ACC;
  If $FROM_AMOUNT - $AMOUNT < 0 then abort
  Else begin
    Debit ACCOUNT
    Set AMOUNT = AMOUNT - $AMOUNT
    Where ACCOUNT = $FROM_ACC;
    Create PARTICIPANT SITE;
    Send to PARTICIPANT SITE ($AMOUNT, $TO_ACC);
    Commit
  End

PARTICIPANT SITE:

Receive from COORDINATOR ($AMOUNT, $TO_ACC);
Update ACCOUNT
Set AMOUNT = AMOUNT + $AMOUNT
Where ACCOUNT = $TO_ACC;

(b) Transaction constituted by two agents

Fig 3. FUND_TRANSFER APPLICATION
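
For readers who prefer executable code, the two-agent version in Fig 3 (b) can be approximated in Python. This is a sketch under assumptions: accounts are plain in-memory dictionaries, the names `coordinator_transfer`, `participant_credit` and `Abort` are hypothetical, and there is no real commit protocol or network, so a crash between the debit and the credit is not modelled.

```python
class Abort(Exception):
    """Raised when the transfer cannot proceed."""


def participant_credit(accounts, amount, to_acc):
    # PARTICIPANT SITE: credit the destination account.
    accounts[to_acc] = accounts.get(to_acc, 0) + amount


def coordinator_transfer(local, remote, amount, from_acc, to_acc):
    # COORDINATOR: check the balance, debit locally, then ask the
    # participant site to apply the matching credit.
    if local[from_acc] - amount < 0:
        raise Abort("insufficient funds in " + from_acc)
    local[from_acc] -= amount
    participant_credit(remote, amount, to_acc)


site_coord = {"A": 100}   # accounts held at the coordinator site
site_part = {"B": 20}     # accounts held at the participant site
coordinator_transfer(site_coord, site_part, 30, "A", "B")
print(site_coord, site_part)
```

The balance check runs before any update, so an aborted transfer leaves both dictionaries untouched.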

If we assume the accounts are distributed at different
sites (X and Y) of a network, at execution time the
transaction will be performed by several participating sites
(agents). When the COORDINATOR needs the participant site
to execute, it issues the create PARTICIPANT SITE primitive
and then sends the parameters to that site. Suppose site Y
commits the transaction and crashes, i.e. the account is
debited at site Y, but the ticket is not booked at site Y.
How, then, does site Y recover the instance of the debit
transaction after recovery? And if it recovers instantly,
how is the debit transaction immediately changed into a
credit transaction? This should be done with minimum effect.
III. BI-DIRECTIONAL APPROACH (BEA)
After the coordinator site crashes, the time taken to choose
a new coordinator should be as short as possible in a
distributed environment. Otherwise, the computational
efficiency of the sites in the network, as well as the
overall performance of the distributed system, will be
reduced. The basic election on the ring topology passes
messages in one direction only, while bi-directional
election on the ring sends messages in both directions in
parallel, so it can elect the new coordinator much faster.
This bi-directional approach works as follows:
When any process observes that the coordinator is not
responding to its request within the stipulated time, it
takes the initiative for ELECTION. We assume process Pi
found that the COORDINATOR has crashed. Then the
election is initiated by process Pi and the ELECTION phase
starts as follows:
Process Pi sends an ELECTION message to its two
adjacent successors, Pj and Pk, at the same time unit.
The message contains the ID of process Pi.
Processes Pj and Pk receive the ELECTION message
sent by process Pi and compare their own IDs with
the ID of process Pi. If ID (Pj) < ID (Pi), then
process Pj forwards the message unchanged to its
successor; if ID (Pj) > ID (Pi), then process Pj
sends the message with its own ID to its successor.
Process Pk follows the same criteria.
When a last process Pm receives two ELECTION
messages, it compares its own ID with the IDs
contained in the ELECTION messages it received.
The highest ID wins and the ELECTION phase
terminates. At this point process Pm no longer
sends ELECTION messages to its successor.
Deciding phase (sending COORDINATOR message):

Process Pm sends a COORDINATOR message
with the highest ID to its successors.
Any process that receives a COORDINATOR
message compares its own ID with the ID in the
COORDINATOR message. If both IDs are equal, then
that process becomes the new COORDINATOR.
When the last process in the ring receives two
COORDINATOR messages, it does not send the
messages on to its successors, and with this the
deciding phase ends. Now all processes in the ring
know the COORDINATOR, and they go back to their
normal activities.
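
The ELECTION phase described above can be simulated with a short Python sketch. It is a simplified model under assumptions that are not in the paper: crashed processes are already removed from the `ring` list, every hop costs exactly one time unit T, each hop counts as one message, and the function name `bidirectional_election` is hypothetical.

```python
def bidirectional_election(ring, initiator):
    """Simulate the ELECTION phase on a ring of active processes.

    ring      -- IDs of the active processes in ring order (crashed
                 processes are assumed to be removed already)
    initiator -- ID of the process that starts the election
    Returns (highest_id, time_units, messages).
    """
    n = len(ring)
    assert n >= 2, "need at least two active processes"
    cw = ccw = ring.index(initiator)   # positions of the two messages
    best_cw = best_ccw = initiator     # highest ID carried by each message
    received = [0] * n                 # ELECTION messages seen per process
    time_units = messages = 0
    while True:
        cw, ccw = (cw + 1) % n, (ccw - 1) % n   # both hops happen in parallel
        time_units += 1
        messages += 2
        received[cw] += 1
        received[ccw] += 1
        best_cw = max(best_cw, ring[cw])
        best_ccw = max(best_ccw, ring[ccw])
        if received[cw] == 2 or received[ccw] == 2:
            # A process now holds both messages, so it can decide.
            return max(best_cw, best_ccw), time_units, messages


print(bidirectional_election([2, 5, 1, 6, 3, 4], 2))     # even ring
print(bidirectional_election([2, 1, 7, 4, 6, 3, 5], 2))  # odd ring
```

On an even ring the two messages meet at a single process; on an odd ring they cross between two neighbouring processes, so both of those processes end up holding two ELECTION messages.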
C. Description of the Election Algorithm

Election phase:

begin
  if Pi is active and has taken the initiative then
    while (receive (ELECTION, Pm))
      if (flag == 0)      // process receives the message for the first time
        x = max(Pi, Pm);
        send (ELECTION, x) to next successor;   // forward the message that
                                                // came from the initiating process
        flag = 1;
      else if (flag == 1) // process receives the second message
        x = max(x, Pm);
        send (COORDINATOR, x) to successors;
        flag = 0;
  begin
    receive (ELECTION, Pm);
    send (ELECTION, Pm) to next successor;
  end
end
end

Deciding phase:

begin
  while (receive (COORDINATOR, Pm))
    if (flag == 1)
      if (P == Pm) state P as the LEADER;
      else state P = lost;
      send (COORDINATOR, Pm) to next successor;
    else return;
end


D. Description of the Algorithm

We consider two different cases with respect to the above:

(1) Election phase with an even number of processes and an
odd number of processes.
(2) Deciding phase with an even number of processes and an
odd number of processes.

For these cases we assume that the ring has n and (n-1)
sites or processes, with T0 time units to run the program
and T1 time units to communicate between processes. The
total time taken to complete the ELECTION at each site is
T = T0 + T1, where T is a time unit.







Analysis of ELECTION phase:

(1) Even number of processes:

(a) Process 2 initiates the ELECTION; process 3 receives two messages
and decides the COORDINATOR

(2) Odd number of processes:



(b) Process 2 initiates the ELECTION; processes 4 and 6 receive two ELECTION
messages and decide the COORDINATOR. ELECTION phase ends.

Fig 4. Analysis of ELECTION phase

Analysis of deciding phase:

(1) Even number of processes


(a) Process 3 sends two COORDINATOR messages; process 2 receives two
COORDINATOR messages. Deciding phase ends.




(2) Odd number of Processes



(b) Processes 4 and 6 send two COORDINATOR messages; process 2 receives
two COORDINATOR messages. Deciding phase ends.

Fig 5. Analysis of deciding phase
In this manner the BEA works to elect a new
COORDINATOR. This is the methodology stated in [7], but
its limitation is that when a crashed COORDINATOR
recovers from the crash, it again initiates the ELECTION
process. Here we present the methodology in terms of a
failed transaction with respect to the crashed
COORDINATOR in both cases, failure and recovery.

IV. RECOVERY INSTANCE FOR FAILURE SCENARIO

After the crashed COORDINATOR recovers, it has to
write the debit transaction instance as an update
transaction by sending a message to its participant process
or site. The write transaction issued by the crashed
COORDINATOR is as follows:

Write Trans (T ∈ TRANSACTION)
When (Any message)
∧ T ∈ Trans
∧ {M ↦ T} Transupdate
∧ (coordinator (T) ↦ M) deliver
∧ (coordinator (T) ↦ T) ∉ activeTrans
∧ do (Transeffect (T)) = {∅}
∧ T do (transcredit)
Then activeTrans := activeTrans ∪ {COORDIN (T) ↦ T}
|| Sitetranstatus (T) (COORDIN (T)) := precommit

End;

Before the crash, the COORDINATOR wrote the
transaction status to its log record. After recovery, it
pre-commits the transaction that it had left unfinished. At
site X the ticket information is not relevant to the user,
but site Y has to pre-commit its status. The debit
transaction is therefore reversed and the balance is updated
according to the above transaction scenario.
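
The recovery step can be illustrated with a small Python sketch. It rests on assumptions that go beyond the text: the log is modelled as a list of dictionaries with the hypothetical keys `txn`, `account`, `amount` and `status`, an unfinished transaction is marked `pending`, and reversing the debit is modelled as crediting the amount back at the recovering site.

```python
def recover_coordinator(log, accounts):
    """Reverse unfinished debits recorded in the coordinator's log.

    log      -- list of dicts with the hypothetical keys
                'txn', 'account', 'amount' and 'status'
    accounts -- balance per account at the recovering site
    Returns the IDs of the transactions that were pre-committed.
    """
    recovered = []
    for record in log:
        if record["status"] != "pending":
            continue                          # already resolved before the crash
        accounts[record["account"]] += record["amount"]   # undo the debit
        record["status"] = "precommit"        # status written after recovery
        recovered.append(record["txn"])
    return recovered


log = [
    {"txn": "T1", "account": "A", "amount": 30, "status": "pending"},
    {"txn": "T2", "account": "A", "amount": 10, "status": "committed"},
]
balances = {"A": 60}
print(recover_coordinator(log, balances), balances)
```

Only the transaction that was still pending at the crash is touched; the committed one is skipped, which matches the idea that the log record decides what has to be redone after recovery.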

We examined this recovery instance for both odd and even
numbers of participating processes, with T time units that
were sufficiently small. The recovery instance in terms of
time units for the n processes is shown in Table I.
A. Measurement of Time Units after Recovery
Theorem: The time taken in the ELECTION phase is (n/2) T.
(1) In the case of an even number of processes, as shown in
Figure 4 (a):
(a) Process 2 sends two ELECTION messages to its
successors, which costs T;
(b) Process 3 receives two ELECTION messages at the
same time unit, after 3T.

Therefore, for n nodes it takes

T_1 = T + T [(n-2)/2] = nT/2 (1)
The number of ELECTION messages is:
2 + (n-2) = n (2)
(2) In the case of an odd number of processes, as shown in
Figure 4 (b):

T_1 = T + [(n-1)/2] T = (n+1)T/2 (3)
The number of ELECTION messages is:
2 + (n-1) = n+1 (4)
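
As a quick numerical check, equations (1) to (4) of this subsection can be wrapped in a small Python helper. The function name `election_cost` is hypothetical, and the time is expressed in multiples of the time unit T.

```python
def election_cost(n):
    """Return (time in units of T, number of ELECTION messages) for n processes."""
    if n < 2:
        raise ValueError("the ring needs at least two processes")
    if n % 2 == 0:
        return n // 2, n              # even case: equations (1) and (2)
    return (n + 1) // 2, n + 1        # odd case: equations (3) and (4)


for n in (6, 7):
    print(n, election_cost(n))
```

For n = 6 this gives 3 time units, which agrees with the 3T stated for the even case of Figure 4 (a).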


(a) Process 6 initiates the ELECTION; process 5 hands over the primitives
to process 6, which becomes COORDINATOR

(b) Process 7 initiates the ELECTION; process 6 hands over the primitives
to process 7, which becomes COORDINATOR

Fig 6. Recovered Coordinator in odd and even cases.



TABLE I
Time units (T) with respect to runtime and the no. of
messages

Algorithm                      Number of messages    Time spent (units)
                               Odd       Even        Odd       Even
Bi-directional Ring Election   2n+2      2n          n         n

V. RESULT ANALYSIS
We examined the two conditions of the transaction: at the
time of the new COORDINATOR election and at the time of
the recovered COORDINATOR's election. For these two cases
the MATLAB outcomes are plotted with respect to the source
outcome. A simulated instance is shown in Fig 7 (a) and (b).
The time unit of the election at site X is 0.001, and at the
time of the recovered election it is 0.0001. After recovery,
the transaction at site Y updated the account on which it
had been operating at the time of the crash.

(a) A process elects a new COORDINATOR when it finds that the present
COORDINATOR has crashed (in the case of even and odd rings)

(b) The recovered COORDINATOR becomes the new COORDINATOR by
taking the functionalities back from the current COORDINATOR.

Fig 7. Simulation occurrence for the newly elected COORDINATOR and the
recovered one in the case of even and odd turns.
CONCLUSION
The above analysis is based on ideal instances. In a real
scenario, the individual performance of sites and the
network conditions may differ; the favourable situation is
that the two successor nodes learn the new COORDINATOR ID
first. This is an advantage; our current methodology aims
to reduce the election time and message frequency and so
further improve the system environment. Implementing the
transactional part of the above methodology is subject to
some hard bounds, even when the environment is free from
noise. We will further improve the update scenario of the
transaction with an optimum time factor that gives
optimality in both crash and recovery of distributed
transactions at the different sites of a distributed
database.

REFERENCES

[1] Stefano Ceri and Giuseppe Pelagatti, Distributed Databases: Principles
and Systems. India: Tata McGraw-Hill, 2008, pp. 176-178.

[2] Andrew S Tanenbaum, Distributed Systems: Principles and
Paradigms, India: Pearson Prentice Hall, 2005, pp. 287-289.

[3] N. Wu, Y.Z.Ma, “Optimal Algorithm Based on Bully Algorithm ,”
Computer Engineering, vol.34, no.19, pp.118–120, October.2008.

[4] Pradeep K Sinha, Distributed Operating Systems: Concepts and
Design, India: IEEE Press (Prentice Hall), 2005, pp. 331-333.

[5] Sharifiloo, "A leader election algorithm for clustered groups," ICIIS,
pp. 1-4, August 2007.

[6] G. Zhang, J. Chen, Y. Zhang, “Research of Synchronous
Leader Election Algorithm on Hierarchy Ad Hoc Network,” Computer
Simulation, vol. 27, no. 3, pp. 123–127, March 2010.

[7] Nakano, Koji, and Olariu, Stephan, "Uniform leader election
protocols for Radio networks", IEEE Trans. On Parallel and
Distributed Systems, Vol. 13, No. 5, pp. 516-526, May 2002.

[8] Park, Sung-Hoon, Kim, Yoon, and Hwang, Jeoung sun, "An efficient
algorithm for leader election synchronous distributed systems", IEEE,
pp. 1091-1094, 1999.

[9] Pen, Yi and Singh, Gurdip, "A Fault-Tolerant protocol for election in
chordal-ring networks with fail-stop processor failures", IEEE Trans.
on Reliability, Vol. 46, No. 1, pp. 11-17, March 1997.

[10] Francis, P. and Saxena, S., "Optimal distributed leader election
algorithms for synchronous complete network", IEEE, pp. 86-88,
1998.

[11] Stoller, Scott D., "Leader election in asynchronous distributed
system", IEEE Trans. on Parallel and Distributed Systems, Vol. 49,
No. 3, pp. 283-284, March 2000.



