
Mutual Exclusion for DS

Distributed Mutual Exclusion Algorithms
Two Approaches:

(1) Token-based approach (2) Permission-based approach
Permission-Based Approach: (1) Centralized Algorithm (2) Fully Distributed Algorithm

Centralized Algorithms

Centralized Algorithm:
1. One process is elected as the coordinator.
2. Whenever a process wants to enter a Critical Section (CS), it sends a REQUEST message to the coordinator asking for permission.
3. If no other process is currently in the CS, the coordinator sends back a REPLY granting permission.
4. When the process finishes execution of the CS, it sends a RELEASE message to the coordinator.
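As a sketch, the coordinator's handling of the three message types can be modeled in a few lines of Python. This is a minimal illustration, not a full implementation: direct method calls stand in for REQUEST/REPLY/RELEASE messages, and the `Coordinator` class and its method names are invented for the example.

```python
from collections import deque

class Coordinator:
    """Sketch of the central coordinator (names are illustrative)."""
    def __init__(self):
        self.holder = None        # process currently in the CS, if any
        self.waiting = deque()    # FIFO queue of deferred requests

    def request(self, pid):
        """Handle a REQUEST; return True if a REPLY is granted now."""
        if self.holder is None:
            self.holder = pid
            return True           # REPLY: enter the CS
        self.waiting.append(pid)  # defer until a RELEASE arrives
        return False

    def release(self, pid):
        """Handle a RELEASE; grant the CS to the next waiting process."""
        assert pid == self.holder
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder        # process now granted, or None
```

Queuing deferred requests in FIFO order is what makes the scheme fair and starvation-free.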

Centralized Algorithms
Advantages:
1. GUARANTEES mutual exclusion and it is also FAIR.
2. NO STARVATION.
3. Only 3 messages per entry to the CS are required: REQUEST, GRANT, and RELEASE.

Disadvantages:
1. If the coordinator crashes, the entire system may go down.
2. There is no distinction between a dead coordinator and "permission denied", since in both cases no message comes back to the requesting process.
3. The single coordinator can become a performance bottleneck.

Fully Distributed Algorithm (FDA)
>> When process Pi wants to enter the CS, it generates a new TS (timestamp) and sends the message "REQUEST (Pi, TS)" to all processes in the system.
>> On receiving a REQUEST message, a process Pj may reply immediately by sending a REPLY message back to Pi, or it may defer sending a REPLY back, queuing the incoming message. (See below for this decision.)
>> A process that has received a REPLY message from all other processes in the system can enter the CS.
>> After exiting the CS, the process sends REPLY messages to all its deferred requests.

Decision
DECISION on whether Pj replies immediately to a REQUEST (Pi, TS) or defers its REPLY:
>> If Pj is in the CS, then it defers its REPLY to Pi.
>> If Pj does not want to enter the CS, then it sends a REPLY immediately to Pi.
>> If Pj wants to enter the CS but has not yet entered it, then it compares its own REQUEST TS (timestamp) with the TS of the incoming message made by Pi. If Pj's TS > Pi's TS (meaning that Pj is younger), then it sends a REPLY immediately. Otherwise, it defers the REPLY.
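The decision rule above can be condensed into a small function. This is an illustrative sketch: the state names and the `pj_decision` helper are invented for the example, and the slide does not say how equal timestamps are handled (real implementations typically break ties by process id).

```python
def pj_decision(pj_state, pj_ts, pi_ts):
    """Pj's decision on receiving REQUEST(Pi, TS).
    pj_state: 'in_cs', 'idle', or 'wanting' (invented labels);
    pj_ts: TS of Pj's own pending request (used only when 'wanting').
    Returns 'reply' or 'defer'."""
    if pj_state == 'in_cs':
        return 'defer'
    if pj_state == 'idle':
        return 'reply'
    # Both want the CS: the request with the smaller TS (older) wins.
    # (Equal TS is deferred here; real systems break ties by process id.)
    return 'reply' if pj_ts > pi_ts else 'defer'
```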

Fully Distributed Algorithm (FDA)
(1) Mutual exclusion is obtained, since entry to the CS is scheduled based on the TS ordering.
(2) Freedom from deadlock is ensured.
(3) Freedom from starvation is ensured.
(4) The number of messages per entry to the CS is 2(n-1), where n is the number of processes in the system.

Disadvantages with FDA
(1) The processes need to know the identity of all other processes in the system. When a new process joins the group of processes participating in the mutual exclusion algorithm:
-- The name of the new process must be distributed to all the other processes in the group.
-- The new process must receive the names of all other processes in the group.
(2) If one process fails, then the entire scheme collapses.
Resolution: continuously monitor the state of all processes in the group. If one process fails, then all other processes are notified, so that they will no longer send REQUEST messages to the failed process. When a process recovers, it must initiate the procedure that allows it to rejoin the group.

Token-Based Algorithm for a Ring Structure
>> We assume that the processes in the system are "logically" organized in a ring structure. The physical communication network need not be a ring. As long as the processes are connected to one another, it is possible to implement a logical ring.
>> When a process receives the token, it may enter the CS, keeping the token.
>> After the process exits the CS, it passes the token to its neighbor.
>> If the process receiving the token does not want to enter the CS, the token is passed around again.
>> If the ring is unidirectional, freedom from starvation is ensured.
>> The number of messages per entry to the CS varies from 1 to infinity.
>> Two types of failure: (1) If the token is lost, an election must be called to generate a new token. (2) If a process fails, a new logical ring must be established.
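The basic circulation scheme can be sketched as a short simulation. This is an illustrative model only: `token_ring` and its parameters are invented names, the logical ring is simply the sequence 0..n-1, and each process is assumed to want the CS at most once per lap of the token.

```python
def token_ring(n, wants_cs, laps=1):
    """Simulate the ring scheme: the token visits processes 0..n-1 in
    order; a process enters the CS only while holding the token, then
    passes it to its neighbor. Returns the order of CS entries."""
    entries = []
    for _ in range(laps):
        for holder in range(n):          # token arrives at `holder`
            if holder in wants_cs:
                entries.append(holder)   # enter CS, exit, pass token on
    return entries
```

Because the token moves in one direction only, every waiting process is reached within one lap, which is why starvation cannot occur.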

Abstract Ring
[Figure: (a) System graph; (b) Abstract ring with the circulating token]

Token-based algorithm for a ring topology
1. A process Pi wishing to enter the CS sends a request message ("request", Pi) along its out-edge and blocks itself.
2. When a process not possessing the token receives a request message, it forwards the message along its out-edge.
3. When a process possessing the token receives the message ("request", Pi), it performs:
a. If it is in the CS, it enters Pi in the request queue associated with the token.
b. If it is not in the CS, it forms the message ("token", Pi, request queue) and sends it along its out-edge.

Token-based algorithm (conti.)
4. When a process Pj receives the message ("token", Pi, request queue), it checks if Pj = Pi.
- If so, it creates a local data structure to store the token and copies the request queue from the message. It now becomes active and enters the CS.
- If Pj is not equal to Pi, it forwards the message along its out-edge.
5. When a process completes execution of the CS, it checks if the request queue is empty. If not, it removes the first process id from the queue. Let this id be Pi. It now forms a message ("token", Pi, request queue) and sends it along its out-edge.

Raymond's Algorithm for an Abstract Inverted Tree
>> Phold designates the process in possession of the privilege token.
>> The algorithm maintains three invariants:
(1) Process Phold is the root of the tree.
(2) Each process in the system belongs to the tree.
(3) Each process Pi other than Phold has exactly one outgoing edge (Pi, Pj), where Pj is the parent of Pi in the tree.
>> A path from Pi to Phold exists in the system for every Pi other than Phold.
>> The algorithm requires O(log n) messages per entry to the CS, where n is the number of processes in the system.

Abstract Inverted Tree
[Figure: (a) System graph; (b) Abstract inverted tree]

Raymond's Algorithm for Mutual Exclusion
(1) A process Pi wishing to enter a CS enters its own id in the local queue. Also, it sends a REQUEST message containing its own id on its outgoing edge.
(2) A process Pr which receives a REQUEST from another process performs the following actions:
> Put the id of the requester in its local queue.
> If Pr is not equal to Phold, send a REQUEST containing its own id on its outgoing edge.
(3) On completing the execution of a CS, Phold performs the following:
a. Remove the process at the head of the local queue. Let it be Pi.
b. Send the token to Pi.
c. Reverse the tree edge (Pi, Phold).
d. If the local queue is not empty, send a REQUEST containing its own id to Pi.
(4) A process receiving the token performs the following:
> If its own request is at the top of the local queue, remove it from the queue and enter the CS.
> Else, perform the four actions in step (3).
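The four steps can be sketched as a small simulation in which direct function calls stand in for messages. This is an illustrative reconstruction under two assumptions the slide leaves implicit: a node forwards a REQUEST upward only when its local queue was previously empty (a detail of Raymond's original algorithm), and the token holder hands out the token when `transfer` is called. All names here are invented.

```python
class Node:
    """One process in the abstract inverted tree."""
    def __init__(self, parent):
        self.parent = parent   # None iff this node currently holds the token
        self.queue = []        # local FIFO queue of requester ids

def issue(nodes, pid):
    """Steps (1)-(2): Pi enqueues its own id; each node on the path up
    enqueues the sender's id and forwards a REQUEST carrying its OWN id,
    but only if its queue was previously empty (assumed refinement)."""
    req, cur = pid, pid
    while True:
        node = nodes[cur]
        was_empty = not node.queue
        node.queue.append(req)
        if node.parent is None or not was_empty:
            return
        req, cur = cur, node.parent

def transfer(nodes, holder):
    """Steps (3)-(4): the holder pops the head of its queue, sends the
    token there, reverses the tree edge, and re-requests if its queue is
    still non-empty; the receiver repeats until some process finds its
    own request at the head and enters the CS. Returns that pid."""
    node, pid = nodes[holder], holder
    while True:
        nxt = node.queue.pop(0)
        if nxt == pid:
            return pid                               # own request at head: enter CS
        node.parent, nodes[nxt].parent = nxt, None   # token moves, edge reverses
        if node.queue:
            nodes[nxt].queue.append(pid)             # step (3)d: re-request
        node, pid = nodes[nxt], nxt
```

On a tree like the example that follows (P5 holding the token, P3 under P5, and P1 and P4 under P3), issuing requests for P4 and then P1 and transferring the token twice lets P4 and then P1 enter the CS.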

An Example of Raymond's Algorithm
>> The following figure illustrates the operation of Raymond's algorithm when processes P4 and P1 make requests to enter the CS.
>> Figure (c) shows the situation after the requests made by P4 and P1 have reached P5, the holder of the token (see Steps (1) and (2) of Raymond's algorithm).
>> When process P5 releases the CS, it (see Step (3)):
(3)a. removes P3 from its local queue,
(3)b. passes the token to P3,
(3)c. reverses the edge (P3, P5),
(3)d. sends a request to P3, since its local queue is not empty.
The result is shown in Figure (d).
>> P3 performs similar actions (see Step (4)), which result in sending the token to process P4, reversal of the edge (P4, P3), and sending of a request by P3 to P4. The result is shown in Figure (e).
>> P4 now enters the CS. After P4 completes the CS, the token is transferred to process P1 via P3 and P5 in an analogous manner, which enables P1 to enter the CS. Note that this would not have been possible if Step (3)d had not been executed.

Illustrating Example
[Figure: panels (c) and (d) of the Raymond's algorithm example]

Illustrating Example
[Figure: panels (d) and (e) of the Raymond's algorithm example]

ELECTION Algorithm
>> Election algorithms determine where a new copy of the coordinator should be restarted when the coordinator fails.
>> Election algorithms assume that a unique priority number is associated with each "active" process in the system. Further, assume the priority number for Pi to be i.
>> The coordinator is always the process with the highest priority number.

The Bully Algorithm
(a) Pi sends an ELECTION message to each process with a higher priority number.
(b) If no response is received within time T, Pi assumes that all processes with numbers greater than i have failed and elects itself the NEW coordinator. (Pi informs all active processes with priority numbers less than i that Pi is the new coordinator.)
(c) If an answer is received, Pi has done its election; a higher-priority process will take over.
NOTE: If Pi is not the coordinator, then at any time during execution it may receive one of the following TWO messages:
(1) Pj is the new coordinator (j > i).
(2) Pj has started an ELECTION (j < i). Pi sends a RESPONSE to Pj and begins its own ELECTION, provided that Pi has not already initiated such an election.
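Assuming no process crashes during the round itself, the outcome of a bully election can be summarized by a small function. `bully_election` and its parameters are invented for this sketch, which models only who wins, not the individual ELECTION/RESPONSE messages.

```python
def bully_election(initiator, alive, n):
    """Outcome of one bully-election round for processes 1..n with
    priority(Pi) = i. `alive` is the set of currently active ids; no
    crashes during the round are assumed. Returns the new coordinator."""
    higher = [p for p in range(initiator + 1, n + 1) if p in alive]
    if not higher:
        return initiator   # no response within T: the initiator elects itself
    # Some higher process responded; the highest alive process eventually
    # wins its own election and announces itself as coordinator.
    return max(higher)
```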

An Example
a. Assume the system consists of processes P1 through P4, and initially all processes are active. P4 is the coordinator.
b. P1 and P4 fail. P2 determines that P4 has failed by sending a REQUEST that is not answered within time T. P2 then begins its ELECTION by sending a REQUEST to P3 (this is the only "active" process with a priority number greater than P2's).
c. P3 receives the REQUEST, responds to P2, and begins its own ELECTION by sending an ELECTION message to P4 (P3 does not know P4 has failed).
d. P2 receives P3's RESPONSE and begins waiting to receive a message informing it that a process with a higher priority number has been elected. (What if no message is received within a time interval T'? This means P3 failed after it sent a RESPONSE message to P2. P2 should RESTART the ELECTION algorithm.)

(conti.)
e. P4 does not respond within an interval T, so P3 elects itself the new coordinator and informs P1 and P2. (P1 does not receive the message, since it has failed.)
f. Later, when P1 recovers, it sends an ELECTION message to P2, P3, and P4.
g. P2 and P3 respond to P1 and begin their own ELECTIONs. P3 will again be elected, through the same events as before.
h. Finally, P4 recovers and notifies P1, P2, and P3 that it is the current coordinator. (Note: P4 sends no ELECTION requests, since it is the process with the highest number in the system.)

Another Example
(a) Process 4 holds an election.
(b) Processes 5 and 6 respond, telling 4 to stop.
(c) Now 5 and 6 each hold an election.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, © 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5

Another Example (conti.)
(d) Process 6 tells 5 to stop.
(e) Process 6 wins and tells everyone.

The Ring Algorithm
The RING Algorithm for a system where processes are organized as a ring:
>> The ring can be logical or physical, so each process knows who its successor is.
>> When a process notices that the coordinator is not functioning, it builds an ELECTION message containing a new ACTIVE LIST with its number being the only number, and sends the message to its successor. If the successor is down, the sender skips to the next member, or the one after that, until a running process is located.
>> At each step along the way, the sender adds its own process number to the ACTIVE LIST in the message, effectively making itself a candidate to be elected as coordinator.
>> Eventually, the message gets back to the process that started it all. That process recognizes this event when it receives an incoming message containing its own process number in the ACTIVE LIST.
>> At this point, the message type is changed to COORDINATOR and circulated once again, this time to inform everyone else who the coordinator is and who the members of the new ring are. When this message has circulated once, it is removed and everyone goes back to work.
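A sketch of one circulation of the ELECTION message, assuming ids 0..n-1 with successor (i+1) mod n; crashed processes are simply skipped, and the coordinator is taken to be the highest id on the ACTIVE LIST. The function name and parameters are invented for illustration.

```python
def ring_election(start, alive, n):
    """One lap of the ELECTION message around a ring of processes
    0..n-1. `alive` is the set of running process ids (start must be
    in it). Returns (new coordinator, final ACTIVE LIST)."""
    active_list = [start]            # initiator's number is the only entry
    i = (start + 1) % n
    while i != start:
        if i in alive:
            active_list.append(i)    # each running hop adds its own number
        i = (i + 1) % n              # crashed successors are skipped
    return max(active_list), active_list
```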

Election Algorithm Using a Ring
[Figure: a ring of processes 0-7. The previous coordinator, process 7, has crashed and gives no response; ELECTION messages such as [2], [5], and [5,6] circulate, each hop appending its own number to the ACTIVE LIST.]

Explanation for the above Figure
What happens if two processes, say 2 and 5, discover simultaneously that the previous coordinator, process 7, has crashed?
Answer: Each of these builds an ELECTION message and each of them starts circulating its message, independent of the other one. Eventually, both messages will go all the way around, and both 2 and 5 will convert them into COORDINATOR messages. When both have gone around again, with exactly the same members and in the same order, both will be removed. It does no harm to have extra messages circulating; at worst it consumes a little bandwidth, so this is not considered wasteful.

Elections in Wireless Environments
>> The Bully and Ring election algorithms are based on the assumptions that message passing is reliable and that the topology of the network does not change, which are not realistic in wireless environments.
>> Vasudevan (2004) proposed a solution for a wireless ad hoc network that elects a best leader rather than just a random one.
>> Any node in the network, called the source, can initiate an election by sending an ELECTION message to its immediate neighbors (i.e., the nodes in its range).
>> When a node receives an ELECTION for the first time, it designates the sender as its parent, and subsequently sends out an ELECTION message to all its immediate neighbors, except for the parent.

(conti.)
>> When a node receives an ELECTION message from a node other than its parent, it merely acknowledges the receipt.
>> When node R has designated node Q as its parent, it forwards the ELECTION message to its immediate neighbors (excluding Q) and waits for acknowledgements to come in before acknowledging the ELECTION message from Q. In doing so, it will also report information such as its battery lifetime and other resource capacity.
>> This waiting has an important consequence:
1st: Note that neighbors that have already selected a parent will immediately respond to R.
2nd: If all neighbors already have a parent, R is a leaf node and will be able to report back to Q quickly.

(conti.)
>> This information will later allow Q to compare R's capacities to those of other downstream nodes, and select the best eligible node for leadership.
>> Of course, Q had sent an ELECTION message only because its own parent P had done so as well. In turn, when Q eventually acknowledges the ELECTION message previously sent by P, it will pass the most eligible node to P as well.
>> In this way, the source will eventually get to know which node is best to be selected as leader, after which it will broadcast this information to all other nodes.
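The reporting phase amounts to a max-aggregation over the spanning tree: each node reports the most eligible node in its subtree to its parent, and the source ends up with the overall best. A minimal sketch, assuming the tree has already been built and a single numeric `capacity` (e.g., remaining battery) is the eligibility metric; all names here are invented.

```python
def elect_best(tree, capacity, node):
    """Return (capacity, id) of the most eligible node in the subtree
    rooted at `node`. `tree` maps a node to its children; each child's
    report is folded into the parent's, mirroring the acknowledgement
    flow described above."""
    best = (capacity[node], node)
    for child in tree.get(node, []):
        best = max(best, elect_best(tree, capacity, child))
    return best
```

Calling `elect_best` on the source yields the leader that the source would then broadcast to all other nodes.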

Elections in Wireless Environments (1)
Election algorithm in a wireless network, with node a as the source. (a) Initial network. (b)-(e) The build-tree phase.

Elections in Wireless Environments (2)
Election algorithm in a wireless network (continued): the build-tree phase.

Elections in Wireless Environments (3)
Figure 6-22. (e) The build-tree phase. (f) Reporting of the best node to the source.

Deadlock Handling for DS
• Deadlock-prevention algorithms presented for centralized systems can also be used in a distributed system.
• For example, we can use the resource-ordering deadlock-prevention technique by simply defining a global ordering among the system resources, and allowing each process to request a resource (at any site) with unique number i only if it is not holding a resource with a unique number greater than i.
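The resource-ordering rule reduces to a one-line check performed before each request; `may_request` is an invented helper for illustration.

```python
def may_request(held, i):
    """Global resource-ordering rule: a process holding the resources
    numbered in `held` may request resource number i only if it holds
    no resource with a number greater than i."""
    return not any(h > i for h in held)
```

Because every process acquires resources in increasing global order, no cycle of waiting processes can form.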

A deadlock-prevention scheme based on timestamp ordering with resource preemption
• Each process in the system is assigned a unique timestamp when it is created.
• The wait-die scheme: based on a nonpreemptive technique. When process Pi requests a resource currently held by Pj, Pi is allowed to wait only if it has a smaller timestamp than Pj (i.e., Pi is older than Pj). Otherwise, Pi is rolled back (dies).

(conti.)
• The wound-wait scheme: based on a preemptive technique. When process Pi requests a resource currently held by Pj, Pi is allowed to wait only if it has a larger timestamp than Pj (i.e., Pi is younger than Pj). Otherwise, Pj is rolled back (Pj is wounded by Pi).

Example
• Suppose that processes P1, P2, and P3 have timestamps 5, 10, and 15, respectively.
• With the wait-die scheme, if P1 requests a resource held by P2, P1 will wait. If P3 requests a resource held by P2, P3 will be rolled back.
• With the wound-wait scheme, if P1 requests a resource held by P2, then the resource will be preempted from P2, and P2 will be rolled back. If P3 requests a resource held by P2, then P3 will wait.
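The two rules can be condensed into one function; the sketch below reproduces the outcomes of the example above (timestamps 5, 10, 15). The name `on_conflict` and the string return values are invented for illustration; smaller timestamp means older.

```python
def on_conflict(scheme, ts_requester, ts_holder):
    """Resolve a conflict when the requester wants a resource the
    holder currently owns. Returns the action taken."""
    if scheme == 'wait-die':
        # Older requester waits; younger requester dies.
        return 'wait' if ts_requester < ts_holder else 'requester rolled back'
    if scheme == 'wound-wait':
        # Older requester wounds (preempts) the holder; younger waits.
        return 'holder rolled back' if ts_requester < ts_holder else 'wait'
    raise ValueError(scheme)
```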

Discussion
• Both schemes can avoid starvation provided that, when a process is rolled back, it is not assigned a new timestamp.
• Differences:
• In wait-die, an older process must wait for a younger one to release its resource; in wound-wait, an older process never waits for a younger process.
• wait-die: if Pi dies and is rolled back because it has requested a resource held by Pj, then Pi may reissue the same sequence of requests when it is restarted. If the resource is still held by Pj, then Pi will die again. Thus, Pi may roll back (die) several times before acquiring the needed resource.
• wound-wait: Pi is wounded and rolled back because Pj has requested a resource it holds. When Pi is restarted and requests the resource now being held by Pj, Pi waits. Thus, fewer rollbacks occur in the wound-wait scheme.

Deadlock Detection
• The deadlock-prevention algorithm may preempt resources even if no deadlock has occurred. To prevent unnecessary preemptions, we can use a deadlock-detection algorithm instead.
• If we assume only a single resource of each type, a cycle in the wait-for graph represents a deadlock.
• Problem in DS: how to maintain the wait-for graph?

Wait-For Graph
• Local wait-for graphs are constructed as usual.
• When a process Pi in site S1 needs a resource held by process Pj in site S2, a request message is sent by Pi to site S2. The edge Pi → Pj is then inserted in the local wait-for graph of site S2.
• If any local wait-for graph has a cycle, deadlock has occurred.
• The fact that we find no cycles in any of the local wait-for graphs does not mean, however, that there are no deadlocks.
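Checking a (local or global) wait-for graph for deadlock is plain cycle detection. A sketch, assuming the graph is given as an adjacency dict mapping each process to the processes it waits for; `has_cycle` is an invented name.

```python
def has_cycle(wait_for):
    """Return True iff the wait-for graph contains a cycle, i.e. a
    deadlock under the single-resource-per-type assumption. Uses DFS
    with a 'currently on the stack' set to spot back edges."""
    visiting, done = set(), set()

    def dfs(p):
        visiting.add(p)
        for q in wait_for.get(p, ()):
            if q in visiting:
                return True              # back edge: cycle found
            if q not in done and dfs(q):
                return True
        visiting.discard(p)
        done.add(p)
        return False

    return any(dfs(p) for p in wait_for if p not in done)
```

Run on each local graph this finds local deadlocks only; detecting a cycle that spans sites requires combining the local graphs into a global one, as the figure below illustrates.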

Example
[Figure: local wait-for graphs at sites S1 and S2, and the corresponding global wait-for graph]