You are on page 1of 14

1. What is a message passing system? Discuss the desirable features of a message passing system. Answer: A process is a program in execution.

When we say that two computers of a distributed system are communicating with each other, we mean that two processes, one running on each computer, are in communication with each other. A distributed operating system needs to provide interprocess communication (IPC) mechanisms to facilitate such communication activities. A message passing system is a subsystem of the distributed operating system which shields the details of complex network protocols from the programmer. It enables processes to communicate by exchanging messages and allows programs to be written by using simple communication primitives such as send and receive. Features of a Message Passing System Desirable features of a good message passing system are: Simplicity Efficiency Reliability Correctness Flexibility Security Portability Simplicity The message passing system should be easy to use easy to develop new applications that communicate with the existing ones able to hide the details of underlying network protocols used Efficiency Should reduce the number of message exchanges (acks,..) Avoid the costs of establishing and terminating connections between the same pair of processes for each and every message Piggyback acknowledgments with the normal messages Send acknowledgments selectively Reliability Should handle node and link failures Normally handled by acknowledgments, timeouts and retransmissions. Should handle duplicate messages that arise due to retransmissions (generally sequence numbers of the messages are used for this purpose). Correctness Atomicity: messages sent to a group of processes will be delivered to all of them or none of them. Ordered delivery: Messages are received by all receivers in an order acceptable to the application. Survivability: Guarantees messages will be delivered correctly in spite of failures.


IPC protocols should be flexible to cater to the various needs different applications (i.e. some may not require atomicity others may not require ordered delivery, etc) IPC primitives should be flexible to permit any kind of control flow between cooperating processes, including synchronous and asynchronous send and receive. Security Message passing system should be capable of providing secure end-to-end communication. Support mechanisms for authentication of the receivers of a message by a sender. Support mechanisms for authentication of the sender by its receivers Support encryption of a message before sending it over the network. Portability: There are two different aspects of portability in a message-passing system: -- The message-passing system should itself be portable. It should be possible to easily construct a new IPC facility on another system by reusing the basic design of the existing message-passing system. -- 2. The applications written by using the primitives of the IPC protocols of the message-passing system should be made portable. This requires that heterogeneity must be considered while designing a message passing system. This may require the use of external data representation format for the communication taking place between two or more processes running on computers of different architectures.

2. Discuss the implementation of RPC Mechanism in detail. Answer: The RPC Model The RPC mechanism is an extension of a normal procedure call mechanism. It enables a call to be made to a procedure that does not reside in the address space of the calling process. The called procedure may be on a remote machine or on the same machine. The caller and callee have separate address space; so called procedure has no access to the callers environment. The RPC model is used for transfer of control and data within a program in the following manner: 1. For making a procedure call, the caller places arguments to the procedure in some wellspecified location. 2. Control is then transferred to the sequence of instructions that constitutes the body of the procedure. 3. The procedure body is executed in a newly created execution environment that includes copies of the arguments given in the calling instruction. 4. After the procedures execution is over, control returns to the calling point, possibly returning a result. When a remote procedure call is made, the caller and the callee processes interact in the following manner: The caller (also known as the client process) sends a call (request) message to the callee (also known as the server process) and waits (blocks) for a reply message. The server executes the procedure and returns the result of the procedure execution to the client. After extracting the result of the procedure execution, the client resumes execution. In the above model, RPC calls are synchronous; however, an implementation may choose to have RPC calls to be asynchronous to allow parallelism. Also, for each request the server can create a thread to process the request so the server can receive other requests.

A Model of Remote Procedure Call Implementation of RPC Mechanism To achieve the goal of semantic transparency, the implementation of RPC is based on the concept of stubs. Stubs provide a perfectly normal local procedure call abstraction. It conceals from programs the interface to the underlying RPC system. On the client side and the server side, a separate stub procedure is associated with each. To hide the existence of functional details of the underlying network, an RPC communication package (called RPC runtime) is used in both the client and server sides. Thus implementation of an RPC mechanism involves the following five elements: 1. The Client 2. The Client stub 3. The RPC Runtime 4. The server stub, and 5. The server

Implementation of RPC Mechanism

The job of each of these elements is described below: 1. Client: To invoke a remote procedure, a client makes a perfectly local call that invokes the corresponding procedure in the stub 2. Client Stub: The client stub is responsible for performing the following tasks: On receipt of a call request from the client, it packs the specification of the target procedure and the arguments into a message and asks the local runtime system to send it to the server stub. On receipt of the result of procedure execution, it unpacks the result and passes it to the client. 3. RPCRuntime: The RPC runtime handles the transmission of the messages across the network between client and server machines. It is responsible for retransmissions, acknowledgements, and encryption. On the client side, it receives the call request from the client stub and sends it to the server machine. It also receives reply message (result of procedure execution) from the server machine and passes it to the client stub. On the server side, it receives the results of the procedure execution from the server stub and sends it to the client machine. It also receives the request message from the client machine and passes it to the server stub. 4. Server Stub: The functions of server stub are similar to that of the client stub. It performs the following two tasks: The server stub unpacks the call receipt messages from local RPCRuntime and makes a perfect local call to invoke the appropriate procedure in the server. The server stub packs the results of the procedure execution received from server, and asks the local RPCRuntime to send it to the client stub. 5. Server: On receiving the call request from the server stub, the server executes the appropriate procedure and returns the result to the server stub.

3. Discuss the following with respect to Distributed Shared Memory: a. Memory Coherence (Consistency) Models b. Memory Consistency models c. Implementing Sequential Consistency d. Centralized Server Algorithm Answer: (a) Memory Coherence (Consistency) Models What is a memory Consistency Model? A set of rules that the applications must obey if they want the DSM system to provide the degree of consistency guaranteed by the consistency model. Weaker the consistency model, better the concurrency. Researchers try to invent new consistency models which are weaker than the existing ones in such a way that a set of applications will function correctly under the new consistency model. Note that an application written for a DSM that implements a stronger consistency model may not work correctly under a DSM that implements a weaker consistency model. (b) Memory Consistency models

i) Strict consistency: Each read operation returns the most recently written value. This is possible to implement only in systems with the notion of global time. So, this model is impossible to implement. Hence, DSM systems based on underlying distributed systems have to use weaker consistency models. ii) Sequential consistency: Proposed by Lamport (1979). All processes in the system observe the same order of all memory access operations on the shared memory. i.e., if three operations read(r1), write(w1) and read(r2) are performed on a memory address in that order, then any of the six orderings (r1,w1, r2), (r2,w1, r1), (w1, r2, r1).... is acceptable provided all processes see the same ordering. It can be implemented by serializing all requests on a central server node. This model is weaker than the strict consistency model. This model provides one-copy/single-copy semantics because all processes sharing a memory location always see exactly the same contents stored in it. Sequential consistency is the most intuitively expected semantics for memory coherence. So, sequential consistency is acceptable for most applications. iii) Causal consistency model: Proposed by Hutto and Ahamad (1990). In this model, all write operations that are potentially causally related are seen by all processes in the same (correct) order. For example, if a process did a read operation and then performed a write operation, then the value written may have depended in some way on the value read. A write operation performed by one process P1 is not causally related to the write operation performed by another process P2 if P1 has read neither the value written by P2 or any memory variable that was directly or indirectly derived from the value written by P2 and vice versa. For implementing DSMs that support causal consistency one has to keep track of which memory operation is dependent on which other operation. This model is weaker than Sequential consistency model iv) Pipelined Random - Access Memory (PRAM) consistency model This model was proposed by Lipton and Sandberg (1988). In this model, all write operations performed by a single process are seen by all other processes in the order in which they were performed. This model can be implemented easily by sequencing the write operations performed by each node independently. This model is weaker than all the above consistency models. v) Processor Consistency Model: Proposed by Goodman (1989). In addition to PRAM consistency, for any memory location, all processes agree on the same order of all write operations to that location. vi) Weak Consistency Model: Proposed by Dubois et al. (1988). This model distinguishes between ordinary accesses and synchronization accesses. It requires that memory become consistent only on synchronization accesses. A DSM that supports weak consistency model uses a special variable, called synchronization variable. The operations on it are used to synchronize memory. For supporting weak consistency, the following should be satisfied: All accesses to synchronization variables must obey sequential consistency semantics. All previous write operations must be completed everywhere before an access to synchronization variable is allowed. All previous access to synchronization variables must be completed before access to a non synchronization variable is allowed.

vii) Release Consistency Model: In the weak consistency model, the entire shared memory is synchronized when a synchronization variable is accessed by a process i.e.

All changes made to the memory are propagated to other nodes. All changes made to the memory by other processes are propagated from other nodes to the processs node.

This is not really necessary because the first operation needs to be performed only when a process exits from critical section and the second operation needs to be performed only when the process enters critical section. So, instead of one synchronization variable, two synchronization variables, called acquire and release have been proposed. Acquire is used by a process to tell the system that it is about to enter a critical section. Release is used to tell the system that it had exited critical section. If processes use appropriate synchronization accesses properly, a release consistency DSM system will produce the same results for an application as that if the application was executed on a sequentially consistent DSM system. viii) Lazy Release consistency model: It is a variation of release consistency model. In this approach, when a process does a release access, the contents of all the modifications are not immediately sent to other nodes but they are sent only on demand. i.e. When a process does an acquire access, all modifications of other nodes are acquired by the processs node. It minimizes network traffic.

(c) Implementing Sequential Consistency Sequential consistency supports the intuitively expected semantics. So, this is the most preferred choice for designers of DSM system. The replication and migration strategies for DSM design include: i) Non-replicated, non-migrating blocks (NRNMBs) ii) Non-replicated, migrating blocks (NRMBs) iii) Replicated, migrating blocks (RMBs) iv) Replicated, non-migrating blocks (RNMBs) i) Implementing under NRNMBs strategy: Under this strategy, only one copy of each block of the shared memory is in the system and its location is fixed. All requests for a block are sent to the owner node of the block. Upon receiving a request from a client node, the memory management unit (MMU) and the operating system of the owner node perform the access request and return the result. Sequential consistency can be trivially enforced, because the owner node needs to only process all requests on a block in the order it receives. Disadvantages: The serialization of data access creates a bottleneck. Parallelism is not possible in this strategy. Locating data in the NRNMB strategy: A mapping between blocks and nodes need to be maintained at each node. ii) Implementing under NRMBs strategy Under this strategy, only the processes executing on one node can read or write a given data item at any time, so sequential consistency is ensured. The advantages of this strategy include: No communication cost for local data access. Allows applications to take advantage of data access locality The disadvantages of this strategy include: Prone to thrashing Parallelism cannot be achieved in this method also Locating a block in the NRMB strategy:

1. Broadcasting: Under this approach: Each node maintains a owned blocks table When a block fault occurs, the fault handler broadcasts a request on the network. The node that currently owns the block responds by transferring the block. This approach does not scale well. 2. Centralized Server Algorithm: A central server maintains a block table that contains the location information for all blocks in the shared memory space When a block fault occurs, the fault handler sends a request to the central server. The central server forwards the request to the node holding block and updates its block table. Upon receiving the request, the owner transfers the block to the requesting node. Drawbacks: Central node is a bottleneck. If the central node fails, the DSM stops functioning. 3. Fixed Distributed Server Algorithm: Under this scheme: Several nodes have block managers, each block manager manages a predetermined set of blocks Each node maintains a mapping from data blocks to block managers When a block fault occurs, the fault handler sends a request to the corresponding block manager The block manager forwards the request to the corresponding node and updates its table to reflect the new owner (the node requesting the block) Upon receiving the request, the owner transfers the block to the requesting node. 4. Dynamic Distributed Server Algorithm: Under this approach there is no block manager. Each node maintains information about the probable owner of each block. When a block fault occurs, the fault handler sends a request to the probable owner of the block. Upon receiving the request, if the receiving node is the owner of the block, it updates its block table and transfers the block to the requesting node; otherwise, it forwards the request to the probable owner of the block as indicated by its block table.

(d) Centralized-Server Algorithm A central server maintains a block table containing owner-node and copy-set information for each block. When a read/write fault for a block occurs at node N, the fault handler at node N sends a read/write request to the central server. Upon receiving the request, the central-server does the following: If it is a read request: adds N to the copy-set field and sends the owner node information to node N upon receiving this information, N sends a request for the block to the owner node. upon receiving this request, the owner returns a copy of the block to N.

If it is a write request: It sends the copy-set and owner information of the block to node N and initializes copy-set to {N}

Node N sends a request for the block to the owner node and an invalidation message to all blocks in the copy-set. Upon receiving this request, the owner sends the block to node N

4. Explain the following with respect to Resource Management in Distributed Systems: a. Task assignment Approach b. Load Balancing Approach c. Load Sharing Approach Answer: A resource manager schedules the processes in a distributed system to make use of the system resources in such a manner that resource usage, response time, network congestion, and scheduling overhead are optimized. The following are different approaches for Process Scheduling: 1. Task Assignment Approach: Each process is viewed as a collection of tasks. These tasks are scheduled to suitable processors to improve performance. This is not a widely used approach because It requires characteristics of all the processes to be known in advance. This approach does not take into consideration the dynamically changing state of the system. 2. Load Balancing Approach: Processes are distributed among nodes to equalize the load among all nodes. 3. Load-Sharing Approach: No node is allowed to be idle, while processes are waiting to be served at other nodes. Requires the knowledge of the load in the entire system. (a) Task Assignment Approach In this approach, a process is considered to be composed of multiple tasks and the goal is to find an optimal assignment policy for the tasks of an individual process. The following are typical assumptions for the task assignment approach: A process is already split into pieces, called tasks The amount of computation required for each task and the speed of the processors are known Cost of processing each task at every node is known The interprocess communication between any two processes is known Resource requirements of each task Reassignment of tasks is generally not possible Some of the goals of a good task assignment algorithm are: Minimize IPC cost (this problem can be modeled using network flow model) Efficient resource utilization Quick turnaround time A high degree of parallelism

(b) Load-Balancing Approach The scheduling algorithms that use this approach are known as Load Balancing or Load-Leveling Algorithms. These algorithms are based on the intuition that for better resource utilization, it is desirable for the load in a distributed system to be balanced evenly. Thus a load balancing algorithm tries to

balance the total system load by transparently transferring the workload from heavily loaded nodes to lightly loaded nodes in an attempt to ensure good overall performance relative to some specific metric of system performance. We can have the following categories of load balancing algorithms: 1. Static: Ignore the current state of the system. e.g. If a node is heavily loaded, it picks up a task randomly and transfers it to a random node. These algorithms are simpler to implement but performance may not be good. 2. Dynamic: Use the current state information for load balancing. There is an overhead involved in collecting state information periodically; they perform better than static algorithms. 3. Deterministic: Algorithms in this class use the processor and process characteristics to allocate processes to nodes. 4. Probabilistic: Algorithms in this class use information regarding static attributes of the system such as number of nodes, processing capability, etc. 5. Centralized: System state information is collected by a single node. This node makes all scheduling decisions. 6. Distributed: Most desired approach. Each node is equally responsible for making scheduling decisions based on the local state and the state information received from other sites. 7. Cooperative: A distributed dynamic scheduling algorithm. In these algorithms, the distributed entities cooperate with each other to make scheduling decisions. Therefore they are more complex and involve larger overhead than non-cooperative ones. But the stability of a cooperative algorithm is better than that of a non-cooperative one. 8. Non-cooperative: A distributed dynamic scheduling algorithm. In these algorithms, individual entities act as autonomous entities and make scheduling decisions independently of the action of other entities.

(c) Load Sharing Approach

Several researchers believe that load balancing, with its implication of attempting to equalize workload on all the nodes of the system, is not an appropriate objective. This is because the overhead involved in gathering the state information to achieve this objective is normally very large, especially in distributed systems having a large number of nodes. In fact, for the proper utilization of resources of a distributed system, it is not required to balance the load on all the nodes. It is necessary and sufficient to prevent the nodes from being idle while some other nodes have more than two processes. This rectification is called the Dynamic Load Sharing instead of Dynamic Load Balancing.

5. Explain the following with respect to Distributed File Systems: a. The Key Challenges of Distributed Systems b. Clients Perspective: File Services c. File Access Semantics

d. Servers Perspective Implementation e. Stateful Versus Stateless Servers Answer: (a) The Key Challenges of Distributed Systems A good distributed file system should have the features described below: i) Transparency Location: a client cannot tell where a file is located Migration: a file can transparently move to another server Replication: multiple copies of a file may exist Concurrency: multiple clients access the same file ii) Flexibility In a flexible DFS it must be possible to add or replace file servers. Also, a DFS should support multiple underlying file system types (e.g., various Unix file systems, various Windows file systems, etc.) iii) Reliability In a good distributed file system, the probability of loss of stored data should be minimized as far as possible. i.e. users should not feel compelled to make backup copies of their files because of the unreliability of the system. Rather, the file system should automatically generate backup copies of critical files that can be used in the event of loss of the original ones. Stable storage is a popular technique used by several file systems for higher reliability. iv) Consistency: Employing replication and allowing concurrent access to files may introduce consistency problems. v) Security: Clients must authenticate themselves and servers must determine whether clients are authorised to perform requested operation. Furthermore communication between clients and the file server must be secured. vi) Fault tolerance: Clients should be able to continue working if a file server crashes. Likewise, data must not be lost and a restarted file server must be able to recover to a valid state. vii) Performance: In order for a DFS to offer good performance it may be necessary to distribute requests across multiple servers. Multiple servers may also be required if the amount of data stored by a file system is very large. viii) Scalability: A scalable DFS will avoid centralised components such as a centralised naming service, a centralised locking facility, and a centralised file store. A scalable DFS must be able to handle an increasing number of files and users. It must also be able to handle growth over a geographic area (e.g., clients that are widely spread over the world), as well as clients from different administrative domains. (b) Clients Perspective: File Services The File Service Interface represents files as an uninterpreted sequence of bytes that are associated with a set of attributes (owner, size, creation date, permissions, etc.) including information regarding protection (i.e., access control lists or capabilities of clients). Moreover, there is a choice between the

upload/download model and the remote access model. In the first model, files are downloaded from the server to the client. Modifications are performed directly at the client after which the file is uploaded back to the server. In the second model all operations are performed at the server itself, with clients simply sending commands to the server. There are benefits and drawbacks to both models. The first model, for example, can avoid generating traffic every time it performs operations on a file. Also, a client can potentially use a file even if it cannot access the file server. A drawback of performing operations locally and then sending an updated file back to the server is that concurrent modification of a file by different clients can cause problems. The second approach makes it possible for the file server to order all operations and therefore allow concurrent modifications to the files. A drawback is that the client can only use files if it has contact with the file server. If the file server goes down, or the network connection is broken, then the client loses access to the files. (c) File Access Semantics Ideally, the client would perceive remote files just like local ones. Unfortunately, the distributed nature of a DFS makes this goal hard to achieve. In the following discussion, we present the various file access semantics available, and discuss how appropriate they are to a DFS. The first type of access semantics that we consider are called Unix semantics and they imply the following: A read after a write returns the value just written. When two writes follow in quick succession, the second persists. In the case of a DFS, it is possible to achieve such semantics if there is only a single file server and no client-side caching is used. In practice, such a system is unrealistic because caches are needed for performance and write-through caches (which would make Unix semantics possible to combine with caching) are expensive. Furthermore deploying only a single file server is bad for scalability. Because of this it is impossible to achieve Unix semantics with distributed file systems. Alternative semantic models that are better suited for a distributed implementation include: 1. Session semantics, 2. Immutable files, and 3. Atomic transactions. 1. Session Semantics: In the case of session semantics, changes to an open file are only locally visible. Only after a file is closed, are changes propagated to the server (and other clients). This raises the issue of what happens if two clients modify the same file simultaneously. It is generally up to the server to resolve conflicts and merge the changes. Another problem with session semantics is that parent and child processes cannot share file pointers if they are running on different machines. 2. Immutable Files: Immutable files cannot be altered after they have been closed. In order to change a file, instead of overwriting the contents of the existing file a new file must be created. This file may then replace the old one as a whole. This approach to modifying files does require that directories (unlike files) be updatable. Problems with this approach include a race condition when two clients try to replace the same file as well as the question of what to do with processes that are reading a file at the same time as it is being replaced by another process. 3. Atomic Transactions: In the transaction model, a sequence of file manipulations can be executed indivisibly, which implies that two transactions can never interfere. This is the standard model for databases, but it is expensive to implement.

(d) Servers Perspective: Implementation Observations about the expected use of a file system can be used to guide the design of a DFS. For example, a study by Satyanarayanan found the following usage patterns for Unix systems at a university: Most files are small less than 10k Reading is much more common than writing Usually access is sequential; random access is rare Most files have a short lifetime File sharing is unusual Most processes use only a few files Distinct files classes with different properties exist These usage patterns (small files, sequential access, high read-write ratio) would suggest that an update/download model for a DFS would be appropriate. Note, however, that different usage patterns may be observed at different kinds of institutions. In situations where the files are large, and are updated more often it may make more sense to use a DFS that implements a remote access model. Besides the usage characteristics, implementation tradeoffs may depend on the requirements of a DFS. These include supporting a large file system, supporting many users, the need for high performance, and the need for fault tolerance. Thus, for example, a fault tolerant DFS may sacrifice some performance for better reliability guarantees, while a high performance DFS may sacrifice security and wide-area scalability in order to achieve extra performance. (e) Stateful Vs Stateless Servers The file servers that implement a distributed file service can be stateless or stateful. Stateless file servers do not store any session state. This means that every client request is treated independently, and not as part of a new or existing session. Stateful servers, on the other hand, do store session state. They may, therefore, keep track of which clients have opened which files, current read and write pointers for files, which files have been locked by which clients, etc. The main advantage of stateless servers is that they can easily recover from failure. Because there is no state that must be restored, a failed server can simply restart after a crash and immediately provide services to clients as though nothing happened. Furthermore, if clients crash the server is not stuck with abandoned opened or locked files. Another benefit is that the server implementation remains simple because it does not have to implement the state accounting associated with opening, closing, and locking of files. The main advantage of stateful servers, on the other hand, is that they can provide better performance for clients. Because clients do not have to provide full file information every time they perform an operation, the size of messages to and from the server can be significantly decreased. Likewise the server can make use of knowledge of access patterns to perform read-ahead and do other optimisations. Stateful servers can also offer clients extra services such as file locking, and remember read and write positions.

6. Describe the Clock Synchronization Algorithms and Distributed Algorithms in the context of Synchronization. Answer: Clock Synchronization Algorithms Clock synchronization algorithms may be broadly classified as Centralized and Distributed:

(a) Centralized Algorithms In centralized clock synchronization algorithms one node has a real-time receiver. This node, called the time server node whose clock time is regarded as correct and used as the reference time. The goal of these algorithms is to keep the clocks of all other nodes synchronized with the clock time of the time server node. Depending on the role of the time server node, centralized clock synchronization algorithms are again of two types Passive Time Sever and Active Time Server. 1. Passive Time Server Centralized Algorithm: In this method each node periodically sends a message to the time server. When the time server receives the message, it quickly responds with a message (time = T), where T is the current time in the clock of the time server node. Assume that when the client node sends the time = ? message, its clock time is T0, and when it receives the time = T message, its clock time is T1. Since T0 and T1 are measured using the same clock, in the absence of any other information, the best estimate of the time required for the propagation of the message time = T from the time server node to the clients node is (T1- T0)/2. Therefore, when the reply is received at the clients node, its clock is readjusted to T + (T1- T0)/2. 2. Active Time Server Centralized Algorithm: In this approach, the time server periodically broadcasts its clock time (time = T). The other nodes receive the broadcast message and use the clock time in the message for correcting their own clocks. Each node has a priori knowledge of the approximate time (TA) required for the propagation of the message time = T from the time server node to its own node, Therefore, when a broadcast message is received at a node, the nodes clock is readjusted to the time T+TA. A major drawback of this method is that it is not fault tolerant. If the broadcast message reaches too late at a node due to some communication fault, the clock of that node will be readjusted to an incorrect value. Another disadvantage of this approach is that it requires broadcast facility to be supported by the network. Another active time server algorithm that overcomes the drawbacks of the above algorithm is the Berkeley algorithm proposed by Gusella and Zatti for internal synchronization of clocks of a group of computers running the Berkeley UNIX. In this algorithm, the time server periodically sends a message (time = ?) to all the computers in the group. On receiving this message, each computer sends back its clock value to the time server. The time server has a priori knowledge of the approximate time required for the propagation of a message from each node to its own node. Based on this knowledge, it first readjusts the clock values of the reply messages, It then takes a fault-tolerant average of the clock values of all the computers (including its own). To take the fault tolerant average, the time server chooses a subset of all clock values that do not differ from one another by more than a specified amount, and the average is taken only for the clock values in this subset. This approach eliminates readings from unreliable clocks whose clock values could have a significant adverse effect if an ordinary average was taken. The calculated average is the current time to which all the clocks should be readjusted, The time server readjusts its own clock to this value, Instead of sending the calculated current time back to other computers, the time server sends the amount by which each individual computers clock requires adjustment, This can be a positive or negative value and is calculated based on the knowledge the time server has about the approximate time required for the propagation of a message from each node to its own node. Centralized clock synchronization algorithms suffer from two major drawbacks: 1. They are subject to single point failure. If the time server node fails, the clock synchronization operation cannot be performed. This makes the system unreliable. Ideally, a distributed system, should be more reliable than its individual nodes. If one goes down, the rest should continue to function correctly. 2. From a scalability point of view it is generally not acceptable to get all the time requests serviced by a single time server. In a large system, such a solution puts a heavy burden on that one process. Distributed algorithms overcome these drawbacks.

(b) Distributed Algorithms We know that externally synchronized clocks are also internally synchronized. That is, if each nodes clock is independently synchronized with real time, all the clocks of the system remain mutually synchronized. Therefore, a simple method for clock synchronization may be to equip each node of the system with a real time receiver so that each nodes clock can be independently synchronized with real time. Multiple real time clocks (one for each node) are normally used for this purpose. Theoretically, internal synchronization of clocks is not required in this approach. However, in practice, due to inherent inaccuracy of real-time clocks, different real time clocks produce different time. Therefore, internal synchronization is normally performed for better accuracy. One of the following two approaches is used for internal synchronization in this case. 1. Global Averaging Distributed Algorithms: In this approach, the clock process at each node broadcasts its local clock time in the form of a special resync message when its local time equals T0+iR for some integer I, where T0 is a fixed time in the past agreed upon by all nodes and R is a system parameter that depends on such factors as the total number of nodes in the system, the maximum allowable drift rate, and so on. i.e. a resync message is broadcast from each node at the beginning of every fixed length resynchronization interval. However, since the clocks of different nodes run slightly different rates, these broadcasts will not happen simultaneously from all nodes. After broadcasting the clock value, the clock process of a node waits for time T, where T is a parameter to be determined by the algorithm. During this waiting period, the clock process records the time, according to its own clock, when the message was received. At the end of the waiting period, the clock process estimates the skew of its clock with respect to each of the other nodes on the basis of the times at which it received resync messages. It then computes a fault-tolerant average of the next resynchronization interval. The global averaging algorithms differ mainly in the manner in which the fault-tolerant average of the estimated skews is calculated. Two commonly used algorithms are: 1. The simplest algorithm is to take the average of the estimated skews and use it as the correction for the local clock. However, to limit the impact of faulty clocks on the average value, the estimated skew with respect to each node is compared against a threshold, and skews greater than the threshold are set to zero before computing the average of the estimated skews. 2. In another algorithm, each node limits the impact of faulty clocks by first discarding the m highest and m lowest estimated skews and then calculating the average of the remaining skews, which is then used as the correction for the local clock. The value of m is usually decided based on the total number of clocks (nodes). 2.Localized Averaging Distributed Algorithms: In this approach, the nodes of a distributed system are logically arranged in some kind of pattern, such as a ring or a grid. Periodically, each node exchanges its clock time with its neighbors in the ring, grid, or other structure and then sets its clock time to the average of its own clock time to the average of its own clock time and the clock times of its neighbors.