Master of Computer Application (MCA) – Semester 5 MC0085 – Advanced Operating Systems Assignment Set – 1

1. Discuss the following with respect to Message Passing in Distributed Systems:

a. Synchronization
Ans: A major issue in communication is the synchronization imposed on the communicating processes by the communication primitives. Communication primitives have one of two types of semantics:

Blocking Semantics: A communication primitive is said to have blocking semantics if its invocation blocks the execution of its invoker (for example, in the case of send, the sender blocks until it receives an acknowledgement from the receiver).
Non-blocking Semantics: A communication primitive is said to have non-blocking semantics if its invocation does not block the execution of its invoker.

The synchronization imposed on the communicating processes depends on which of these two types of semantics is used for the send and receive primitives:

Blocking Send Primitive: After execution of the send statement, the sending process is blocked until it receives an acknowledgement from the receiver that the message has been received.
Non-Blocking Send Primitive: After execution of the send statement, the sending process is allowed to proceed with its execution as soon as the message has been copied to a buffer.
Blocking Receive Primitive: After execution of the receive statement, the receiving process is blocked until it receives a message.
Non-Blocking Receive Primitive: The receiving process proceeds with its execution after execution of the receive statement, which returns control almost immediately, just after telling the kernel where the message buffer is.

Handling non-blocking receives. There are two ways of doing this:
– Polling: a test primitive is used by the receiver to check the buffer status.
– Interrupt: when a message is filled in the buffer, a software interrupt is used to notify the receiver. However, user-level interrupts make programming difficult.

Handling blocking receives. A timeout value may be used with a blocking receive primitive to prevent a receiving process from being blocked indefinitely if the sender has failed.

Synchronous Vs Asynchronous Communication: When both the send and receive primitives of a communication between two processes use blocking semantics, the communication is said to be synchronous; if one or both of the primitives is non-blocking, the communication is said to be asynchronous. Synchronous communication is easy to implement and contributes to the reliable delivery of messages, but it limits concurrency and is prone to communication deadlocks. A sketch of these receive variants is given below.
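The sketch below illustrates the receive semantics described above, using a Python queue as a stand-in for a kernel message buffer; the function and variable names (message_buffer, blocking_receive, and so on) are illustrative, not part of any real IPC API.

```python
import queue

message_buffer = queue.Queue(maxsize=8)  # stand-in for a kernel-side buffer

def blocking_receive(timeout=None):
    """Block until a message arrives; the optional timeout guards
    against a failed sender, as discussed above."""
    try:
        return message_buffer.get(block=True, timeout=timeout)
    except queue.Empty:
        return None  # timeout expired: sender presumed failed

def nonblocking_receive():
    """Return immediately; the caller must poll again if no message yet."""
    try:
        return message_buffer.get(block=False)
    except queue.Empty:
        return None  # polling found the buffer empty

def nonblocking_send(msg):
    """Return as soon as the message is copied into the buffer."""
    message_buffer.put(msg, block=False)

nonblocking_send("request")
print(nonblocking_receive())          # "request": polling succeeded
print(blocking_receive(timeout=0.1))  # None: no sender, timeout fires
```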

b. Buffering
Ans: The transmission of a message from one process to another is done by copying the body of the message from the sender's address space to the receiver's address space. In some cases, the receiving process may not be ready to receive the message, but may want the operating system to save the message for later reception. In such cases, the operating system relies on buffer space in which transmitted messages can be stored until the receiving process executes the specific code to receive them.

The synchronous and asynchronous modes of communication correspond to the two extremes of buffering: a null buffer (no buffering) and a buffer with unbounded capacity. Two other commonly used strategies are single-message and finite-bound (multiple-message) buffers. These four types of buffering strategies are given below:
o No buffering: The message remains in the sender's address space until the receiver executes the corresponding receive.
o Single-message buffer: A buffer that can hold a single message is used at the receiver side. It is suitable for implementing synchronous communication, because in that case an application can have only one outstanding message at any given time.
o Unbounded-capacity buffer: Convenient for supporting asynchronous communication; however, a truly unbounded buffer is impossible to implement.
o Finite-bound buffer: Used in practice for supporting asynchronous communication; a sketch is given below.
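The following is a minimal sketch of a finite-bound message buffer; the class and method names are invented for illustration, and the send method simply reports failure on overflow, previewing the overflow-handling strategies discussed next.

```python
from collections import deque

class FiniteBoundBuffer:
    """A bounded, multiple-message buffer at the receiver side."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.messages = deque()

    def send(self, msg):
        """Asynchronous send: copy into the buffer and return at once."""
        if len(self.messages) >= self.capacity:
            return False  # buffer overflow: see the handling strategies below
        self.messages.append(msg)
        return True

    def receive(self):
        """Return the oldest buffered message, or None if empty."""
        return self.messages.popleft() if self.messages else None

buf = FiniteBoundBuffer(capacity=2)
print(buf.send("m1"), buf.send("m2"))  # True True
print(buf.send("m3"))                  # False: the buffer is full
print(buf.receive())                   # "m1"
```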

Buffer overflow can be handled in one of the following ways:
o Unsuccessful communication: send returns an error message to the sending process, indicating that the message could not be delivered to the receiver because the buffer is full.
o Flow-controlled communication: the sender is blocked until the receiver accepts some messages. This violates the semantics of asynchronous send and can also result in communication deadlock.

Encoding and decoding of message data: A message's data should be meaningful to the receiving process. Ideally, this implies that the structure of program objects should be preserved while they are being transmitted from the address space of the sending process to the address space of the receiving process. However, it is very difficult to achieve this goal, mainly because of two reasons:
1. It is not possible in heterogeneous systems, in which the sending and receiving processes run on computers of different architectures, because different program objects, such as long integers, short integers, and character strings, occupy different amounts of storage space on different machines.
2. Even in homogeneous systems it is difficult, because an absolute pointer value, for example a pointer to a tree or linked list, has no meaning in another address space (more on this when we talk about RPC).

So, proper encoding mechanisms should be adopted to pass such objects. One of the following two representations may be used for the encoding and decoding of message data:
1. Tagged representation: The type of each program object, as well as its value, is encoded in the message. In this method, it is a simple matter for the receiving process to check the type of each program object, because of the self-describing nature of the coded data format.
2. Untagged representation: The message contains only program objects; no information is included about the type of each object. In this method, the receiver must have prior knowledge of how to decode the received data, because the coded data format is not self-describing.

The untagged representation is used in Sun's XDR format, and the tagged representation is used in the Mach distributed operating system.
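The contrast between the two representations can be sketched with Python's struct module; the single-letter tags ('i' for integer, 's' for string) are an invented convention for illustration, not XDR's or Mach's actual wire format.

```python
import struct

# Untagged: just the values; the receiver must know the layout out of band.
untagged = struct.pack("!i2s", 7, b"hi")

# Tagged: each value is preceded by a type tag (and a length for strings),
# so the message is self-describing.
tagged = b"i" + struct.pack("!i", 7) + b"s" + struct.pack("!i", 2) + b"hi"

# Untagged decode: the format string encodes the receiver's prior knowledge.
count, label = struct.unpack("!i2s", untagged)

# Tagged decode: the receiver discovers the types as it reads.
assert tagged[0:1] == b"i"
count2 = struct.unpack("!i", tagged[1:5])[0]
assert tagged[5:6] == b"s"
strlen = struct.unpack("!i", tagged[6:10])[0]
label2 = tagged[10:10 + strlen]

print(count, label, count2, label2)  # 7 b'hi' 7 b'hi'
```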

c. Process Addressing
Ans: A message-passing system generally supports two types of addressing:
o Explicit Addressing: The process with which communication is desired is explicitly specified as a parameter in the communication primitive, e.g. send (pid, msg), receive (pid, msg).
o Implicit Addressing: A process does not explicitly name a process for communication; instead, it can specify a service instead of a process, e.g. send any (service id, msg), receive any (pid, msg).

Methods for process addressing:
o machine id@local id: UNIX uses this form of addressing (IP address, port number). Advantage: no global coordination is needed for process addressing, since the local id is generated by the node on which the process is created. Disadvantage: it does not allow process migration.
o machine id1@local id@machine id2: machine id1 identifies the node on which the process is created; local id is generated by that node; machine id2 identifies the last known location of the process. When a process migrates to another node, the link information (the machine id to which the process migrates) is left with the current machine. This information is used for forwarding messages to migrated processes. Disadvantages: the overhead involved in locating a process may be large, and if the node on which the process was executing is down, it may not be possible to locate the process.

d. Failure Handling
Ans: While a distributed system may offer potential for parallelism, it is also prone to partial failures such as a node crash or a communication link failure. During interprocess communication, such failures may lead to the following problems:
– Loss of request message: This can be due to link failure, or because the receiver node is down.
– Loss of response message: This may be due to link failure, or because the sender is down when the response reaches it.
– Unsuccessful execution of request: This may be due to the receiver node crashing while processing the request.

A solution to overcome the above problems may use one of the following methods:
1. Four-message reliable IPC: Four messages are involved: a request and an acknowledgement from the client machine, and a reply and an acknowledgement from the server machine. The client machine sends a request message to the server machine and waits for an acknowledgement. If the acknowledgement is not received within the specified timeout period, the client retransmits its request to the server and waits again; this process continues until an acknowledgement is received. The same process occurs at the server side: the server sends a reply message to the client and waits for an acknowledgement until the specified timeout period, and on non-receipt of the acknowledgement within that period, it retransmits the reply message to the client; this cycle continues until the client responds with an acknowledgement.
2. Three-message reliable IPC: The scenario here varies slightly from the above: the server does not send a separate acknowledgement of the client's request. Instead, it uses the concept of piggybacking, attaching the acknowledgement to the client with a message in the form of a reply to the client. The server then waits for an acknowledgement from the client, and on non-receipt of the acknowledgement within the specified time period, it resends the reply to the client; this continues until the client responds with an acknowledgement.
3. Two-message reliable IPC: There is no requirement for either the client or the server to receive acknowledgements from each other; they just exchange messages in the form of requests and replies or responses, assuming that their messages have been delivered (an ideal scenario, which may be impractical in real situations). The client machine just sends the request to the specified server; on non-receipt of a reply within the timeout period, the client retransmits its request.

Idempotency and handling of duplicate request messages: Idempotency basically means "repeatability". An idempotent operation produces the same result, without any side effects, no matter how many times it is performed with the same arguments. For example, assume an sqrt procedure for calculating the square root of a given number: sqrt(64) always returns 8. On the other hand, operations that do not necessarily produce the same results when executed repeatedly with the same arguments are said to be non-idempotent; a debit operation on a bank account is an example. Since not all operations are idempotent, if requests can be retransmitted, care should be taken so that executing a retransmitted (duplicate) request does not change the result; a sketch of one way to do this is given below.
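As an illustration, the sketch below (all names hypothetical) makes a non-idempotent debit operation safe against retransmitted requests by caching the reply for each request identifier and replaying it when a duplicate arrives.

```python
balance = 100
reply_cache = {}  # request_id -> reply already sent

def handle_debit(request_id, amount):
    """Execute a debit at most once per request id; duplicates get the
    cached reply instead of a second debit."""
    global balance
    if request_id in reply_cache:        # duplicate: replay, don't re-execute
        return reply_cache[request_id]
    if amount <= balance:
        balance -= amount
        reply = ("ok", balance)
    else:
        reply = ("insufficient", balance)
    reply_cache[request_id] = reply
    return reply

print(handle_debit(1, 30))  # ('ok', 70)  first execution
print(handle_debit(1, 30))  # ('ok', 70)  retransmission: same reply, no new debit
print(balance)              # 70, not 40
```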

e. Group Communication
Ans: The most elementary form of message-based interaction is one-to-one communication, in which a single sender process sends a message to a single receiver process. However, for performance and ease of programming, several highly parallel distributed applications require that a message-passing system also provide a group communication facility. A message-passing system with a group communication facility provides the flexibility to create and delete groups dynamically and to allow a process to join or leave a group at any time. Depending on single or multiple senders and receivers, the following three types of group communication are possible:
1. One-to-many (single sender and multiple receivers)
2. Many-to-one (multiple senders and single receiver)
3. Many-to-many (multiple senders and multiple receivers)

The following one-to-many multicast issues in a group communication are to be addressed:
i) Group Management: In the case of one-to-many communication, the receiver processes of a message form a group. Such groups are of two types: closed and open. A closed group is one in which only the members of the group can send a message to the group; an outside process cannot send a message to the group as a whole, although it may send a message to an individual member of the group. An open group is one in which any process in the system can send a message to the group as a whole. Whether to use a closed group or an open group is application dependent.
ii) Group Addressing: A two-level naming scheme is normally used for group addressing. The high-level group name is an ASCII string that is independent of the location information of the processes in the group. The low-level group name depends to a large extent on the underlying hardware; for example, on some networks it is possible to create a special network address to which multiple machines can listen.

Such an address is called a multicast address: a packet sent to a multicast address is delivered to all machines that have subscribed to that group. For example, on the Internet, class D IP addresses are used for multicast. The format of a class D IP address is:

|1|1|1|0| Group identification |

The first four bits contain 1110 and identify the address as a multicast address; the remaining 28 bits specify a particular multicast group.

Broadcast address: A certain address is declared as a broadcast address, and packets sent to that address are delivered to all machines in the network. If there is no facility to create multicast or broadcast addresses, the underlying unicast mechanism is used; a disadvantage is that a separate copy of each packet needs to be sent to each member.

iii) Message Delivery Approach: The following are the two possible approaches for message delivery:
– Buffered: A multicast packet is buffered until the receiver is ready to receive it. This suits the fact that multicast send is inherently asynchronous: it cannot be assumed that all processes belonging to the multicast group are ready to receive at the moment of sending.
– Unbuffered: If messages are not buffered, packets could be lost for receivers that are not ready.

Flexible Reliability in Multicast Communication: Different levels of reliability can be offered:
– 0-reliable: No response is expected from any receiver.
– 1-reliable: The sender expects a response from at least one receiver (the multicast server may take this responsibility).
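The following sketch shows class D multicasting in practice using Python's socket module; the group address 224.1.1.1 and port 5007 are arbitrary example values, and the sketch assumes sender and receiver run on the same host or LAN (multicast loopback is typically enabled by default).

```python
import socket
import struct

GROUP, PORT = "224.1.1.1", 5007  # example class D address and port

# Receiver: join the multicast group so packets sent to GROUP are delivered.
rsock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
rsock.bind(("", PORT))
membership = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
rsock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)

# Sender: a single send is delivered to every subscribed receiver.
ssock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ssock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
ssock.sendto(b"hello group", (GROUP, PORT))

data, sender = rsock.recvfrom(1024)
print(data)  # b'hello group'
```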

– m-out-of-n-reliable: The sender expects responses from m out of n receivers.
– All-reliable: The sender expects responses from all receivers, i.e. every multicast recipient sends an acknowledgement back to the sender.

Atomic Multicast: A multicast message is received by all the members of the group or by none. Different implementation methods exist:
• The kernel of the sender is responsible for retransmitting until everyone receives the message. This method works only if the sender's machine and none of the receiver processes fail.
• Each receiver of the multicast message performs an atomic multicast of the same message to the same group. This method ensures that all surviving processes will receive the message even if some receivers fail after receiving the message, or the sender's machine fails after sending it.

iv) Many-to-One Communication: In this type of communication, multiple senders send messages to a single receiver. For example, a buffer process may receive messages from several producers and consumers, or a database server may receive requests from several clients.

v) Many-to-Many Communication: In this type of communication, multiple senders send messages to multiple receivers. An important issue here is that of ordered delivery of messages. Ordered delivery ensures that all messages are delivered to all receivers in an order acceptable to the application. The following are the various message ordering semantics followed in the case of many-to-many communication:
i) Absolute Ordering: All messages are delivered to all processes in the exact order in which they were sent. This is not possible to implement in the absence of a global clock. Moreover, absolute ordering is not required by many applications.
ii) Consistent Ordering: All messages are received by all processes in the same order, though not necessarily the exact order in which they were sent. A sequencer-based sketch is given below.
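A common way to realize consistent ordering is to route every multicast through a single sequencer that assigns sequence numbers; all receivers then see the same order, even though it need not be the real-time send order. The sketch below is a hypothetical in-process illustration of this idea, not a specific protocol.

```python
class Sequencer:
    """Stamps every multicast with a global sequence number."""
    def __init__(self, receivers):
        self.next_seq = 0
        self.receivers = receivers

    def multicast(self, sender, msg):
        stamped = (self.next_seq, sender, msg)
        self.next_seq += 1
        for r in self.receivers:       # deliver in stamped order
            r.deliver(stamped)

class Receiver:
    def __init__(self):
        self.log = []

    def deliver(self, stamped):
        self.log.append(stamped)       # identical order at every receiver

r1, r2 = Receiver(), Receiver()
seq = Sequencer([r1, r2])
seq.multicast("P1", "update x")
seq.multicast("P2", "update y")
assert r1.log == r2.log  # consistent ordering: same order everywhere
print(r1.log)            # [(0, 'P1', 'update x'), (1, 'P2', 'update y')]
```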

iii) Causal Ordering: For some applications, consistent-ordering semantics is not necessary, and even weaker semantics is acceptable. An application can have better performance if the message-passing system supports a weaker ordering semantics that is acceptable to the application. One such weaker ordering semantics that is acceptable to many applications is causal ordering. This semantics ensures that if the event of sending one message is causally related to the event of sending another message, the two messages are delivered to all receivers in the correct order. Two message-sending events are said to be causally related if they are correlated by the happened-before relation, i.e. if there is any possibility of the second one being influenced in any way by the first one. The basic idea behind causal ordering semantics is that, when it matters, messages are delivered in the proper order, but when it does not matter, they may be delivered in any arbitrary order.

2. Discuss the following with respect to Distributed Shared Memory:

a. Memory Coherence (Consistency) Models
Ans: A memory consistency model is a set of rules that applications must obey if they want the DSM system to provide the degree of consistency guaranteed by the consistency model.
• Note that an application written for a DSM that implements a stronger consistency model may not work correctly under a DSM that implements a weaker consistency model.
• The weaker the consistency model, the better the concurrency. Researchers therefore try to invent new consistency models that are weaker than the existing ones, in such a way that a set of applications will still function correctly under the new consistency model.

The main memory consistency models are the following:
i) Strict consistency: Each read operation returns the most recently written value. This is possible to implement only in systems with the notion of global time, so this model is impossible to implement in a distributed system. Hence, DSM systems based on underlying distributed systems have to use weaker consistency models.
ii) Sequential consistency: Proposed by Lamport (1979). All processes in the system observe the same order of all memory access operations on the shared memory. The exact interleaving does not matter: for example, if three operations read(r1), write(w1) and read(r2) are performed on a memory address in that order, then any of the six orderings (r1, w1, r2), (r1, r2, w1), (w1, r1, r2), (w1, r2, r1), (r2, r1, w1), (r2, w1, r1) is acceptable, provided all processes see the same ordering. This model provides one-copy/single-copy semantics, because all processes sharing a memory location always see exactly the same contents stored in it. Sequential consistency is the most intuitively expected semantics for memory coherence, so it is acceptable for most applications.

It can be implemented by serializing all requests on a central server node.
iii) Causal consistency model: Proposed by Hutto and Ahamad (1990). In this model, all write operations that are potentially causally related are seen by all processes in the same (correct) order. For example, if a process did a read operation and then performed a write operation, the value written may have depended in some way on the value read. A write operation performed by one process P1 is not causally related to a write operation performed by another process P2 if P1 has read neither the value written by P2 nor any memory variable that was directly or indirectly derived from the value written by P2, and vice versa. This model is weaker than the sequential consistency model. For implementing DSMs that support causal consistency, one has to keep track of which memory operation is dependent on which other operations.
iv) Pipelined Random Access Memory (PRAM) consistency model: Proposed by Lipton and Sandberg (1988). In this model, all write operations performed by a single process are seen by all other processes in the order in which they were performed. This model is weaker than the models above, and it can be implemented easily by sequencing the write operations performed by each node independently.
v) Processor Consistency Model: Proposed by Goodman (1989). In addition to PRAM consistency, it requires that, for any memory location, all processes agree on the same order of all write operations to that location.
vi) Weak Consistency Model: Proposed by Dubois et al. (1988). This model is weaker than all the above consistency models. It distinguishes between ordinary accesses and synchronization accesses, and requires that memory become consistent only on synchronization accesses. A DSM that supports the weak consistency model uses a special variable, called a synchronization variable; the operations on it are used to synchronize memory. For supporting weak consistency, the following should be satisfied:
– All accesses to synchronization variables must obey sequential consistency semantics.

– All previous write operations must be completed everywhere before an access to a synchronization variable is allowed.
– All previous accesses to synchronization variables must be completed before an access to a non-synchronization variable is allowed.
vii) Release Consistency Model: In the weak consistency model, the entire shared memory is synchronized when a synchronization variable is accessed by a process, i.e.:
• All changes made to the memory by the process are propagated to other nodes.
• All changes made to the memory by other processes are propagated from other nodes to the process's node.
This is not really necessary, because the first operation needs to be performed only when a process exits a critical section, and the second operation needs to be performed only when a process enters a critical section. So, instead of one synchronization variable, two synchronization variables, called acquire and release, have been proposed:
– Acquire is used by a process to tell the system that it is about to enter a critical section.
– Release is used to tell the system that it has just exited a critical section.
If processes use appropriate synchronization accesses properly, a release consistency DSM system will produce the same results for an application as if the application were executed on a sequentially consistent DSM system.
viii) Lazy Release Consistency Model: This is a variation of the release consistency model. In this approach, when a process does a release access, the contents of the modifications are not immediately sent to other nodes; they are sent only on demand, i.e. when a process does an acquire access, all modifications of other nodes are acquired by the process's node. It minimizes network traffic.

b. Implementing Sequential Consistency
Ans: Sequential consistency supports the intuitively expected semantics, so it is the most preferred choice for designers of a DSM system. The replication and migration strategies for DSM design include:
i) Non-replicated, non-migrating blocks (NRNMBs)
ii) Non-replicated, migrating blocks (NRMBs)
iii) Replicated, migrating blocks (RMBs)
iv) Replicated, non-migrating blocks (RNMBs)

i) Implementing under the NRNMB strategy: Under this strategy, only one copy of each block of the shared memory exists in the system, and its location is fixed. All requests for a block are sent to the owner node of the block. Upon receiving a request from a client node, the memory management unit (MMU) and the operating system of the owner node perform the access request and return the result. Sequential consistency is trivially enforced, because the owner node needs only to process all requests on a block in the order in which it receives them. Parallelism, however, is not possible in this strategy.

ii) Implementing under the NRMB strategy: Under this strategy, there is a single copy of each block, but the block migrates to the node that accesses it, so only the processes executing on one node can read or write a given data item at any time. The advantages of this strategy include:
– No communication cost for local data access.
– It allows applications to take advantage of data access locality.
The disadvantages of this strategy include:
– It is prone to thrashing.
– Parallelism cannot be achieved in this method either.

Locating a block in the NRMB strategy:
1. Broadcasting: Each node maintains an owned-blocks table. When a block fault occurs, the fault handler broadcasts a request on the network. The node that currently owns the block responds by transferring the block. A drawback is that this approach does not scale well.
2. Centralized-Server Algorithm: A central server maintains a block table that contains the location information for all blocks in the shared memory space. When a block fault occurs, the fault handler sends a request to the central server. The central server forwards the request to the node holding the block and updates its block table. Upon receiving the forwarded request, the owner transfers the block to the requesting node. Drawbacks: the central server is a potential performance bottleneck and a single point of failure.
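The following is a minimal sketch (with invented names) of the centralized-server algorithm's bookkeeping: the block table maps each block to its current owner and is updated as blocks migrate on faults.

```python
class CentralServer:
    """Tracks the owner of every block of the shared memory space."""

    def __init__(self):
        self.block_table = {}  # block_id -> owner node id

    def register(self, block_id, owner):
        self.block_table[block_id] = owner

    def block_fault(self, block_id, requester):
        """Called by a node's fault handler. Returns the old owner, which
        must then transfer the block, and records the requester as the
        new owner (the block migrates)."""
        old_owner = self.block_table[block_id]
        self.block_table[block_id] = requester
        return old_owner

server = CentralServer()
server.register("b1", "node_A")
owner = server.block_fault("b1", "node_B")  # node_B faulted on b1
print(owner, server.block_table["b1"])      # node_A node_B
```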

3. Fixed Distributed-Server Algorithm: Under this scheme, several nodes have block managers, and each block manager manages a predetermined set of blocks. Each node maintains a mapping from data blocks to block managers. When a block fault occurs, the fault handler sends a request to the corresponding block manager. The block manager forwards the request to the corresponding node and updates its table to reflect the new owner (the node requesting the block). Upon receiving the request, the owner transfers the block to the requesting node.
4. Dynamic Distributed-Server Algorithm: Under this approach there is no block manager. Each node maintains information about the probable owner of each block. When a block fault occurs, the fault handler sends a request to the probable owner of the block. Upon receiving the request, if the receiving node is the owner of the block, it updates its block table and transfers the block to the requesting node; otherwise, it forwards the request to the probable owner of the block as indicated by its own block table.

iii) Implementing under the RMB strategy: A major disadvantage of the non-replication strategies is lack of parallelism, because only the processes on one node can access the data contained in a given block at any given time. To increase parallelism, virtually all DSM systems replicate blocks. With replicated blocks, read operations can be carried out in parallel at multiple nodes by accessing the local copy of the data, so the average cost of read operations is reduced: no communication overhead is involved if a replica of the data exists at the local node. However, replication tends to increase the cost of write operations, because for a write to a block all its replicas must be invalidated or updated to maintain consistency.

d. Centralized-Server Algorithm
Ans: A central server maintains a block table containing owner-node and copy-set information for each block. When a read/write fault for a block occurs at node N, the fault handler at node N sends a read/write request to the central server. Upon receiving the request, the central server does the following:

– If it is a read request:
• it adds N to the copy-set field, and
• sends the owner-node information to node N;
• upon receiving this information, node N sends a request for the block to the owner node;
• upon receiving this request, the owner returns a copy of the block to N.
– If it is a write request:
• it sends the copy-set and owner information of the block to node N and initializes the copy-set to {N};
• node N sends a request for the block to the owner node and an invalidation message to all nodes in the copy-set;
• upon receiving this request, the owner sends the block to node N.
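A sketch of this algorithm's server-side bookkeeping is given below; the class and node names are illustrative only. Read faults add the requester to the copy-set, while write faults hand the copy-set to the writer so it can invalidate all other replicas.

```python
class DSMServer:
    """Central server: tracks owner and copy-set for every block."""

    def __init__(self):
        # block_id -> {"owner": node, "copy_set": nodes holding replicas}
        self.blocks = {"b1": {"owner": "A", "copy_set": {"A"}}}

    def read_fault(self, block_id, n):
        entry = self.blocks[block_id]
        entry["copy_set"].add(n)         # N will hold a replica
        return entry["owner"]            # N fetches the block from the owner

    def write_fault(self, block_id, n):
        entry = self.blocks[block_id]
        to_invalidate = entry["copy_set"] - {n}
        old_owner = entry["owner"]
        entry["owner"], entry["copy_set"] = n, {n}  # N becomes sole holder
        return old_owner, to_invalidate  # N invalidates these replicas

server = DSMServer()
print(server.read_fault("b1", "B"))   # 'A': B reads a copy from A
print(server.write_fault("b1", "C"))  # ('A', {'A', 'B'}): C invalidates A, B
                                      # (set ordering in the output may vary)
```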

3. Explain the following with respect to Resource Management in Distributed Systems:

a. Task Assignment Approach
Ans: In this approach, a process is considered to be composed of multiple tasks, and the goal is to find an optimal assignment policy for the tasks of an individual process. The following are typical assumptions for the task assignment approach:
– A process is already split into pieces, called tasks.
– The amount of computation required by each task and the speed of the processors are known.
– The cost of processing each task at every node is known.
– The interprocess communication cost between every pair of tasks is known.
– The resource requirements of each task are known.
– Reassignment of tasks is generally not possible.
Some of the goals of a good task assignment algorithm are:
– Minimize IPC cost (this problem can be modeled using a network flow model).
– Efficient resource utilization.
– Quick turnaround time.
– A high degree of parallelism.

b. Load-Balancing Approach
Ans: The scheduling algorithms that use this approach are known as load-balancing or load-leveling algorithms. These algorithms are based on the intuition that, for better resource utilization, it is desirable for the load in a distributed system to be balanced evenly. Thus a load-balancing algorithm tries to balance the total system load by transparently transferring the workload from heavily loaded nodes to lightly loaded nodes, in an attempt to ensure good overall performance relative to some specific metric of system performance. We can have the following categories of load-balancing algorithms:
1. Static: Ignore the current state of the system. These algorithms are simpler to implement, but performance may not be good.
2. Dynamic: Use the current state information for load balancing. They perform better than static algorithms, but there is an overhead involved in collecting state information periodically.
3. Deterministic: Algorithms in this class use the processor and process characteristics to allocate processes to nodes.
4. Probabilistic: Algorithms in this class use information regarding static attributes of the system, such as the number of nodes, processing capability, etc.
5. Centralized: System state information is collected by a single node, which makes all scheduling decisions.
6. Distributed: The most desired approach. Each node is equally responsible for making scheduling decisions, based on its local state and the state information received from other nodes.
7. Cooperative: A distributed dynamic scheduling algorithm in which the distributed entities cooperate with each other to make scheduling decisions. Such algorithms are more complex and involve larger overhead than non-cooperative ones, but their stability is better.
8. Non-cooperative: A distributed dynamic scheduling algorithm in which individual entities act as autonomous entities and make scheduling decisions independently of the actions of other entities.

c. Load-Sharing Approach
Ans: Several researchers believe that load balancing, with its implication of attempting to equalize the workload on all the nodes of the system, is not an appropriate objective, especially in distributed systems having a large number of nodes. This is because the overhead involved in gathering the state information needed to achieve this objective is normally very large. In fact, for the proper utilization of the resources of a distributed system, it is not required to balance the load on all the nodes; it is necessary and sufficient to prevent the nodes from being idle while some other nodes have more than two processes. This rectification is called dynamic load sharing instead of dynamic load balancing.

Issues in Load-Sharing Algorithms: The design of a load-sharing algorithm requires that proper decisions be made regarding the load estimation policy, process transfer policy, state information exchange policy, location policy, priority assignment policy, and migration limiting policy. It is simpler to decide about most of these policies in the case of load sharing, because load-sharing algorithms do not attempt to balance the average workload of all the nodes of the system; rather, they only attempt to ensure that no node is idle while some node is heavily loaded. The priority assignment policies and the migration limiting policies for load-sharing algorithms are the same as those of load-balancing algorithms.

1. Load Estimation Policies: Since an attempt is made only to ensure that no node is idle while processes wait for service at some other node, the following two approaches are used for estimation:
– Use the number of processes at a node as a measure of load.
– Use the CPU utilization as a measure of load.

2. Process Transfer Policies: Load-sharing algorithms are interested in busy or idle states only, and most of them employ the all-or-nothing strategy given below:
All-or-Nothing Strategy: It uses a single-threshold policy with the threshold set to 1. A node becomes a candidate for transferring a task as soon as it has more than one task, and becomes a candidate to accept tasks from remote nodes only when it becomes idle. A disadvantage of this approach is that an idle node is not able to immediately acquire a task, thus wasting processing power; to avoid this, the threshold value can be set to 2 instead of 1.

3. Location Policies: The location policy decides the sender node or the receiver node of a process that is to be moved within the system for load sharing. Depending on the type of node that takes the initiative to globally search for a suitable node for the process, the location policies are of the following types:
– Sender-Initiated Policy: Under this policy, heavily loaded nodes search for lightly loaded nodes to which tasks may be transferred. The search can be done by sending a broadcast message or by probing randomly picked nodes. An advantage of this approach is that the sender can transfer freshly arrived tasks, so no preemptive task transfers occur. A disadvantage is that it can cause system instability under high system load.
– Receiver-Initiated Policy: Under this policy, lightly loaded nodes search for heavily loaded nodes from which tasks may be transferred.

The search for a sender can likewise be done by sending a broadcast message or by probing randomly picked nodes. An advantage of this approach is that it does not cause system instability, because under high system loads a receiver will quickly find a sender. A disadvantage is that it may result in preemptive task transfers, because the sender may not have any freshly arrived tasks.
– Symmetrically Initiated Policy: Under this approach, both senders and receivers search for receivers and senders, respectively. The Above-Average Algorithm by Krueger and Finkel (a dynamic load-balancing algorithm) tries to maintain the load at each node within an acceptable range of the system average. Its transfer policy is a threshold policy that uses two adaptive thresholds, the upper threshold and the lower threshold: a node with load lower than the lower threshold is considered a receiver, a node with load higher than the upper threshold is considered a sender, and a node's estimated average load is supposed to lie in the middle of the lower and upper thresholds.

4. State Information Exchange Policies: Since it is not necessary to equalize the load at all nodes under load sharing, state information is exchanged only when the state changes:
– Broadcast When State Changes: A node broadcasts a state information request message when it becomes under-loaded or overloaded. In the sender-initiated approach, a node broadcasts this message only when it is overloaded; in the receiver-initiated approach, a node broadcasts this message only when it is under-loaded.
– Poll When State Changes: When a node's state changes, it randomly polls other nodes one by one and exchanges state information with the polled nodes. Polling stops when a suitable node is found or a threshold number of nodes have been polled. Under the sender-initiated policy, the sender polls to find a suitable receiver; under the receiver-initiated policy, the receiver polls to find a suitable sender (see the sketch below).
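The sketch below (all names hypothetical) combines a single-threshold transfer policy with sender-initiated random polling, as described above.

```python
import random

THRESHOLD = 1    # all-or-nothing: more than one task makes a node a sender
POLL_LIMIT = 3   # stop after probing this many nodes

node_loads = {"A": 4, "B": 0, "C": 2, "D": 1}  # task count per node

def find_receiver(sender):
    """Sender polls randomly picked nodes, looking for an idle receiver."""
    candidates = [n for n in node_loads if n != sender]
    for node in random.sample(candidates, min(POLL_LIMIT, len(candidates))):
        if node_loads[node] == 0:   # idle node: willing to accept a task
            return node
    return None  # polling limit reached; process the task locally

def maybe_transfer(sender):
    """Transfer one freshly arrived task if the sender is above threshold."""
    if node_loads[sender] > THRESHOLD:
        receiver = find_receiver(sender)
        if receiver is not None:
            node_loads[sender] -= 1
            node_loads[receiver] += 1

maybe_transfer("A")
print(node_loads)  # e.g. {'A': 3, 'B': 1, 'C': 2, 'D': 1} if B was polled
```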

4. Explain the following with respect to Distributed File Systems:

a. The Key Challenges of Distributed File Systems
Ans: A good distributed file system should have the features described below:
i) Transparency:
– Location: a client cannot tell where a file is located.
– Migration: a file can transparently move to another server.
– Replication: multiple copies of a file may exist.
– Concurrency: multiple clients can access the same file.
ii) Flexibility: In a flexible DFS it must be possible to add or replace file servers. Also, a DFS should support multiple underlying file system types (e.g., various Unix file systems, various Windows file systems, etc.).
iii) Reliability: In a good distributed file system, the probability of loss of stored data should be minimized as far as possible, i.e. users should not feel compelled to make backup copies of their files because of the unreliability of the system. Rather, the file system should automatically generate backup copies of critical files that can be used in the event of loss of the originals. Stable storage is a popular technique used by several file systems for higher reliability.
iv) Consistency: Employing replication and allowing concurrent access to files may introduce consistency problems.
v) Security: Clients must authenticate themselves, and servers must determine whether clients are authorised to perform the requested operations. Furthermore, communication between clients and the file server must be secured.
vi) Fault tolerance: Clients should be able to continue working if a file server crashes. Likewise, data must not be lost, and a restarted file server must be able to recover to a valid state.
vii) Performance: In order for a DFS to offer good performance, it may be necessary to distribute requests across multiple servers. Multiple servers may also be required if the amount of data stored by a file system is very large.
viii) Scalability: A scalable DFS will avoid centralised components such as a centralised naming service, a centralised locking facility, and a centralised file store. It must be able to handle an increasing number of files and users, as well as growth over a geographic area (e.g., clients that are widely spread over the world) and clients from different administrative domains.

b. Client's Perspective: File Services
Ans: The file service interface represents files as an uninterpreted sequence of bytes that are associated with a set of attributes (owner, size, creation date, permissions, etc.), including information regarding protection (i.e., access control lists or capabilities of clients). Moreover, there is a choice between the upload/download model and the remote access model.

In the upload/download model, files are downloaded from the server to the client. Modifications are performed directly at the client, after which the file is uploaded back to the server. In the remote access model, all operations are performed at the server itself, with clients simply sending commands to the server.

There are benefits and drawbacks to both models. The upload/download model, for example, avoids generating traffic every time the client performs operations on a file. Also, a client can potentially use a file even if it cannot access the file server. A drawback of performing operations locally and then sending an updated file back to the server is that concurrent modification of a file by different clients can cause problems. The remote access model makes it possible for the file server to order all operations, and therefore allows concurrent modifications to the files. A drawback is that the client can only use files if it has contact with the file server: if the file server goes down, or the network connection is broken, the client loses access to the files.

c. File Access Semantics
Ans: Ideally, the client would perceive remote files just like local ones. Unfortunately, the distributed nature of a DFS makes this goal hard to achieve. In the following discussion, we present the various file access semantics available and discuss how appropriate they are to a DFS.

The first type of access semantics that we consider is called Unix semantics, and it implies the following:
– A read after a write returns the value just written.
– When two writes follow in quick succession, the second persists.

In the case of a DFS, it is possible to achieve such semantics only if there is a single file server and no client-side caching is used. In practice, such a system is unrealistic, because caches are needed for performance, and write-through caches (which would make Unix semantics possible to combine with caching) are expensive. Furthermore, deploying only a single file server is bad for scalability. Because of this, it is impossible to achieve Unix semantics with distributed file systems.

Alternative semantic models that are better suited for a distributed implementation include session semantics, immutable files, and atomic transactions:

1. Session Semantics: In the case of session semantics, changes to an open file are only locally visible. Only after a file is closed are changes propagated to the server (and other clients). This raises the issue of what happens if two clients modify the same file simultaneously; it is generally up to the server to resolve conflicts and merge the changes. Another problem with session semantics is that parent and child processes cannot share file pointers if they are running on different machines.
2. Immutable Files: Immutable files cannot be altered after they have been closed. In order to change a file, instead of overwriting the contents of the existing file, a new file must be created; this file may then replace the old one as a whole. This approach to modifying files does require that directories (unlike files) be updatable. Problems with this approach include a race condition when two clients try to replace the same file, as well as the question of what to do with processes that are reading a file at the same time as it is being replaced by another process.
3. Atomic Transactions: In the transaction model, a sequence of file manipulations can be executed indivisibly, which implies that two transactions can never interfere. This is the standard model for databases, but it is expensive to implement.
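A minimal sketch of session semantics is given below; the classes are illustrative, not a real DFS API. Writes go to a session-private copy and become visible to the server, and hence to other clients, only at close time.

```python
class Server:
    """Holds the authoritative file contents."""
    def __init__(self):
        self.files = {"f": "v0"}

class Session:
    """An open file under session semantics."""
    def __init__(self, server, name):
        self.server, self.name = server, name
        self.local = server.files[name]   # private copy for this session

    def write(self, data):
        self.local = data                  # only locally visible

    def close(self):
        self.server.files[self.name] = self.local  # propagate on close

srv = Server()
s1 = Session(srv, "f")
s1.write("v1")
print(srv.files["f"])  # 'v0': other clients still see the old contents
s1.close()
print(srv.files["f"])  # 'v1': change propagated at close time
```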

d. Server's Perspective: Implementation
Ans: Observations about the expected use of a file system can be used to guide the design of a DFS. For example, a study by Satyanarayanan found the following usage patterns for Unix systems at a university:
– Most files are small (less than 10k).
– Reading is much more common than writing.
– Access is usually sequential; random access is rare.
– Most files have a short lifetime.
– File sharing is unusual.
– Most processes use only a few files.
– Distinct file classes with different properties exist.

These usage patterns (small files, sequential access, high read-write ratio) suggest that an upload/download model would be appropriate for a DFS. In situations where files are large and are updated more often, on the other hand, it may make more sense to use a DFS that implements a remote access model. Note, however, that different usage patterns may be observed at different kinds of institutions. Besides the usage characteristics, implementation tradeoffs may depend on the requirements of a DFS, including support for a large file system, support for many users, the need for high performance, and the need for fault tolerance. Thus, a fault-tolerant DFS may sacrifice some performance for better reliability guarantees, while a high-performance DFS may sacrifice security and wide-area scalability in order to achieve extra performance.

e. Stateful Versus Stateless Servers
Ans: The file servers that implement a distributed file service can be stateless or stateful. Stateless file servers do not store any session state: every client request is treated independently, and not as part of a new or existing session. Stateful servers, on the other hand, do store session state. They may, for example, keep track of which clients have opened which files, current read and write pointers for files, which files have been locked by which clients, etc.

The main advantage of stateless servers is that they can easily recover from failure. Because there is no state that must be restored, a failed server can simply restart after a crash and immediately provide services to clients as though nothing happened. Furthermore, if clients crash, the server is not stuck with abandoned opened or locked files. Another benefit is that the server implementation remains simple, because it does not have to implement the state accounting associated with opening, closing, and locking of files.

The main advantage of stateful servers, on the other hand, is that they can provide better performance for clients. Because clients do not have to provide full file information every time they perform an operation, the size of messages to and from the server can be significantly decreased. Likewise, the server can make use of knowledge of access patterns to perform read-ahead and do other optimisations. Stateful servers can also offer clients extra services such as file locking, and remember read and write positions.
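The contrast between the two server styles can be sketched as follows; all names are invented. In the stateless style every request carries the full file name and offset, while in the stateful style the server keeps a per-client open-file table with read positions, so requests are smaller but the state must survive (or be rebuilt after) crashes.

```python
files = {"f": b"abcdef"}

# Stateless: self-contained requests, nothing to restore after a crash.
def stateless_read(name, offset, count):
    return files[name][offset:offset + count]

# Stateful: the server remembers an open-file table with read pointers.
open_table = {}  # fd -> [file name, current position]

def open_file(fd, name):
    open_table[fd] = [name, 0]

def stateful_read(fd, count):
    name, pos = open_table[fd]
    data = files[name][pos:pos + count]
    open_table[fd][1] = pos + len(data)  # advance the read pointer
    return data

print(stateless_read("f", 0, 3), stateless_read("f", 3, 3))  # b'abc' b'def'
open_file(7, "f")
print(stateful_read(7, 3), stateful_read(7, 3))              # b'abc' b'def'
```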
