Second Revised Edition : 2008

Mrs. Snehal Kamalapur

Features of Book
* Includes solved University question papers.
* Implemented distributed systems.
* Question banks.

Technical Publications Pune

Hardware and software concepts, Design issues, Resource sharing and the web, Challenges. System models : Introduction, Architectural model, Fundamental models and Client server models.

Interprocess communication : Message oriented communication, Stream oriented communication. Layered protocols : Lower level, Transport level and Higher-level protocols. Distributed objects : RPC and LRPC, Remote method invocation, Events and notifications.

Unit III : Distributed File Systems
Distributed file systems : SUN NFS, CODA, Other DS, Comparisons. Name services : Name entities, Locating mobile entities, Removing unreferenced entities. Case studies : DNS directory, Global name service, X.500 DS.

Unit IV : Synchronization
Time and global states : Clock synchronization, Logical clocks, Global state. Co-ordination : Election algorithms, Mutual exclusion, Distributed transaction.

Unit V : Fault Tolerance
Process resilience, Reliable client server communication, Reliable group communication, Distributed commit and recovery.

Unit VI : Case Studies
Case studies on CORBA, Grid and Clusters.

Best of Technical Publications
As per Revised Syllabus of Pune University - 2003 Course

B.E. (Computer) Semester - II
* Networks and Information Security
* Advanced Computer Architecture and Computing
* Software Testing and Quality Assurance
* Distributed Systems

B.E. (I.T.) Semester - II
* Distributed Systems
* Artificial Intelligence
* Information Retrieval
* System Operation and Maintenance

ISBN 9788184314045

All rights reserved with Technical Publications.
No part of this book should be reproduced in any form, Electronic, Mechanical, Photocopy or any information storage and retrieval system without prior permission in writing, from Technical Publications, Pune.

Published by : Technical Publications Pune, #1, Amit Residency, 412, Shaniwar Peth, Pune - 411 030, India.
Printer : Alert DTPrinters, Sr. No. 10/3, Sinhagad Road, Pune - 411 041

The book Distributed Systems is structured into two parts. The first half gives a basic introduction to distributed systems and their various models. It also covers issues such as communication in distributed systems and distributed file systems. This half focuses strongly on the synchronization and fault tolerance issues of distributed systems. Finally, it provides three case studies of distributed systems.

The second half focuses on the design issues of distributed systems. Three practically designed distributed systems are presented, supported by documentation and actual code.

The book also includes solved University Question Papers and an unsolved Question Bank for students to practise and enhance their skills in distributed systems.

Any constructive criticism for improvement of the book is welcome.

All faith and honour to Ganapati Bappa for his grace and inspiration.

It is with a great sense of gratitude that we acknowledge Prof. S.S. Sane (Professor and Head of the Computer Department, KKWIEER Nasik). His encouragement, motivation and unfailing good humour were important ingredients in the completion of the book. We also acknowledge our parents, without whose grace and support this project would have been incomplete.

We specially thank Prof. Bhardwaj for his help during the project. Our acknowledgement would be incomplete if we failed to thank the project team members of VJTI Mumbai who developed the distributed systems. Any mistake in the project is completely our responsibility.

Our ever-helping colleague Mrs.
Seema Gondhalekar is also to be thanked for her continuous morale boosting. We thank Mrs. Poonam Chaudhary for giving a helping hand. Finally, we thank the Publisher and the entire team of Technical Publications, Pune for giving us the opportunity to write the book.

Authors
Mrs. Neeta Deshpande
Mrs. Snehal Kamalapur

Table of Contents

1.1 Examples of Distributed Systems
1.1.1 Intranet
1.1.2 Internet
1.1.3 Mobile Computing
1.2 Characteristics of Distributed System
1.3 Goals of a Distributed System
1.3.1 Easily Connect Users and Resources
1.3.2 Exhibit Transparency
1.3.3 Support Openness
1.3.4.1 Problems with Scalability
1.4 Scaling Techniques
1.4.1 Hiding Communication Latencies
1.4.3 Replication
1.5 Hardware Concepts
1.5.1 Multiprocessors
1.5.2 Homogeneous Multicomputer Systems
1.5.3 Heterogeneous Multicomputer Systems
1.6 Software Concepts
1.6.1 Distributed Operating System
1.6.2 Network Operating System
1.7 Design Issues in Distributed Systems
1.7.1 Security
1.8 Resource Sharing and the Web
1.9 Challenges in Distributed Systems
1.9.1 Failure Handling
1.10 System Models

2.1 Interprocess Communication
2.2 Message Oriented Communication
2.2.1 Message Oriented Transient Communication
2.3.3 Synchronization of Stream
2.4 Layered Protocols
2.6.2 Design Issues in RMI
2.6.3 Features of RMI

3.2 Distributed File Systems
3.2.1 Requirements of Distributed File Systems
3.2.2 Types of Distributed File Systems
3.3 Comparison of Various File Servers
3.4.2 Features of Good Naming System
3.4.3 Name Spaces
3.4.4 Locating Mobile Entities
3.4.5 Removing Unreferenced Entities
3.5 Case Study
Question Bank

Unit-IV Synchronization
4.1 Synchronization
4.2 Externally Consistent Clock
4.3 Clock Synchronization
4.4 Logical Clocks
4.4.1 Lamport Timestamp
4.5 Global State
4.6 Election Algorithm
4.6.1 Assumptions of Election Algorithm
4.6.2 The Bully Algorithm
4.6.3 A Ring Algorithm
4.7 Mutual Exclusion
4.7.1 Centralized Algorithm
4.7.3 Token Ring Algorithm
4.8 Transactions
4.8.1 Transaction Model
4.8.3 Nested Transactions
4.9 Distributed Transactions
4.9.1 Transaction Implementation
4.9.2 Concurrent Atomic Transactions
4.9.3 Serializability

5.2.1 Design Issues
5.2.2 Flat and Hierarchical Groups
5.2.3 Group Membership
5.2.4 Replication and Failure Masking
5.2.5 Agreement in Faulty Systems
5.3 Reliable Client Server Communication
5.4 Reliable Group Communication
5.4.2 Scalability in Reliable Multicasting
5.4.3 Atomic Multicast
5.4.4 Virtual Synchrony
5.4.5 Implementing Virtual Synchrony and its Issues
5.5 Distributed Commit
5.5.1 One Phase Commit
5.6 Recovery
5.6.1 Backward Recovery
5.6.2 Forward Recovery
5.6.3 Durable Stable Storage
5.6.4 Checkpointing
5.6.5 Independent Checkpointing
5.6.6 Co-ordinated Checkpointing
5.6.7 Message Logging

6.1 Case Study 1 : CORBA
6.1.1 Introduction to CORBA
6.1.2 Where does CORBA Fit in Distributed Systems
6.1.3 Components Involved in CORBA : CORBA-IDL
6.1.6.3 Naming
6.1.6.4 Synchronization
6.1.6.5 Caching and Replication
6.1.6.6 Fault Tolerance
6.1.6.7 Security
6.2 Case Study 2 : Grids
6.2.2 Types of Grids
6.2.3 Components of a Grid
6.2.4 Concepts Related to Grid
6.2.5 Grid Computing
6.2.6 Examples of Grid Computing
6.3 Case Study 3 : Cluster
6.3.1 Classification of Clusters
6.3.2 Cluster Components / Cluster Architecture
6.3.3 Commodity Components for Clusters
6.3.4 Cluster Administration Tools
6.3.5 Cluster Applications
6.3.6.4 High Performance Virtual Machine (HPVM)
6.4 Future Cluster Technology
Question Bank 6

May/June - 2008 (I.T.)
May/June - 2006 (I.T.)
May/June - 2006 (Computer)
December - 2007 (Computer)

Project 1 : Load Balancer Introduction
Project 2 : RFS
Project 3 : Insurance Agent System Distributed System Approach

Introduction

This chapter includes an introduction to distributed systems, examples of distributed systems, characteristics, goals, hardware and software concepts, design issues, resource sharing and the web, and challenges. System models : Introduction, Architectural models, Fundamental models and client server models.

Introduction

The literature on distributed systems gives various definitions of a distributed system.

1. A distributed system is a collection of independent computers that appears to its users as a single coherent system (Tanenbaum and van Steen).

2. A distributed system is one in which components located at networked computers communicate and co-ordinate their actions only by passing messages (Coulouris, Dollimore and Kindberg).

3. Loosely coupled systems are called distributed systems. Loosely coupled systems are those in which the processors do not share memory.
Each processor has its own local memory. The loosely coupled system is shown in Fig. 1.1.

Fig. 1.1 Loosely coupled system

The above definition is given by P.K. Sinha.

* The first definition has two aspects.

a) The first aspect says that the machines used in a distributed system are autonomous, i.e. independent of each other.

b) The second aspect is related to software. The users of the machines think that they are dealing with a single system.

* The second definition leads to the following concepts.

a) Concurrent execution
Since there are n computers in the distributed environment, each sharing its resources, programs execute concurrently on different computers and all resources are available to all users without the users noticing the interaction involved. The capacity of the system can be increased by adding more resources to the network. Thus concurrent program execution is an important feature of a distributed system.

b) Independent failures
Any computer may fail, and so may a computer in a network. If a computer in a distributed system fails, the failure is not immediately made known to the other components in the system. A fault in a computer isolates it from the system, but the system does not stop running. Faults in the network are likewise not immediately made known to the other components. Each component can fail independently, leaving the others running successfully.

In brief, a distributed system is a collection of processors interconnected by a communication network. Each processor has its own local memory and its own peripherals. Since the processors may be located far from each other, communication between them takes place by message passing over the communication network. For each processor, its own resources are local and all other resources are remote.

The distributed system gives a single system view.
To provide this view, a middleware service (software) is necessary. The middleware is placed between a higher-level layer of users and applications and a lower-level layer consisting of the operating systems. This concept can be visualized as follows.

Fig. 1.2 Distributed system as middleware

1.1 Examples of Distributed Systems

Distributed systems are built on networks, so the following networked systems are considered as examples.

1. Intranet
2. Internet
3. Mobile computing

1.1.1 Intranet

Fig. 1.3 shows a typical intranet. The intranet may be composed of several local area networks.

Fig. 1.3 An example of intranet

The number of nodes or computers in an intranet depends on the organization setting it up. It may be limited to a single LAN on a single site or span many LANs connected together internally.

The intranet may be connected to the internet. In that case there may be a strong server with a high-end firewall to protect the intranet from malicious scripts coming from the internet. There may be different servers in the intranet, such as a file server, print server and internal email server. These servers provide their respective services to the nodes on the intranet. For example, the file server allows users to share data.

A typical organization with an intranet is a college. The college comprises various departments, and each department may have its own LAN.

Fig. 1.4 A typical college intranet

Some organizations, such as security agencies and military organizations, do not connect their internal networks to the internet.

Distributed systems in which intranets may be used include
* University computer networks.
* Workflow information systems of an organization.

1.1.2 Internet

In the internet, different types of computer networks are interconnected. The internet itself is a very large distributed system. Fig.
1.5 shows a typical portion of the internet.

Fig. 1.5 Example of internet

Some common features of the internet are
* Individual users are connected to internet service providers. Internet service providers are companies that provide modem links and other internet facilities to individual users.
* Intranets may be linked by a backbone. The backbone network provides high transmission capacity, which makes communication fast between the high-end servers providing services and the individual nodes on the intranets.
* Programs running on individual machines communicate with each other via different message passing mechanisms.
* The internet protocols enable the distributed feature by allowing a program running anywhere on the network to address messages to programs anywhere else on the network.

1.1.3 Mobile Computing

This is an era of wireless networks. Devices with wireless capability include
* Mobile phones
* Handheld devices like video cameras and digital cameras
* Laptops etc.

Mobile computing means that users who are away from their working environment can connect to their office intranet over a wireless connection. This connection allows them to use the office resources from wherever they have wireless access.

The user can connect to the internet using
* A laptop connected to a wireless LAN. The wireless LAN connects to the internet.
* A mobile phone connected to a WAP gateway. The WAP gateway connects to the internet.

Thus portable and handheld devices are used in distributed computing.

Some more examples of distributed systems are
* WWW
* Air traffic control
* Stock brokerage systems
* Banking
* Distributed supercomputing, simulation
* Distributed file systems/network storage/file sharing
* Streaming media distribution
* Event notification
* Gaming
* Embedded distributed systems (factory automation, vehicles).
1.2 Characteristics of Distributed Systems

The main characteristics of a distributed system are

* Multiple autonomous components (computers)
In a distributed system there are multiple components or computers. Each computer is an individual autonomous node with its own software and resources. It can also share components and resources from the distributed environment.

* Components not shared by all users
A distributed system is a collection of autonomous computers linked by a computer network that appears to its users as a single computer. Even so, a user need not share his private resources with the distributed environment, and some resources may not be accessible to all users.

* Software architecture
In a distributed environment the software runs as concurrent processes on different processors. The software architecture and the operating system must support this feature. A distributed system may consist of heterogeneous machines and networks, or of clusters of workstations or servers. Modern operating systems like UNIX or NT allow easy construction of a basic cluster of workstations.

* Multiple management points
An important characteristic of a distributed system is that there are multiple points of control. For example : Automatic banking.

Fig. 1.6 (controlling or teller machines)

In a distributed environment for automatic banking the primary requirements are security and reliability. Since there are multiple points of control, it is a challenge to maintain the consistency of replicated data. Support for concurrent transactions whose operations involve accounts in different banks is also an important feature.

Since there is simultaneous access from several users, there may be multiple points of failure. Fault tolerance is therefore an important characteristic of any distributed system.
1.3 Goals of a Distributed System

Four important goals to be achieved by any distributed system are
* Easily connect users and resources.
* Exhibit transparency.
* Support openness.
* Be scalable.

We will see each of these goals in detail.

1.3.1 Easily Connect Users and Resources

With the tremendous progress in the IT industry, the need to share resources has grown rapidly. Sharing of resources has its pros and cons.

The pros are
* A resource such as a printer can be shared among various computer nodes, which is economical.
* Sharing web pages and high-end server facilities makes it possible to stay in touch across the globe. Communication over long distances becomes cheaper and faster.
* Sharing lets widely dispersed people work as a single group. High-end sharing facilities on the internet lead to services like teleconferencing, collaborative editing and video conferencing.
* Sharing through the internet has provided a strong facility in e-commerce. Business transactions can be carried out from one's own place without travelling to a physical location. Buying and selling of goods has also become a simpler procedure.

The cons are
* Connectivity and sharing raise the issue of security. Security is a major concern of any project involving a network. The number of intruders and hackers has increased tremendously. Sharing makes it possible for passwords and other sensitive information to travel through the network. It has also led to unwanted information such as junk mail or spam.

Thus the main aim of any distributed system is to easily connect users and resources. The system must handle the security issues with strong servers and well-configured firewalls. It should also prevent the unnecessary spreading of junk data.
1.3.2 Exhibit Transparency

A transparent distributed system looks to its users as if it were a single computer system. The processes and resources in the distributed system are physically spread out, but the goal of the distributed system is to hide this fact. There are different types of transparency in a distributed system.

* Access transparency
There are various ways of representing data. The same data may be represented in two different formats by two companies. A distributed system with access transparency allows the data to be accessed while hiding the differences in data representation.

For example, in big endian format the higher order byte is transmitted first, whereas in little endian format the lower order byte is transmitted first. The D.S. takes these format differences into account and hides them, letting each user work in his own data format.

The computers in a distributed system may also run different operating systems. For example, one machine may have Windows XP while another has UNIX. The file naming conventions of the two operating systems differ considerably. The D.S. takes care of these differences and provides access transparency to the users of both kinds of machine.

* Concurrency transparency
Resources are shared in a distributed environment, so two users may be sharing the same resource at the same time. For example, two users may be accessing the same file at the same time. Neither user should notice that the other is using the file. This is called concurrency transparency.

Another example is a database, where the same table may be in use by more than one user at the same time. The main requirement in concurrent access is that the data must remain in a consistent state after access.
Features like locking mechanisms and transactions help in keeping the data in a consistent state.

* Location transparency
The users of a distributed system access resources without knowing their physical location. This feature is called location transparency.

For example, www.google.co.in gives no idea where exactly the Google server is placed, but the required information can be obtained from Google's server by typing the URL. Location transparency is achieved by assigning logical names to resources. A logical name in no way indicates the physical location of the resource.

* Migration transparency
Resources in the distributed environment are remote. Access methods need only the logical names of the resources; their physical location is immaterial. The ability to move resources physically without affecting the way they are accessed is called migration transparency.

* Relocation transparency
Resources in the distributed environment can be accessed even while they are being relocated. This important feature is called relocation transparency. For example, a user may be working on a laptop with a wireless network connection. Though the person is constantly moving, his machine remains part of the distributed environment on the network.

* Replication transparency
Since data is shared among various users, several copies of the data exist, as we have seen in the example of automatic banking. The user is unaware of the replicas because the distributed environment keeps them consistent. The feature that keeps users unaware of the replicated copies of the data is called replication transparency.

* Failure transparency
There are multiple management points in the distributed environment, and there is simultaneous access from several users. There may be multiple points of failure.
The beauty of a good distributed system is that the user remains unaware of the failure of any resource until the system has recovered from the failure. This feature is called failure transparency.

* Persistence transparency
This feature keeps the user unaware of the fact that the server is moving state between primary and secondary memory. Such movement is needed in some object oriented databases where methods can be invoked directly on stored objects.

Degree of transparency

Transparency is not always desirable. There may be situations where hiding all the distribution aspects is not wanted. Since users are located in different continents, they are in different time zones. Suppose a person wants to read a particular newspaper every day at 6:00 p.m. If he travels to another part of the globe, the same time setting will no longer suit him, so full transparency is not desirable in this case.

Similarly, it is not always possible to hide everything. For example, data must remain consistent in a replicated environment, so any change to the data must be made to all copies. This takes at least some fraction of a second, and the user becomes aware of it. When we are viewing a live score on the internet, for instance, a change in the score takes a noticeable amount of time and the user can see that the data is changing. Thus there is a trade-off between a high degree of transparency and the performance of the system.

A second example is the failure of a server. The user repeatedly tries, say, www.abc.com but gets no response, which indicates that abc's server has broken down. Thus failure transparency may not always be exhibited in the distributed environment.

So when designing a distributed system, its goals must be achieved while taking the performance of the system into account.
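The byte order difference mentioned under access transparency can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the helper names `encode_u32` and `decode_u32` are hypothetical, and the choice of big endian ("network order") as the agreed wire format is a common convention rather than something this book prescribes.

```python
import struct

VALUE = 0x0A0B0C0D  # sample 32-bit integer (illustrative)

# The same integer laid out in the two byte orders.
big_endian = struct.pack(">I", VALUE)     # most significant byte first
little_endian = struct.pack("<I", VALUE)  # least significant byte first


def encode_u32(value: int) -> bytes:
    """Hypothetical helper: put an unsigned 32-bit value on the wire in
    network (big endian) order, whatever the sender's native order is."""
    return struct.pack("!I", value)


def decode_u32(payload: bytes) -> int:
    """Hypothetical helper: read a network-order value back into a native int."""
    return struct.unpack("!I", payload)[0]


if __name__ == "__main__":
    print(big_endian.hex())     # 0a0b0c0d
    print(little_endian.hex())  # 0d0c0b0a
    print(decode_u32(encode_u32(VALUE)) == VALUE)  # True
```

Because both sides agree on one wire format, neither application needs to know the native representation used by the other, which is the hiding that access transparency asks for.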
1.3.3 Support Openness

An open distributed system offers services according to standard rules that describe the syntax and semantics of these services. The openness of a system is the characteristic that determines whether the system can be extended and reimplemented in various ways.

For example, to send and receive a message in a network, certain message formats have to be followed. These formats and rules are formalized in protocols. Similarly, in a distributed system services are specified in an Interface Definition Language (IDL). The IDL specifies
* Names of the functions that are available.
* Return values of each function.
* Types of values or parameters to be passed to each function.
* Any exception to be raised on a specific condition etc.

Thus open systems are characterized by the fact that their key interfaces are published. They are based on the provision of a uniform communication mechanism and published interfaces for access to shared resources.

Advantages of open distributed system
* Interoperability : Open systems can work together. This feature allows two implementations of a system or component from different manufacturers to co-exist and work together, each relying on the other's services as specified by a common standard.
* Portability : Portability is the ability to move an application from one software or hardware platform to another.

Goals of open distributed systems
* Flexibility : Different developers provide different components. A flexible open distributed system is one that can easily be configured from different components supplied by different developers.
* Extensibility : A flexible open distributed system also allows new components to be added, or existing ones to be replaced, without affecting the components that stay in place.
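The four items an IDL specifies (function names, return values, parameter types and exceptions) can be sketched in Python with an abstract base class standing in for a real IDL. Everything named here is hypothetical and chosen only for illustration: the `FileService` interface, its `read` operation, the `NoSuchFile` exception and the two "vendor" implementations are not taken from any actual IDL or product.

```python
from abc import ABC, abstractmethod
from typing import Mapping


class NoSuchFile(Exception):
    """Exception the published interface promises for unknown names."""


class FileService(ABC):
    """Published interface: function name, parameter type, return type
    and the exception raised are all fixed here."""

    @abstractmethod
    def read(self, name: str) -> bytes:
        """Return the contents of `name`, or raise NoSuchFile."""


class DictFileService(FileService):
    """One vendor's implementation, backed by a dictionary."""

    def __init__(self, files: Mapping[str, bytes]) -> None:
        self._files = dict(files)

    def read(self, name: str) -> bytes:
        if name not in self._files:
            raise NoSuchFile(name)
        return self._files[name]


class ListFileService(FileService):
    """Another vendor's implementation, backed by a sorted list."""

    def __init__(self, files: Mapping[str, bytes]) -> None:
        self._entries = sorted(files.items())

    def read(self, name: str) -> bytes:
        for entry_name, data in self._entries:
            if entry_name == name:
                return data
        raise NoSuchFile(name)


def fetch(service: FileService, name: str) -> bytes:
    """Client code written only against the published interface."""
    return service.read(name)


FILES = {"notes.txt": b"hello"}

if __name__ == "__main__":
    for service in (DictFileService(FILES), ListFileService(FILES)):
        print(type(service).__name__, fetch(service, "notes.txt"))
```

The client function `fetch` works unchanged with either implementation, which is the interoperability and replaceability that published interfaces are meant to give.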
Flexibility is achieved by organizing the system as a collection of small and easily replaceable components. There should also be a clear separation between policy and mechanism. For example, in web caching the user is provided with the facility of storing documents. In a flexible open distributed system that uses web caching, the user should also be able to decide which documents are stored and for how long. In practice this can be achieved by offering users a large set of parameters that can be changed dynamically.

1.3.4 Be Scalable

An important goal of a distributed system is to be scalable. According to Neuman (1994), scalability is measured along three different dimensions.

* Scalability with respect to size : In a scalable distributed system, users and resources can easily be added to the system.

* Scalability with respect to geographic area : In a scalable distributed system, the users and the resources may lie at geographically different locations, yet communication between them remains flexible and new users and resources can still be added.

* Scalability with respect to administration : A distributed system with this feature remains easy to manage properly even if it is spread across many organizations.

1.3.4.1 Problems with Scalability

Various problems have to be solved if a distributed system is to be properly scaled. Different types of problems arise for the different types of scalability.

* Problems in scalability with respect to size

a) In a distributed environment many users are operating at the same time. If there is a single server, the needs of all the users have to be satisfied by that server, i.e. the services are centralized. If many users log on to the server at the same time to get the service, some of them may be blocked.
Even though there is logically no limit on the number of users, providing the service still needs communication, message transfers, data exchange etc. This may lead to a delay for some users and to complete blocking of the user who is last in the queue. This must be taken into account while designing a distributed system.

b) Highly sensitive data such as bank accounts and credit card details must be maintained confidentially on a single server. If the number of users accessing the data on the server increases, more servers are added. Replicating the same data weakens the confidentiality of the distributed system, since the data may now be prone to many more attacks.

Thus keeping both the service and the data centralized is a difficult task. If a huge amount of data is to be stored, the capacity of even high-end servers becomes an issue. For example, if the electricity bills of all the customers in India were to be centralized, how would 10000 million records be stored on a single server ?

c) In a distributed system it is better practice to adopt decentralized algorithms, because the machines in the distributed system are constantly communicating with each other to share data and resources. For example, suppose it is decided to collect the load information of every machine. Since messages can be routed over various routes, a graph algorithm could then be used to find the shortest paths. But the algorithm itself would overload the network by passing messages, processing them and only then making the decision. So it is good practice to have decentralized algorithms.

In a decentralized algorithm, no machine has complete information about the system state. There is no assumption that a global clock exists. If any machine fails, the algorithm is not ruined. In such an environment, the machines make decisions based only on local information.
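The properties of a decentralized algorithm listed in item (c) can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not an algorithm from this book: the ring topology, the function name `balance_step` and the rule "hand one unit of work to the least loaded neighbour when the gap is at least two" are all hypothetical choices made for the example.

```python
from typing import Dict, List


def balance_step(loads: List[int], neighbours: Dict[int, List[int]]) -> List[int]:
    """One round of load sharing. Each node looks only at its own load and
    the loads of its direct neighbours (local information), never at the
    whole system state."""
    transfers = []
    for node, load in enumerate(loads):
        # Least loaded direct neighbour, as seen at the start of the round.
        target = min(neighbours[node], key=lambda n: loads[n])
        if load - loads[target] >= 2:
            transfers.append((node, target))
    new_loads = list(loads)
    for source, destination in transfers:
        new_loads[source] -= 1
        new_loads[destination] += 1
    return new_loads


# Assumed topology: four machines in a ring, each knowing two neighbours.
RING = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}


def run(loads: List[int], rounds: int) -> List[int]:
    for _ in range(rounds):
        loads = balance_step(loads, RING)
    return loads


if __name__ == "__main__":
    print(run([8, 0, 0, 0], rounds=10))
```

No node ever collects the load of every machine and no clock is shared; the load spreads out through repeated local decisions, and a node that stopped taking part would only affect its own neighbours.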
* Problems in scalability with respect to geographical area

a) Geographical scalability requires that communication between geographically distant users be flexible. Current distributed systems support synchronous communication. In synchronous communication the client sends a request to the server and remains blocked until the service is provided. This is shown in Fig. 1.7.

Fig. 1.7 Synchronous communication

This approach works well in LANs but is difficult in a widely distributed environment.

b) If a particular user requires a service in a local area network, a message can be broadcast to enquire about the name of the service provider. This is a difficult task in a widely distributed environment.

This type of communication imposes limitations on geographical scalability, as performance is degraded and reliability becomes an issue of concern.

* Problems in scalability with respect to administration

Scalability across administrative domains is difficult for any distributed system to achieve. Components in a distributed environment do not necessarily reside within a single domain, which makes administration difficult. Components within a single domain are trusted by the users of that domain, and preventive measures have usually already been taken by the organization that manages those components. However, such measures cannot be relied on when domain boundaries are crossed.

Two types of security measures are required when domain boundaries are crossed.

1) The distributed system has to protect itself from malicious attacks coming from the other domain.

2) The new domain has to protect itself from malicious attacks coming from the distributed system.

Thus scalability with respect to administration is an important issue to be handled by a distributed system.
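The blocking behaviour of synchronous communication described under geographical scalability can be sketched in Python, together with the non-blocking alternative taken up in section 1.4.1. The sketch makes several assumptions for illustration only: `remote_service` stands in for a distant server, `LATENCY` is an arbitrary delay, and a thread pool is just one convenient way to issue requests without waiting for each reply in turn.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List

LATENCY = 0.05  # assumed round-trip delay in seconds (illustrative value)


def remote_service(request: str) -> str:
    """Stand-in for a distant server: the reply arrives only after LATENCY."""
    time.sleep(LATENCY)
    return request.upper()


def synchronous_client(requests: List[str]) -> List[str]:
    """The client is blocked for the full delay of every single request."""
    return [remote_service(request) for request in requests]


def asynchronous_client(requests: List[str]) -> List[str]:
    """All requests are issued first; replies are collected afterwards."""
    with ThreadPoolExecutor(max_workers=max(1, len(requests))) as pool:
        futures = [pool.submit(remote_service, request) for request in requests]
        # The client could do other useful work here while replies are pending.
        return [future.result() for future in futures]


if __name__ == "__main__":
    batch = ["a", "b", "c", "d"]
    for client in (synchronous_client, asynchronous_client):
        start = time.perf_counter()
        replies = client(batch)
        print(client.__name__, replies, round(time.perf_counter() - start, 2))
```

With a large delay the synchronous client's waiting time grows with the number of requests, while the second client overlaps the delays. This is why long-distance links make purely synchronous designs hard to scale.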
1.4 Scaling Techniques

In distributed systems, scalability problems are actually performance problems. These problems arise because the capacity of the servers and the network is limited. There are three techniques for scaling.
* Hiding communication latencies
* Distribution
* Replication

The techniques must address the scalability issues, i.e. the system should remain efficient when there is a significant increase in the number of users and resources connected.
a) The cost of adding resources must be reasonable.
b) Performance loss with an increased number of users and resources must be controlled.
c) Software resources must not run out.

We will discuss each of the techniques in detail.

1.4.1 Hiding Communication Latencies

This technique is used when geographical scalability is an issue of concern. In distributed systems whose parts are geographically far apart, communication is mostly synchronous, i.e. the client waits till the server responds. This technique says that we should avoid waiting for the server's response and use that time for useful work, i.e. the system should use asynchronous communication. In this type of communication, the requesting application does not wait for the reply from the server. When a reply comes in, the application is interrupted and a special handler is called to complete the previously issued request.

A related technique is now widely used in internet applications where a client is asked to fill in a form. Previously the form was sent to the server to be checked. Now Java applets are written in such a way that the checking is done at the client itself, and hence communication time is reduced.

a) Previous processing (Fig. 1.9)
b) Now, using Java applets (Fig. 1.10)

1.4.2 Distribution

As its name suggests, this technique involves selecting a component and splitting it into smaller parts. These smaller parts are then spread across the system. Following are the examples.

* Web : To a user, the web consists of n documents, where n is a very large number. In practice there are many servers, each handling a certain number of documents. A document's URL contains the name of the server handling that document. Because of this distribution technique the web appears to the user as a single system.

* DNS : The Internet Domain Name System (DNS) is another example of distribution. The DNS name space is organized hierarchically into a tree of domains. This tree is divided into nonoverlapping zones. Each zone has a name server, and the names in each zone are handled by that server.

Fig. 1.11 DNS name space divided into various zones : An example

The name ece.b123.c123 is resolved by first passing it to the server of zone 1, then to the server of zone 2 and finally to the server of zone 3, which returns the address of the associated host.

1.4.3 Replication

Since improper scaling results in performance degradation, it is better practice to replicate components in the distributed system. Replication makes the components available to the users at a faster rate, thus improving the performance of the system.

1.5 Hardware Concepts

A distributed system consists of various processors (CPUs). There are various ways of organizing these processors, i.e. the way the processors are connected varies. Over the years, many classification schemes have been proposed. These schemes are mostly based on the architecture of the interconnection network.
There are two commonly used architectures.

* Bus based architecture : In this type of architecture there is a single network, a bus or cable, that connects all the machines. A common example of bus based architecture is cable TV. The cable company has a single bus running around, and every cable subscriber has a tap into this cable from the television set.

Fig. 1.12 (b) Private memory bus-based organization

* Switch based architecture : In the switch based architecture, individual wires run from machine to machine, with many different wiring patterns in use. An example of this type of architecture is the public telephone system. In this system, messages move along the wires, and at each step a switch decides the route the message takes next.

Fig. 1.13 (b) Private memory switch based architecture

Distributed computer systems are broadly classified into two types.
1) Multiprocessors - Multiprocessors are those that have shared memory.
2) Multicomputers - Multicomputers are those that do not have shared memory. They are further divided into two subclasses.
* Homogeneous multicomputers
* Heterogeneous multicomputers

We will see each of the above in detail.

1.5.1 Multiprocessors

Multiprocessors are those that have shared memory. In a multiprocessor each CPU has direct access to the shared memory. In a bus-based architecture, there are a number of CPUs, all connected to a common bus, and all have access to the shared memory.

* Coherent memory
Since there is only one memory, if CPU x writes a word to memory and CPU y reads the word back a short time later, y gets the newly written word. This property of the memory is called coherence.

Problems with coherent memory : If there are many CPUs in the system, the bus will always be overloaded.
Solution : The simplest solution to this problem is to place a cache memory between each processor and the memory. If the processor requests a word and the word is available in the cache, it is returned faster. This increases the efficiency of the system. The cache holds the most recently accessed words.

Hit rate : The hit rate is the fraction of requests for which the word requested by the processor is found in the cache. A larger cache memory results in a higher hit rate.

* Incoherent memory : When a cache memory is used, a major problem occurs. If two processors x and y both read the same word into their caches and x then overwrites the word, then when y reads the word a few microseconds later it gets the old value, which is still in its own cache. This is called incoherent memory.

Problems with bus-based multiprocessor and its solution

* A bus-based multiprocessor has limited scalability, even when cache memory is used. The solution to this problem is the crossbar switch. Divide the memory into modules and connect them to the CPUs through a crossbar switch. Fig. 1.14 shows the architecture, where each processor and each memory has a connection coming out.
* There are several intersection points. A crosspoint switch is placed at every intersection point. This switch can be opened and closed in hardware.

Fig. 1.14 A crossbar switch

Advantages of crossbar switch

When a CPU wants to access a particular memory, the crosspoint switch between them is closed to allow the memory access. Thus many CPUs can access memories at the same time, and traffic is not an issue as it is with a bus. But if two CPUs access the same memory simultaneously, one of them has to wait.

Disadvantages of crossbar switch

If there are n processors and n memories in the system, n² crosspoint switches are needed. An example of an alternative switching method is the omega network.
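To make the n² growth concrete, the sketch below compares the number of crosspoint switches in an n x n crossbar with the number of 2x2 switches in an omega network. The omega count assumes the usual construction with log2(n) stages of n/2 switches each, which is a standard result rather than something stated in the text, and the function names are hypothetical.

```python
def crossbar_switch_count(n):
    """Crosspoint switches needed to connect n CPUs to n memory modules."""
    return n * n


def omega_switch_count(n):
    """2x2 switches in an omega network for n CPUs and n memories.

    Assumes n is a power of two: log2(n) stages, each with n/2 switches.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1  # exact log2(n) for a power of two
    return stages * (n // 2)


small = (crossbar_switch_count(8), omega_switch_count(8))
large = (crossbar_switch_count(1024), omega_switch_count(1024))
```

For 8 CPUs the difference is modest (64 crosspoints against 12 switches), but for 1024 CPUs the crossbar needs 1048576 crosspoints against 5120 switches. The price is that a request in the omega network passes through several switching stages instead of one.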
The omega network shown in Fig. 1.15 consists of 2x2 switches, i.e. each switch has 2 inputs and 2 outputs. With proper settings of the switches, every processor can still access every memory. To keep the latency between CPU and memory low, the switching has to be extremely fast, which adds to the cost.

Fig. 1.15 An omega switching network

1.5.2 Homogeneous Multicomputer Systems

A homogeneous multicomputer system is also called a System Area Network (SAN). In these types of systems, the nodes are mounted in a big rack and connected through a single high-performance network.

* Bus-based multicomputers
In this type of system a shared multiaccess network such as Ethernet is used to connect the processors. Messages are broadcast in bus-based multicomputer systems.

* Switch based multicomputers
Instead of being broadcast, messages between the processors in a switch based multicomputer are routed through an interconnection network. Two commonly used topologies are

a) Grid : The grid structure is shown in Fig. 1.16. Grids are easy to lay out on PCBs.

Fig. 1.16 A typical grid structure

The grid structure is mainly used in applications that have a two dimensional nature, such as graph theory and vision (for example in robotics).

b) Hypercube : Fig. 1.17 shows a four dimensional hypercube. In general it could be an n dimensional hypercube. In a hypercube each vertex is a CPU and each edge is a connection between two CPUs. The hypercube can be expanded by adding another set of interconnected cubes to the one in Fig. 1.17.

Fig. 1.17 Four dimensional hypercube

1.5.3 Heterogeneous Multicomputer Systems

A system in which the computers forming part of the system vary in processor type, memory size, I/O bandwidth etc. is called a heterogeneous multicomputer system. Most of the distributed systems in use nowadays are built on heterogeneous multicomputer systems.
The interconnection network used may also be heterogeneous.

1.6 Software Concepts

It is not only the hardware that is important for a distributed system; the software also plays a major role. Distributed systems exhibit the properties of operating systems.
* A distributed system hides the differences between the various machines and networks, just as an O.S. hides hardware details from the user.
* A distributed system, like an O.S., acts as a resource manager for the hardware. It allows various users and applications to share the resources.

So, we will study the operating systems for distributed computers. There are basically two types of operating systems.

1) Tightly coupled system : In a tightly coupled system, a single global view of the resources is maintained by the operating system. This is also called a "Distributed operating system".

2) Loosely coupled system : It is a collection of computers, each having its own operating system.

A brief description of
* Distributed O.S.
* Network O.S. and
* Middleware
is given below.

1.6.1 Distributed Operating System

Distributed operating systems are classified into two types :

1) Multiprocessor operating system

It manages the resources of the multiprocessor. The main goals of a multiprocessor operating system are
* Support high performance through multiple CPUs.
* Make the number of CPUs transparent to the application.

In a multiprocessor, data in the shared memory is accessed by several CPUs. The job of the multiprocessor O.S. is to protect the data during simultaneous access. Two important synchronization primitives are used for this protection. They are

i) Semaphores

A semaphore is an integer with two operations, down and up. The down operation checks the value of the semaphore: if it is 0 the calling process is blocked, otherwise the value is decremented and the process continues. The up operation checks for blocked processes: if there are any it unblocks one of them and continues, otherwise it increments the value.
Once a semaphore operation is started, no other process can access the semaphore until the operation is completed.

ii) Monitor

A monitor is a programming language construct. It is a module consisting of variables and procedures, and the variables can be accessed only through the monitor's procedures. A monitor allows only a single process at a time to execute inside it. For example, if process x is executing a monitor procedure and process y also calls a monitor procedure, then y is blocked till x has finished.

2) Multicomputer operating systems

A multicomputer operating system does not provide shared memory. Therefore the only means of communication is message passing. The organization of a multicomputer O.S. is shown in Fig. 1.18.

Fig. 1.18 Multicomputer O.S.

In a multicomputer O.S. the local resources such as CPU, memory and local disk are managed by each node's own kernel. Each machine also has a separate module for interprocessor communication.

1.6.2 Network Operating System

A network operating system runs on a collection of uniprocessor systems. Each uniprocessor system has its own operating system.

Fig. 1.19

Since each processor has its own operating system, the machines may run different operating systems and have different configurations. But all these machines are connected to the network. Unlike a distributed operating system, a network operating system does not assume a homogeneous set up, i.e. it does not treat the whole set up as a single machine. But a network system also has facilities which give access to the services of remote systems. Some commonly provided services of a network system are

1) Remote login : A user on one machine can log in as a user on another machine. The rights of the guest user logging on to the machine are decided by the administrator of the host machine.
Once the guest user has logged in, he can access the services and data of the machine as per the assigned rights.

2) Remote copy : This facility allows a user to copy data files from one machine to another.

3) Client server communication : The network system can also have file server facilities. There may be a shared global file system, stored on one or more machines. These machines are called file servers. There may be several file servers.

Fig. 1.20 File server

There are various users in a network system. The users, called clients, send requests to the file servers. The requests may be read or write requests. The server examines each request and sends a reply accordingly.

Fig. 1.21 Client server communication

Disadvantages of Network O.S.

* Lack of transparency
Unlike a distributed system, a network operating system does not provide transparency, i.e. the system does not appear as a single machine to its users. The processes and resources which are spread out are accessible only through explicit remote access features. Normally, client server communication is used across the different machines, and the data can also be accessed by the remote login or remote copy features explained above. Access of this kind is harder for the user than in a transparent system.

* Management difficulty
In a network system all machines work independently. If there is a problem in machine A, it has to be solved on machine A only. It can also be solved by remotely logging on to machine A, but this gives rise to the further difficulties listed below.

* Maintaining access permissions
Every machine in the network has to maintain the access permissions of the users logging on to it.

* Multiple passwords
Suppose a user has the right to access 10 machines in the network, i.e. he can log in to ten different machines. Then he has to have 10 passwords for entry.
He can choose to use a single password for logging on to all the machines, but this password has to be set by the administrators of all 10 machines.

* Malicious attacks on the system
In a network, there is always a possibility of remote malicious attacks. Intruders may log into the system and cause damage.

Advantages of network system

Network systems do not have only disadvantages; they also have several advantages over distributed systems.

* Independent nodes : The machines in a network system are independent nodes. A failure in one node does not hamper the whole network; only that particular machine is bypassed. Repairing hardware and software failures is comparatively easy.

* Adding and deleting a node is easy
Since all the machines in the network are independent, the only thing to be done to add a machine is to connect it to the network and make it known to the other machines on the network.

* Recognition across the internet is easy
If a new machine is added to the network, it is also easy to make it recognizable across the internet. Primarily this is done by
i) Assigning an IP address to the machine.
ii) Adding the machine to DNS along with its IP address.

1.6.3 Middleware

Middleware is an additional layer of software which is used on top of network systems. This layer has the useful feature of hiding the heterogeneity of the systems in the underlying network. It tries to overcome the drawbacks of both distributed operating systems and network operating systems.
* A distributed operating system cannot handle a collection of independent machines.
* A network operating system does not provide the view of a single system.

If a distributed system is designed with a middleware layer, the above drawbacks are overcome.

Important features of middleware

* Middleware is connectivity software.
It consists of a set of services that allow multiple processes running on one or more machines to interact across a network.
* It makes the task of designing, programming and managing distributed applications easier by providing a simple, consistent and integrated distributed programming environment.
* Since middleware does not manage individual nodes, every local system's O.S. does the job of managing its own resources.
* Sub areas of middleware are described below.

Reflective middleware

Techniques from the field of reflection are used in this type of middleware. These techniques are used in order to achieve flexibility and adaptability in middleware platforms.

Event-based middleware

Event-based middleware covers the concepts, design, implementation and application of services and components that support event-based systems. It is particularly used in the construction of non-centralized distributed applications, especially applications that must monitor and react to changes in their environment. Examples are process control, internet news channels and stock tracking.

Object-oriented middleware

Object-oriented middleware extends the object-oriented programming paradigm to distributed systems. In this type of system, the applications are structured into objects that interact via location transparent method invocation. Examples of this type of middleware are the OMG's CORBA and Microsoft's Distributed COM. Both of these platforms offer an interface definition language (IDL). In such platforms there is an object request broker. The main features of this broker are that
* It transparently directs method invocations to the appropriate target object, and
* It provides a set of services (e.g. naming, time, transactions, replication etc.).
This further enhances the distributed programming environment.
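The idea of an object request broker directing invocations to the appropriate target object can be sketched in a few lines. This is a toy, in-process illustration only: it is not the CORBA or DCOM API, and the class names `ObjectRequestBroker`, `Proxy` and `Account`, as well as the object name "accounts/alice", are hypothetical. A real broker would marshal the call into a message and send it over the network to wherever the object lives.

```python
class ObjectRequestBroker:
    """Toy broker: maps object names to target objects and forwards calls."""

    def __init__(self):
        self._objects = {}

    def register(self, name, obj):
        # Make an object known to the broker under a location-independent name.
        self._objects[name] = obj

    def invoke(self, name, method, *args):
        # Find the target object and direct the invocation to it.
        target = self._objects[name]
        return getattr(target, method)(*args)


class Proxy:
    """Client-side stand-in: looks like the object, but every call goes via the broker."""

    def __init__(self, broker, name):
        self._broker = broker
        self._name = name

    def __getattr__(self, method):
        # Called only for attributes not found on the proxy itself,
        # i.e. for the methods of the (possibly remote) target object.
        def remote_call(*args):
            return self._broker.invoke(self._name, method, *args)
        return remote_call


class Account:
    """Example target object."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

    def get_balance(self):
        return self.balance


broker = ObjectRequestBroker()
broker.register("accounts/alice", Account(100))

account = Proxy(broker, "accounts/alice")  # the client never sees where the object lives
new_balance = account.deposit(50)
```

The client code calls `account.deposit(50)` exactly as it would on a local object. Only the broker knows how the name is mapped to a target, which is what location transparency means here.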
Message-oriented middleware

Message-oriented middleware is the concept used for communication in the network OSI model.

Fig. 1.22 shows the basic block diagram of a distributed system as middleware. It has the following layers.
1) Applications
2) Domain specific middleware services
3) Common middleware services
4) Distributed middleware
5) Host infrastructure middleware
6) Operating systems and protocols

1) Applications

Fig. 1.22 shows that various applications interact with the domain specific application services.

2) Domain specific application services

These serve the requirements of particular domains like healthcare, process automation, aerospace, e-commerce etc.

Fig. 1.22 Block diagram of distributed system as middleware (layers shown : Domain specific application services, Common middleware services, Distributed middleware, Host infrastructure middleware, Operating systems, Hardware devices)

Examples of domain specific application services are :
* The OMG has convened a number of domain task forces. These task forces concentrate on the standardization of domain specific middleware services.
* The Boeing Bold Stroke architecture uses a distributed architecture involving hardware and software with middleware. It provides features like
i) Navigation
ii) Display management
iii) Sensor
iv) Data link management etc.
* The Siemens medical engineering group has developed syngo. The syngo middleware provides services for
i) Mammography
ii) Radiography
iii) Cardiac systems
iv) Therapy systems etc.

3) Common middleware services

Because of these services, application developers no longer need to write code that handles features like
i) Transactional behaviour
ii) Security
iii) Database connection
iv) Threading etc.

4) Distributed middleware

Distributed middleware enables clients to program distributed applications like standalone applications.
This is done by invoking operations on target objects without hardcoding dependencies on their location, programming language etc. At the heart of distributed middleware there are request brokers. Some examples of request brokers are :
i) The OMG's Common Object Request Broker Architecture (CORBA).
ii) Sun's Java Remote Method Invocation (RMI).
iii) Microsoft's Distributed Component Object Model (DCOM) etc.

5) Host infrastructure middleware

It uses mechanisms to create reusable network programming components like
i) Reactors
ii) Monitor objects
iii) Active objects
iv) Acceptor-connectors etc.
These components shield programmers from working directly with low-level operating system APIs and threads.

6) Operating systems and protocols

The systems have various operating systems and protocols providing various features. Finally, through the O.S., the components are connected to the actual hardware.

1.7 Design Issues in Distributed Systems

* Transparency
i) Simplifies programming.
ii) Complete transparency is impossible or undesirable.
* Openness
i) Well-defined interfaces and protocols.
ii) Ensures innovation, competition, flexibility.
* Scalability and Performance
i) Leverage distributed state, administrative distribution, hardware performance.
ii) Must distribute state and computation, and minimize communication.
* Security
i) Authentication
ii) Access control
iii) Privacy
iv) Integrity.
* Reliability
i) Fault tolerance

The first three features, transparency, openness, and scalability and performance, are already explained in the above topics. Reliability aspect is covered in detail in the above sections.

* Security is explained in brief in this section.

1.7.1 Security

We will discuss the aspect of security in distributed systems in this section. According to Lamport, a distributed system is one where the crash of a computer that you have never heard of stops you from getting any work done. In this section we will look at the following points in brief.
1) Introduction : What is security ?
2) Goals of security
3) Design issues of security
4) Security standards.

1. Introduction

It is important to secure any system. Security deals with three basic aspects.
* Stopping bad users from breaking into the system, i.e. taking preventive measures so that the system is not attacked by any intruder.
* If an intruder breaks into the system even after preventive measures are taken, there must be provision to trace the intruder so that he is punished accordingly.
* Security measures should interfere with daily life less than the loss they prevent is worth.

Security becomes an issue of concern because there are certain vulnerabilities in the system.
* There are bad programs, i.e. programs with bugs, which lead to insecurity.
* There are bad people destroying the security of the system, for example hackers. They may give good programs instructions that destroy the security of the system.
* There are bad people corrupting communications. They may intercept or alter data during communication or transfer.
* There are threats, i.e. adversaries that can and want to exploit the vulnerabilities.

Securing any system is expensive. This is because people misbehave.

There are three elements of security.
* Policy : It specifies the security requirements of a system, i.e. which security aspects are to be considered in a particular system. What is the secured system supposed to do ?
* Mechanism : This deals with the actual implementation of the security measures, i.e. how does a secured system work ?
* Assurance : It establishes the correctness of the security, i.e. does the system really work in a secured fashion ?

2. Goals of Security

Normally four goals of security are the targets of any secure system.

Secrecy : The secrecy of the information in the system should be maintained, i.e. controlling who gets to read the information in the system is the basic goal of security.
Integrity : The information in the system is normally shared and is used by various users in various ways. Integrity deals with controlling how information changes and how resources are used.

Availability : A secure system must provide prompt access to data, information and resources to valid users. There should be no delay in transmission due to the security constraints.

Accountability : The secured system should know who has had access to information or resources.

Certain defensive strategies can be followed to have a secure system.

Isolate : Simply keep everybody away from the secured data, i.e. disconnect the machine on which the secure data lies. This is practically not possible.

Exclude : Keep intruders away from the system by having strong firewalls and code signing mechanisms.

Restrict : This allows intruders to get into the system but prevents them from damaging it. This is the toughest task in security. Some techniques used for achieving this are access control and sandboxing.

Recover : This technique allows recovery from loss of data. It helps with integrity. It can be done by keeping backup systems and restore points from which data can be restored.

Punish : Trace the intruders, for example through audit trails, and punish them.

3. Design Issues of Security

Security is an important aspect of a distributed system. The main issues in the design of security systems are explained in brief below.
* Focus of control
* Security mechanisms
* Simplicity

* Focus of control

There are various ways in which protection against security threats can be provided.

a) The first way deals with the protection of data against wrong or invalid operations. This type of protection is mostly done in database management systems.

b) The second method protects the data against unauthorized invocations.
For example, some clients are given access to some data and others are not, so proper access control mechanisms are employed so that only the proper users access the data.

c) In the third method, data is protected by checking the role of the invoker.

Thus the focus of control can be on
* Protection against invalid operations.
* Protection against unauthorized invocations.
* Protection against unauthorized users.

Fig. 1.23 Security related to focus of control

* Security mechanisms

Every network has a layered model, and a distributed network is layered as well. The security mechanisms can be provided in various layers. The layered model of a distributed system looks somewhat as shown in Fig. 1.24.

* In a distributed environment, there may be specialized servers which are shared by all the users. These servers provide certain services necessary for the smooth functioning of the system.
* There are servers and clients in the network. The clients request services from the server and the server responds with messages. There has to be an effective way of interprocess communication. Interprocess communication is explained in detail in chapter 2.
* The world wide web is a very big sharing medium. We will discuss the world wide web in detail below.

World Wide Web : The world wide web is a big international network containing information on virtually any topic. The internet is a system comprising fibre optic cables, routers and computer servers. The computers are linked through the internet. There are thousands of web links throughout the world. The network is grouped into various domains, specifically
* Geographical domain
* Functional domain
* Educational domain
* Government domain
* Commercial domain.
The site addresses of various servers can be given by their geographical placement across the globe. For example, international addresses can be identified by the country suffix.
France - .fr
Great Britain - .uk
Germany - .de
Norway - .no etc.

The educational domain sites have the extension .edu. The government domain (.gov) and the commercial domain (.com) are gaining a lot of popularity in the current era. The world wide web is accessed through suitable software. Commonly used software includes
* Internet Explorer and
* Netscape Navigator.

* Redundancy

Failures can be tolerated, and the services kept available to the users, by adding some redundant components to the system. For example,
i) There could always be one backup server.
ii) Data can be replicated on various servers. If there is a fault on one server, a replacement can be made immediately.

In a distributed environment, if one of the components fails, only the work involving the failed component is affected. The rest remains unaffected and work continues.

1.10 System Models

The system models of distributed systems are classified into three parts.
1. Architectural models
2. Fundamental models
3. Client-server models

The explanation of each of the models is given below.

1.10.1 Architecture Models

In a distributed system, the architectural model is concerned with the placement of the different parts of the system and the relationships among them. Since there are various components in a distributed system, they must be structured in such a way that the present and future demands on the system are met.
So an architectural model will first simplify and abstract the individual components of the distributed system. Then it will consider the placement of the components in the network, and finally it will also consider the interrelationships between the components.

There are basically three architectural models for distributed systems.
A) Workstations/servers model
B) Processor pool (thin client) model
C) Integrated model.

We will elaborate in brief on all the above models.

A) Workstation Model

The main components of the workstation model are
a) Workstations
b) Application programs
c) Various servers
d) Interface
e) Unique user ID
f) Filestore.

In a workstation model, each user has a workstation. There can be a network of workstations or a cluster of workstations. The application programs run on the workstations. Various services are provided by the distributed system, such as directory management, authentication, news, printing, gateway, mail etc. Specialized servers are responsible for providing the designated services in the workstation model. The workstations may share a common set of resources or a common interface, and may be integrated with each other.

Although there is a network of workstations, a user ID is unique across the entire network. Any user is allowed to use any workstation. The filestore in a distributed system must be system wide; this is a mandatory aspect of the distributed system. Some of the workstations may have their own file stores, but due to the transparency feature of distributed systems, the files must be accessible to other users of the system.

A user can run application programs remotely on other workstations. If a cluster management system is used, then the user can submit jobs transparently to the network of workstations.
Since many jobs run in the distributed environment, the cluster management system handles the resource allocation, scheduling and queueing of the jobs.

Fig. 1.27 Workstation model (common interface, cluster of workstations)

1.10.2 Fundamental Models
Some properties are common to all architectural models. The fundamental models are concerned with the formal description of such properties. Depending on the properties, the fundamental models are classified as
A) The Interaction Model
B) The Failure Model
C) The Security Model.
Every model that is designed should take care of the following points.
* It must specify all the entities in the system.
* It must specify the means of communication among the various entities.
* It should mention the characteristics that affect either the individual or the collective behaviour of the entities.
* It should explicitly state all the assumptions made about the system being considered.
* It must provide the means to make generalizations based on the basic assumptions.
We will elaborate on the various models below.

A) The Interaction Model
The interaction model deals with the interaction among the parts of the distributed system. The system contains various processes, which communicate with each other through various communication mechanisms by sending and receiving messages. The interaction model must be designed so that communication between processes is not unduly delayed.
In a distributed system, communication can take place in various ways.
* There may be various servers providing specific services.
For example,
a) Various file servers
b) Domain name server
c) Network information service provider.
* Many processes may communicate with each other for a specific purpose, for example video conferencing. The processes in such a system communicate with each other in real time.
Various distributed algorithms are designed to make communication between processes more effective.
Since communication is the basic feature of the interaction model, its performance is a matter of concern. The issues related to the performance of communication channels are:
a) Network traffic
Network traffic leads to delay in transmission and reception.
b) Bandwidth
Bandwidth is the total amount of information that can be transmitted in a given time. If several processes share the same bandwidth, transmission and reception may be delayed.
c) Jitter
When a series of messages is transmitted, the time taken to deliver each message may vary. This variation is called jitter. For example, if a sound file is transmitted and the delivery time of the samples varies, the result is noise.
So, the performance of the communication channel is an important issue in interaction models.
There are two types of interaction models:
i) Synchronous models and
ii) Asynchronous models.

i) Synchronous Model
In a synchronous model, strong assumptions about time are made.
* It is assumed that the execution time of every process step has known lower and upper bounds.
* It is also assumed that any message transmitted over a channel is received within a known bounded time.
A synchronous model of a distributed system can be built if the processes are guaranteed sufficient processor cycles and network capacity.

ii) Asynchronous Model
It is very difficult to design a synchronous distributed system with strong time assumptions.
An alternative is the asynchronous model. In this type of model there is no bound on
* Process speed.
* Message transmission time.

B) Failure Model
The failure model of a distributed system must be capable of handling all the failures.
* Failure handling
A distributed system has many processes and many communication channels. Hence there may be failures in processes, in communication channels, or in both.
Three common types of failures occurring in a distributed system are:
* Omission failures
* Arbitrary failures and
* Timing failures
We will discuss these failures briefly.
* Omission failures
If a process or a communication channel fails to perform an action that it is supposed to perform, the failure is classified as an omission failure. There can be
i) Process omission failures and
ii) Communication omission failures.
i) Process omission failures
A process omission failure is said to occur when a process crashes. A crash occurs when a process halts and no longer responds to any request. Other processes running in the system may be able to detect the failure. However, a process that sends a request to the failed process may assume that it is merely slow and remain in the waiting state for a long time. For this reason it is difficult to detect a process crash.
If other processes can detect with certainty that a process has crashed, the crash is called a fail-stop crash. Detection requires a timeout in order to reach a decision. For example,

Fig. 1.30 Fail-stop crash (no response till maximum time limit)

Fig. 1.30 illustrates a fail-stop crash. If a process P requests a certain service from process Q, process P goes into the waiting state.
If, after a time equal to the maximum allowable time (the timeout), there is no response from Q, then P can conclude that a fail-stop crash has occurred in process Q.
ii) Communication omission failures
A communication omission failure is said to occur if a message is lost during transmission.

Fig. 1.31 Communication omission failures

For example, if process A wants to send a message to process B, the transfer goes through buffers. Process A places the data in buffer A, from where it is transmitted to buffer B of process B. Data may be lost in the communication channel. If the data is lost from buffer A, a 'send omission failure' occurs. Similarly, if the data is not received in buffer B, a 'receive omission failure' is said to occur.
Handling failures is very difficult in a distributed system. The following are the features related to failures.
* Tolerating failures
If a system is large, for example the Internet, its clients are designed so that they can tolerate failures. In a large environment it is not possible to handle the failure of every component immediately. For example, if a server is contacted over the Internet and it is not available, the client is informed with a corresponding message. The client is not left waiting but is free to do some other task.
* Detecting failures
Some failures in the system can be detected. In well-written code, error messages may be embedded that are issued when the corresponding conditions arise. But in a distributed environment detecting failures is a challenge. For example, if a remote server crashes, it is difficult to determine the circumstances at that moment.
* Recovery from failure
In some cases, it is possible to recover the data. Recovery means returning the system or the process to its original state after a crash or failure.
Recovering from failures is a challenge in a distributed system. The software must be designed so that, after a crash, the system automatically returns to its original state.
* Masking failures
Some failures, once detected, can be hidden. For example, if data sent from one process to another is lost during reception, it can be retransmitted.
* Redundancy
Failures can be tolerated, and services kept available to users, by adding redundant components to the system. For example,
* There could always be one backup server.
* Data can be replicated on various servers. If there is a fault on one server, a replacement can take over immediately.
In a distributed environment, if one of the components fails, only the work involving the failed component is affected. The rest remains unaffected and work continues.

C) Security Model
We have discussed the security aspect in section 1.7.1. The security of a distributed system can be achieved by
a) Securing the processes of the distributed system.
b) Securing the channels of the distributed system.
c) Protecting the objects against unauthorized access.

Communication
This chapter includes
Interprocess communication : Message oriented communication, stream oriented communication.
Layered protocols : Lower level, transport level and higher level protocols.
Distributed objects : RPC and LRPC, remote method invocation, events and notification.

2.1 Interprocess Communication
In a distributed system, interprocess communication is an important issue to be handled.
This is because, in a distributed system, the processes or tasks need to communicate with each other. In distributed computing there is usually no shared memory, so message passing mechanisms are needed. For example:

Fig. 2.1

Fig. 2.1 shows process X sending a message to process Y, which receives it.
The sending and receiving of a message may be synchronous, i.e. process X waits till process Y receives the message.
The sending and receiving of messages may be asynchronous, i.e. process X proceeds with its further computation after sending the message, without waiting for process Y to receive it.

Fig. 2.6 (message is sent only if Y is running)

iv) Receipt based transient synchronous communication
In this method, the sender is blocked till the message is stored in the local buffer of the receiving host.

Fig. 2.7 (the sender sends the request and waits till it is received; the receiver is running and receives the request)

v) Delivery based transient synchronous communication
In this method, the sender is blocked till the message is delivered to the receiver for further processing.

Fig. 2.8 (the sender sends the request and waits until it is accepted)

vi) Response based transient synchronous communication
In this method, the sender is blocked till it receives a reply from the receiver that the message has been received. By this time the processing of the request might have begun at the receiving end.
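The three synchronous variants above differ only in the point up to which the sender stays blocked. The following is a minimal Python sketch of that idea, not taken from the text: one event is kept per stage, and the sender waits on the stage it has chosen. The names Message, receiving_host and send are hypothetical, and a thread with an in-memory queue stands in for the receiving host and the network.

```python
import queue
import threading


class Message:
    """A request plus one event per stage the sender may block on."""

    def __init__(self, payload):
        self.payload = payload
        self.received = threading.Event()   # stored in the receiving host's buffer
        self.delivered = threading.Event()  # handed to the receiver for processing
        self.replied = threading.Event()    # receiver has produced its reply
        self.reply = None


def receiving_host(buffer, handler):
    """Hypothetical receiver loop: take messages from the local buffer and process them."""
    while True:
        msg = buffer.get()
        if msg is None:                     # sentinel used in this sketch to stop the loop
            break
        msg.delivered.set()
        msg.reply = handler(msg.payload)
        msg.replied.set()


def send(buffer, payload, wait_for, timeout=2.0):
    """Send a message and block until the chosen stage has been reached."""
    msg = Message(payload)
    buffer.put(msg)
    msg.received.set()                      # the message now sits in the receiver's buffer
    stage = {"receipt": msg.received,
             "delivery": msg.delivered,
             "response": msg.replied}[wait_for]
    if not stage.wait(timeout):
        raise TimeoutError(f"no {wait_for} within {timeout} s")
    return msg.reply if wait_for == "response" else None


if __name__ == "__main__":
    buffer = queue.Queue()
    receiver = threading.Thread(target=receiving_host, args=(buffer, str.upper))
    receiver.start()
    send(buffer, "first", wait_for="receipt")          # returns once buffered
    send(buffer, "second", wait_for="delivery")        # returns once handed over
    print(send(buffer, "third", wait_for="response"))  # prints THIRD
    buffer.put(None)
    receiver.join()
```

An asynchronous send would correspond to putting the message in the buffer and returning at once, without waiting on any of the three events.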
Features of message queuing system
* The sender is only given the guarantee that the message will be inserted in the receiver's queue.
* There is no guarantee that the message will be read from the queue.
These semantics lead to loosely coupled communication, with four possible combinations.
i) The sender and receiver both execute during the entire transmission.
ii) The sender may be executing, but the receiver is passive.
iii) The receiver may be executing, but the sender is passive.
iv) Both sender and receiver are passive.
The following primitives are used by the message queuing model.
* Put : Called by a sender to pass a message to the system.
* Get : A blocking call. It removes the longest pending message from the specified queue on behalf of an authorized process.
* Poll : A non-blocking call. If the queue is empty or the specified message is not found, the calling process simply continues.
* Notify : Installs a handler that is called when a message is put into the specified queue.

Architecture of message queuing system
The architecture of a message queuing system consists of the following components.
* Source queue : Queues that are local to the sender are called source queues. Messages can be put only into source queues.
* Destination queue : A message in the local queue carries a specification of where it is to be transferred, i.e. the destination queue.
* Queue names : Since the queues are distributed across machines, the message queuing system should maintain a database of queue names.
* Queue managers : The managers manage the queues and interact with the sending and receiving applications. Queue managers may also operate as routers or relays.
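The four primitives listed above can be illustrated with a small in-memory queue. This is only a sketch under stated assumptions: the class name MessageQueue and its method signatures are hypothetical, a single process holds the queue, and persistence, queue names and queue managers are left out.

```python
import threading
from collections import deque


class MessageQueue:
    """In-memory stand-in for one queue (illustrative only)."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()
        self._handlers = []

    def put(self, message):
        """Append a message; the sender does not wait for it to be read."""
        with self._cond:
            self._items.append(message)
            handlers = list(self._handlers)
            self._cond.notify()
        for handler in handlers:            # run handlers outside the lock
            handler(message)

    def get(self, timeout=None):
        """Block until the queue is non-empty, then remove the longest pending message."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._items, timeout):
                raise TimeoutError("queue stayed empty")
            return self._items.popleft()

    def poll(self):
        """Never block: return the longest pending message, or None if the queue is empty."""
        with self._cond:
            return self._items.popleft() if self._items else None

    def notify(self, handler):
        """Install a handler that is called whenever a message is put into the queue."""
        with self._cond:
            self._handlers.append(handler)


if __name__ == "__main__":
    q = MessageQueue()
    q.notify(lambda m: print("arrived:", m))
    print(q.poll())      # prints None, the queue is still empty
    q.put("m1")          # prints arrived: m1
    q.put("m2")          # prints arrived: m2
    print(q.get())       # prints m1
    print(q.poll())      # prints m2
```

In this sketch the handler installed by notify only observes the message; the message stays in the queue until get or poll removes it.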
2.3.1 Quality of Stream
The quality of service of a stream can be expressed in many ways, for example by providing a flow specification. A flow specification may consist of various parameters, such as
* Bandwidth requirements
* Loss sensitivity
* Transmission rates
* Burst loss sensitivity
* Delays : minimum delay noticed, maximum delay variation, quality of guarantee etc.
The token bucket algorithm, which characterizes the network traffic of a stream, is used in Partridge's model. The important features of the algorithm are
1) A token represents a fixed number of bytes that can be sent through the network.
2) Tokens are generated at a constant rate.
3) A buffer of limited capacity, the bucket, is provided. Tokens are buffered in this bucket.
4) If the bucket is full, newly generated tokens are dropped.
5) If M bytes are to be transferred, tokens with a total size of M bytes are removed from the bucket, which frees the corresponding space in the bucket. If one token is worth n bytes, then M/n tokens are removed.

Fig. 2.14 Token bucket algorithm

The quality of service requirements of a data stream are met by the distributed system by allocating appropriate resources when the stream is set up. For streams, the resources required may be
* Buffers
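The token bucket algorithm described earlier in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the text's own formulation: the class name TokenBucket and its parameters are hypothetical, the bucket is assumed to start full, and the current time is passed in explicitly so that the behaviour is easy to follow.

```python
class TokenBucket:
    """Token bucket sketch; one token permits `bytes_per_token` bytes."""

    def __init__(self, rate, capacity, bytes_per_token=1):
        self.rate = rate                        # tokens generated per second
        self.capacity = capacity                # maximum number of tokens held
        self.bytes_per_token = bytes_per_token
        self.tokens = capacity                  # assumption: the bucket starts full
        self.last = 0.0                         # time of the last refill

    def _refill(self, now):
        elapsed = now - self.last
        # Tokens generated while the bucket is full are discarded by min().
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def try_send(self, nbytes, now):
        """Remove ceil(nbytes / bytes_per_token) tokens if that many are available."""
        self._refill(now)
        needed = -(-nbytes // self.bytes_per_token)   # ceiling division
        if needed > self.tokens:
            return False                        # not enough tokens, data must wait
        self.tokens -= needed
        return True


if __name__ == "__main__":
    bucket = TokenBucket(rate=2, capacity=10, bytes_per_token=100)
    print(bucket.try_send(1000, now=0.0))   # True: uses all 10 tokens
    print(bucket.try_send(100, now=0.0))    # False: the bucket is empty
    print(bucket.try_send(100, now=1.0))    # True: 2 tokens were generated in 1 s
```

With one token worth n bytes, sending M bytes removes M/n tokens (rounded up here), matching point 5 of the list.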
* A parallel port sends and receives eight bits at a time over 8 separate wires, so data is transferred very quickly. Its disadvantage is that the cable is bulkier because of the number of individual wires it contains.
Application of parallel ports : They are used to connect a PC to a printer and are rarely used for much else.
* A serial port sends and receives one bit at a time over one wire.
Advantage : Only a few wires are required. In fact, two-way (full duplex) communication is possible with only three separate wires: one to send, one to receive, and a common signal ground wire.
Disadvantage : More time is required compared with parallel communication.
The serial port is a full-duplex device, meaning that it can send and receive data at the same time. It uses separate lines for transmitting and receiving data.

Communication
The transmitter sends the actual data bits after the start bit has been sent. The number of data bits can vary, and both the receiver and the transmitter must agree on the number of data bits and on the baud rate. Most devices transmit 7 or 8 data bits at a time.
A stop bit is sent after the data has been transmitted. A stop bit has a value of 1, i.e. the mark state. Even if the previous data bit has a value of 1, the stop bit can be detected correctly by its duration.
A small amount of error checking, to help detect data corruption that might occur during transmission, is possible using a parity bit, which may optionally be transmitted along with the data.
The parity can be
* Even parity
* Odd parity
* Mark parity
* Space parity
or none at all.
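The framing just described (start bit, data bits, optional parity bit, stop bit) can be sketched as follows. The function names parity_bit and frame are hypothetical. The sketch assumes the usual asynchronous serial conventions, which the text does not spell out: the start bit is 0 (space) and the data bits are sent least significant bit first. Mark parity always sends 1 and space parity always sends 0.

```python
def parity_bit(data_bits, mode):
    """Return the parity bit for a list of 0/1 data bits."""
    ones = sum(data_bits)
    if mode == "even":
        return ones % 2            # makes the total number of 1 bits even
    if mode == "odd":
        return (ones + 1) % 2      # makes the total number of 1 bits odd
    if mode == "mark":
        return 1
    if mode == "space":
        return 0
    raise ValueError(f"unknown parity mode: {mode}")


def frame(byte, data_bits=8, parity="none"):
    """Build one frame: start bit, data bits (LSB first), optional parity, stop bit."""
    bits = [(byte >> i) & 1 for i in range(data_bits)]
    out = [0] + bits               # assumption: the start bit is 0 (space)
    if parity != "none":
        out.append(parity_bit(bits, parity))
    out.append(1)                  # the stop bit is 1 (mark)
    return out


if __name__ == "__main__":
    # 'A' is 0x41 = 1000001 in binary, which contains two 1 bits.
    print(frame(0x41, data_bits=7, parity="even"))  # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
    print(frame(0x41, data_bits=7, parity="odd"))   # [0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
```

Both ends must use the same number of data bits and the same parity mode, otherwise the receiver will misread the frame.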
For odd or even parity, the number of marks (logical 1 bits) in each data byte is counted, and a single bit is transmitted following the data bits to indicate whether the number of 1 bits just sent is even or odd.
RS-232 stands for Recommended Standard number 232; the suffix C denotes a revision of the standard. A subset of the RS-232C standard is being used by the serial ports

The Binary Synchronous Communication (BSC) protocol is a character-oriented data link layer protocol used for half-duplex applications. BSC is an IBM link protocol, announced in 1967. In BSC, common link management rules could be used with three different alphabets for encoding messages. USASCII with 128 characters and EBCDIC with 256 characters looked forward.

Character Oriented Protocols (COP)
In character oriented protocols, every character has significance. When a character arrives at the receiver it can be either
* a data byte or
* a control byte.
A control byte is used as an information signal between the sender and receiver.
The main COP in use today is known as BI-SYNC, or binary synchronous. Each character sent is transmitted using the ASCII code. Communication takes the form of a handshake between the sender and receiver.
Communication of a message from sender to receiver takes the following format.

Fig. 2.18 Communication using character oriented protocols (the receiver replies "OK, send next data"; the sender then sends the next data)
Two DTE devices communicate with each other at the network layer. This communication is through packets, and the passage of information between the devices depends on X.25. The device playing the role of DTE is called a packet assembler/disassembler (PAD).

Fig. 2.21

Thus limited network capacity is intelligently shared between two users with the help of the X.25 PLP. This protocol gives users the following facilities.
* To communicate with remote DTE devices.
* To communicate with remote end users.
* To own a piece of network capacity temporarily. This network capacity is of a determined type and quality.
* To recover from network errors.
The main feature of X.25 is DTE multiplexing : Virtual Circuits (VC)
There are two types of virtual circuits
1) Permanent VC (PVC)
2) Switched VC (SVC)
X.25 provides VC initialization and termination. A PVC is available to the DTE as a permanent resource. An SVC is established every time a resource is required.

DTE/DCE reset confirmation : This packet flows from DTE towards DCE. It informs the DTE or DCE that sequence numbering has been initialized.
DTE restart request : This packet flows from DTE towards DCE. It requests the clearing of all calls on the interface.
DTE restart indication : This packet flows from DCE towards DTE. It informs the DTE that the network will be clearing all the calls on this interface.
Diagnostic : This packet flows from DCE towards DTE. It provides network diagnostic information, typically following error recovery.
DTE interrupt : This packet flows from DTE towards DCE.
DCE interrupt : This packet flows from DCE towards DTE.
DTE interrupt confirmation : This packet flows from DTE towards DCE. This response is from the DTE.
DCE interrupt confirmation : This packet flows from DCE towards DTE. This response is from the DCE.
DTE data : This packet flows from DTE towards DCE. It carries the user data or higher-layer protocol from the source DTE.
DCE data : This packet flows from DCE towards DTE. It carries the user data or higher-layer protocol passed from the remote DTE by way of the local DCE.
DTE RR : This packet flows from DTE towards DCE. It specifies the acknowledgement of data packets and permission to send more from DCE.
DCE RR : This packet flows from DCE towards DTE. It specifies the acknowledgement of data packets and permission to send more from DTE.
DTE RNR : This packet flows from DTE towards DCE. It represents the DTE's request for the DCE to shut off the flow of data packets.
DCE RNR : This packet flows from DCE towards DTE. It represents the DCE's request for the DTE to shut off the flow of data packets.
DTE reject : This packet flows from DTE towards DCE. It specifies the request to the DCE to retransmit data packets beginning from a specified sequence number.
Registration : This packet flows from DTE towards DCE. It specifies the user's request to the network to subscribe to specified network capabilities.

synchronise their dialogue.
They are also provided with the means to manage their data exchange. The session layer provides services to establish a session connection between two presentation entities. Interactions are possible after the connection is established.
Basically there are two types of sessions:
* Physical session : A physical connection may be a call or a connection through a network.
* Logical session : Communication between two application-level processes is done by means of "sessions". A session is perceived as one logical interaction or a series of such interactions in parallel.
One logical session may map onto a number of physical connections, or a number of logical sessions may map onto a single physical connection.

Functions of the session layer
* Connection establishment phase
* Data transfer phase
* Connection release phase
* Negotiation of the facilities that both ends support
* Expedited data exchange
* Session connection synchronisation
* Orderly release or abort.
Session protocols include ISO 8327, CCITT X.225, ECMA 75 and CCITT T.62, which is intended for use in teletex services.
We will elaborate on the ISO-SP session layer protocol in this section.

ISO-SP : OSI session layer protocol (X.225, ISO 8327)
This layer protocol is responsible for
* Session management : It manages the opening and closing of sessions.
* Connection recovery : If a connection is lost, it tries to recover the connection.
* Closing the connection : If a connection is not used for a long period, the session layer may close it down and re-open it for the next use.
* Synchronization : It provides synchronization points in the stream of exchanged packets.
The protocols specified in the session layer are carried out by a machine called the Session Protocol Machine (SPM). This machine communicates with the Session Service user (SS-user) through a Session-Service-Access-Point (SSAP). Certain service primitives are provided to carry out the services.
3) TELNET
An application can be connected to the workstation by a terminal emulation program called TELNET. This communication is done over a TCP/IP link.
Features of Telnet
* It is a client/server application. The server runs on a host, and the applications run on the server.
* The server passes information between the applications and the Telnet clients.
* The well-known port number for Telnet servers is TCP port 23.
* Telnet clients must convert the user data between the form in which it is transmitted and the form in which it is displayed.
Since Telnet uses TCP, neither the client nor the server need be concerned about re-sending lost data or about error checking.
Advantages of Telnet
1. The Telnet protocol is very simple.
2. No special format is required for TCP segments that contain commands. The data is normally sent as 7-bit ASCII values.
The principal Telnet commands are:
* WILL : Sender wants to enable the option.
* WONT : Sender wants to disable the option.
* DO : Sender wants the receiver to enable the option.
* DONT : Sender wants the receiver to disable the option.
There are many more application layer protocols, like Finger, FTP (File Transfer Protocol), HTTP (Hyper Text Transfer Protocol) etc.

2.5 Remote Procedure Calls
Message passing is the basic communication mechanism used in distributed systems.
RPC : When a process on machine X calls a procedure on machine Y, the calling process on machine X is suspended. The called procedure is then executed on machine Y.
The results of the procedure are returned to the process on machine X, and the suspended process then resumes. This method is called an RPC (Remote Procedure Call).

2.6.1 Types of Objects
a) Distributed objects
Objects encapsulate
* Data and
* Operations on data
The data is called the state, and the operations on the data are called methods. The methods of an object are invoked to access the object. This invocation is done through an interface. Several objects may implement the same interface, and an object may implement several interfaces.
In distributed systems, the separation between interfaces and the objects implementing these interfaces is an important issue. The separation allows an arrangement that places
* the interface on one machine and
* the object on another machine.
This type of organization is called a distributed object.

Fig. 2.29 Organization of distributed object (server machine)

An implementation of the object's interface is loaded onto the client machine. This implementation is called a proxy. The only work the proxy has to do is to marshal method invocations into messages, unmarshal the reply messages, and return the result of the method invocation to the client.
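The role of the proxy can be illustrated with a short Python sketch. It is not the text's implementation: the names Counter, skeleton and Proxy are hypothetical, JSON is chosen here as the marshalling format, and a plain function call stands in for the network between the client and server machines.

```python
import json


class Counter:
    """Server-side object: state (value) plus a method (add)."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


def skeleton(obj, request_bytes):
    """Hypothetical server-side stub: unmarshal the request, invoke, marshal the reply."""
    request = json.loads(request_bytes)
    result = getattr(obj, request["method"])(*request["args"])
    return json.dumps({"result": result}).encode()


class Proxy:
    """Client-side proxy: offers the object's methods, but only marshals and unmarshals."""

    def __init__(self, transport):
        self._transport = transport    # callable: request bytes -> reply bytes

    def __getattr__(self, method):
        def invoke(*args):
            request = json.dumps({"method": method, "args": args}).encode()
            reply = self._transport(request)       # stands in for the network hop
            return json.loads(reply)["result"]
        return invoke


if __name__ == "__main__":
    server_object = Counter()
    counter = Proxy(lambda request: skeleton(server_object, request))
    print(counter.add(2))   # prints 2
    print(counter.add(3))   # prints 5, the state lives on the server side
```

The client calls counter.add(...) as if the object were local; the state itself is held only by the server-side object.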
In this method, the client first binds to an object and can then invoke the methods of the object through a proxy.
In static RMI, the interface of an object must be known when the client application is developed. The client application has to be recompiled if the interface changes. Normally an interface is defined in an interface definition language (IDL).
Dynamic RMI
When the method invocation is composed at runtime, it is called dynamic RMI.
Examples of RMI : In this section we will see two examples of RMI.
1) DCE RMI and
2) Java RMI
1) DCE RMI
In DCE, distributed objects are specified in the interface definition language. The objects also have C++ bindings, which are likewise specified in IDL.
In DCE the server creates C++ objects locally and then makes their methods available to remote clients.
DCE supports two types of distributed objects.
* Distributed dynamic objects : Such an object is created by the server for a specific client and is accessible only to that client. To create the object, the client sends a request to the server.
* Distributed named objects : These objects are created by the server and are not intended for a specific client. They are shared by several clients. Named objects are registered in the directory. A client looks up the object and then requests it from the server, using the unique identifier assigned to the named object.
Remote invocation in DCE is done using RPC.
a) The client invokes a method of an object.
b) It passes
* the object identifier,
* the interface identifier and
* the method identifier
to the server,
Object of interest : When operations on the object are invoked, the object changes its state. Other objects may be interested in this change of state. If the object transmits notifications, it becomes a part of the event service.
Event : The completion of a method execution results in the occurrence of an event.
Notification : Information about an event is kept in this object. It holds the type of the event and its attributes, such as
a) Identity of the object.
b) Method invoked.
c) Time of occurrence.
d) Sequence number etc.
Subscriber : An object that subscribes to some types of events in another object is called a subscriber. It receives notifications about such events.
Observer objects : The observer decouples an object of interest from its subscribers. The main roles of an observer are:
* Forwarding : A forwarding observer sends notifications to subscribers on behalf of one or more objects of interest.
* Filtering of notifications : An observer applies filters to reduce the number of notifications received, according to some predicate on the contents of each notification.
* Patterns of events : A subscription to an object of interest can specify the patterns of events the subscriber is interested in.
* Notification mailboxes : An observer may take the role of a mailbox that receives notifications on behalf of a subscriber.
Publisher objects : A publisher object declares that it will generate notifications of particular types of events.
The architecture shows 3 cases.
1) Object of interest inside the event service without an observer.
2) Object of interest inside the event service with an observer.
3) An object outside the event service.
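The roles described above can be sketched in Python for the first two cases: a subscriber attached directly to an object of interest, and a subscriber attached through a filtering observer. All class names here (Notification, Subscriber, Observer, Account) are hypothetical and everything runs in one process; a real event service would deliver the notifications by remote invocation.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Notification:
    source: str      # identity of the object of interest
    method: str      # method whose completion caused the event
    sequence: int    # sequence number


class Subscriber:
    """Receives notifications about the events it subscribed to."""

    def __init__(self):
        self.inbox = []

    def notify(self, notification):
        self.inbox.append(notification)


class Observer:
    """Filtering observer: forwards only the notifications that satisfy a predicate."""

    def __init__(self, subscriber, predicate):
        self.subscriber = subscriber
        self.predicate = predicate

    def notify(self, notification):
        if self.predicate(notification):
            self.subscriber.notify(notification)


class Account:
    """Object of interest: each completed method execution produces an event."""

    def __init__(self, name):
        self.name = name
        self.balance = 0
        self._listeners = []
        self._sequence = count(1)

    def subscribe(self, listener):
        self._listeners.append(listener)

    def _publish(self, method):
        note = Notification(self.name, method, next(self._sequence))
        for listener in self._listeners:
            listener.notify(note)

    def deposit(self, amount):
        self.balance += amount
        self._publish("deposit")

    def withdraw(self, amount):
        self.balance -= amount
        self._publish("withdraw")


if __name__ == "__main__":
    account = Account("acct-1")
    direct, filtered = Subscriber(), Subscriber()
    account.subscribe(direct)                                        # without an observer
    account.subscribe(Observer(filtered, lambda n: n.method == "withdraw"))
    account.deposit(10)
    account.withdraw(3)
    print([n.method for n in direct.inbox])     # ['deposit', 'withdraw']
    print([n.method for n in filtered.inbox])   # ['withdraw']
```

The observer keeps the object of interest unaware of what each subscriber wants, which is the decoupling the text refers to.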
Characteristics of file systems (Non-distributed)

Files are stored on disks or other storage media. It is the responsibility of the file system to organize, store, share, name, retrieve and protect the files.

* Organize and store
A file consists of data and attributes. The data items are read and written mostly in a sequential fashion. The attributes comprise file length, creation timestamp, read timestamp, write timestamp, attribute timestamp, reference count, owner, file type, access control list etc. Most of the attributes, like file length and creation timestamp, are managed by the file system. The users are not allowed to update these attributes.

* Naming
The file system supports naming of files with the help of directories. The directory module maps the text names of files to internal file identifiers. The directories may include subdirectories and files. A hierarchical model is followed.

* Access control
The file system takes care of access control. Unauthorized users are not allowed to access the files.

File system operations are -
* Read a file.
* Write the contents on to a file.
* Update the file.
* Copy a file.
* Transfer n bytes of data from a file.
* Open a file.
* Close a file.
* Move file pointer to required position.
* Get the file attributes.

3.2 Distributed File Systems

The main idea of a distributed file system is to help in sharing of information in the distributed environment.
client copies the file to its local disk and performs the required operations on the file. After its use, the file is returned to the server machine. The implementation details of the file are with the client machine. The server has two versions of the file available with it : the old file which was requested by the client and the new file which is being modified by the client. Since the file is downloaded by the client before carrying out any operations on it and uploaded back to the server afterwards, this type of model is called the upload/download model.

Though NFS has been implemented for many operating systems, the UNIX based version is the most popular one. Fig. 3.4 shows the basic architecture of SUN NFS.

Fig. 3.4 NFS architecture for UNIX system

Fig. 3.7 shows the diagram consisting of 2 servers and one client. In the client there are 2 file systems mounted. The file system /user/employee is actually the subtree located in server 1's person directory. The file system mounted at /user/management is actually the subtree located at /NFS/users in server 2.

NFS allows the client to mount only part of a file system instead of mounting an entire file system.

There may be directories which are exported by various servers. The NFS server can itself mount such directories. But such directories cannot be exported to its clients.

e) Synchronization in NFS

Sharing is an important aspect of any distributed system. There is no distributed system without sharing. Files are normally shared.
In a single processor system, a file is opened, modified and stored back with consistent data. This consistency is not automatically achieved when files are shared. When two clients access the same file from the server, the result depends on the order of events.

Case 1 :
* Process x reads file a from the server, makes the changes and sends it back.
* Process y reads the same file after a certain delay. It gets the modified version of the file.
This was the case when successive reads had a time span between them.

Case 2 :
* Process x reads file a from the server and stores a copy of the file in its local cache, and the processing is going on.
* Meanwhile process y reads file a from the server and copies it to its local cache.
Now, the process y gets the old contents of the file. The changes are visible only when the file is closed. When process x closes the file, subsequent reads of the same file get the new data. This concept is called session semantics.

When there is concurrent access as in case 2 above, the final result depends on the order in which the server performs the close requests. If the server performs the close request of process x first, the file then contains the changes made by process x.

Transactions are another way to deal with concurrent file access.

f) Caching in NFS

The SUN NFS system uses the concept of caching the data. There are two types of caching
* Client caching
* Server caching

Fig. 3.12 Duplicate request cache

In such type of communication, three types of situations may occur.
Situation 1 :
i) The client sends a request and starts a timer while sending it.
ii) The timer may go off before the reply comes in. If it does not, then the transaction has completed correctly. If the timer goes off before the reply comes in, the client retransmits the request. The server receives the second request, but since it has not yet completed the previous transaction it ignores the retransmission.

Fig. 3.13

Situation 2 :
i) The retransmitted message is received by the server at the same time as it sends its response. The server still ignores the retransmitted request, since it is a duplicate and the response has already been sent to the client.

Situation 3 :
i) The reply sent by the server may actually be lost. The request is then retransmitted, and the server replies with the cached results.

Fig. 3.18 Internal organization of a workstation

| 32 bits Rvid | 64 bits handle |

Fig. 3.21 96 bit identifier of a CODA file

When a file is to be located, the Rvid of the file identifier is sent by the client to the volume replication database. In turn this replication database sends a list of Vids that are associated with the Rvid that was sent.
The client then looks for the server that has the current replica by passing a Vid to the volume location database. It returns the actual physical volume. Fig. 3.22 shows the procedure.

Fig. 3.22 CODA file implementation

e) Synchronization in CODA file system

CODA supports transaction semantics. In a large distributed system, servers may fail and hence the required file may be unavailable. The CODA file system searches for a replica of the file and allows the client to modify it. The modification is propagated back to the server when it recovers from the failure. This is possible because the clients have files that are cached locally.

File sharing in CODA FS is somewhat similar to NFS. Suppose client X requests a read operation on file A. The server sends the file to the client and keeps a record of this. Meanwhile, if client Y requests the same file, the server again sends a copy of the file to client Y. This is possible because client X requested only a read operation. This is shown in Fig. 3.23.

If client X had requested a write operation, then the copy of the file would not be sent to client Y, as the server had recorded the client and the file while transmitting.

h) Security in CODA file system

The security aspects of the CODA file system can be classified in two parts
a) Security of the channel between client and server.
b) Providing access control to files.

The client and server communicate through a secure RPC mechanism in the CODA file system.
If a client wants to communicate with the server, it sends its identity along with a challenge message to the server. The challenge message is in encrypted form. The client also sends an encryption key. The server accepts the challenge, decrypts the message with the secret key that is shared by the client and the server, and thus proves its identity. Now the server sends a challenge message along with its identity and the secret key. The client should decrypt the message and prove its identity. Once the authentication is proved, the server sends a session key which is used for further communication. A similar secure channel is maintained between the Venus client and the Vice server of the CODA file system.

Fig. 3.28 RPC2 mutual authentication

i) Access control

Access control in the CODA file system is achieved by various access control operations. Some of them are -
* Administer : This modifies the ACL of the directory.
* Lookup : It shows the lookup status of any file.
* Write : Modify any file in the directory with the write operation.

d) Synchronization

Plan 9 implements the same semantics as that of UNIX file sharing. When the client opens the file, it is sent from the server. The client has the copy of the file in its local cache. The same copy of the file may also be needed and loaded by various other clients. All the clients are modifying the same file. The client which closes the file first and intimates the changes or modifications to the server actually changes the file.
The changes made by the other clients are not taken into account.

e) Caching and Replication

In Plan 9 there is minimal support for caching and replication.

f) Security

Secure channels are maintained and the user is authenticated in the same way as in the CODA file system. This is done by sending secret keys.

If a client wants to communicate with the server, it sends its identity along with a challenge message to the server. The challenge message is in encrypted form. The client also sends an encryption key. The server accepts the challenge, decrypts the message with the secret key that is shared by the client and the server, and thus proves its identity. Now the server sends a challenge message along with its identity and the secret key. The client should decrypt the message and prove its identity. Once the authentication is proved, the server sends a session key which is used for further communication.

Fig. 3.31 Plan 9 mutual authentication

The client asks the SFS server for its public key in order to authenticate it. It then computes the HID. This HID is compared with the value found in the path. If it matches, the client knows that it has established communication with the correct server.

The server also has three components
* Server of NFS : This component communicates with the server of SFS.
* Server of SFS : It handles file requests from the clients of SFS.
* Authentication server : It handles the user authentication issue.
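The HID check described above, where the client recomputes the host identifier from the server's public key and compares it with the value embedded in the path, can be sketched as follows. This is a simplified illustration and not the actual SFS wire format: the path layout `/sfs/<location>:<hid>`, the function names, and the exact bytes fed to the hash are all assumptions made for the example.

```python
import hashlib


def host_id(location: str, public_key: bytes) -> str:
    """Derive an identifier from the server location and its public key.

    Assumption: SHA-1 over "location NUL key". The real SFS input layout
    and encoding are not given in the text and may differ.
    """
    digest = hashlib.sha1()
    digest.update(location.encode("utf-8"))
    digest.update(b"\x00")
    digest.update(public_key)
    return digest.hexdigest()


def make_path(location: str, public_key: bytes) -> str:
    """Build a hypothetical self-certifying path for a server."""
    return f"/sfs/{location}:{host_id(location, public_key)}"


def verify_server(path: str, presented_key: bytes) -> bool:
    """Return True if the key presented by the server matches the path."""
    prefix = "/sfs/"
    if not path.startswith(prefix):
        return False
    location, sep, expected_hid = path[len(prefix):].partition(":")
    if not sep:
        return False
    return host_id(location, presented_key) == expected_hid


genuine_key = b"-----example public key bytes-----"
path = make_path("files.example.org", genuine_key)

print(verify_server(path, genuine_key))         # prints "True"
print(verify_server(path, b"some other key"))   # prints "False"
```

The point of the design is that the name itself carries enough information to check the server's key, so the client needs no separate trusted directory to decide whether it is talking to the right server.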
d) Security

The file system security issue is separated from the user authentication issue in SFS. The method explained above in the naming section is used by the client and server to authenticate each other. Transparency is achieved by using symbolic links. The user or host ID is hidden by creating symbolic links.

iii) XFS Serverless file system

XFS is a serverless file system. Since there is no server, where the file system should reside is an issue of concern. So the entire file system is distributed across multiple machines. These machines include the clients also. In a LAN the machines are connected through links. XFS was designed to operate on a local area network. Since the entire file system is distributed, higher scalability and fault tolerance are achieved as compared to a traditional file system.

XFS architecture consists of three components.
i) A storage server
ii) A metadata manager
iii) A client.

i) A storage server
A process that stores parts of a file is called a storage server.

ii) A metadata manager
The process which keeps track of the actual storage location of file data blocks is called a metadata manager. Since the data blocks may be scattered over multiple storage servers, the metadata manager forwards requests from clients to the respective storage server.

iii) A client
In XFS, a client is a process that accepts user requests to operate on files. Any machine can play the role of client, storage server and manager.

6) The client uses the segment identifier and offset to find the exact server and passes it to the server.

Fig.
3.38 Procedure for reading a block of data in XFS

c) Naming

No special naming procedure is adopted in XFS. Only certain identifiers are used, which are already explained in the above topic. Various components are
* Stripe group map
* File directory
* File identifier
* imap
* inode
* Manager map
* Log address

During migration the objects dynamically move among various nodes in the system. When the location of an object changes, its name should not change. This property is called location independency. Location independent names must have the following features.
* Knowledge of the physical location must not be required during the access of the object.
* An access request can be made by any object without knowing its own physical location.

d) Group naming
Many different objects can be identified by the same name in a good naming system. This facility is used to group the objects for a particular application.

e) Uniform naming convention
A uniform naming convention must be used for all types of objects in a good naming system.

f) Multiple user defined names for the same object
A good naming system must allow multiple user defined names for the same object.

g) Meaningful names
Meaningful names related to the functionality of the object must be assigned to the objects. Meaningful names are easy to remember and also to use. So a good naming system must provide meaningful names for its objects.

h) Performance
The amount of time required to map an object's name to its attributes, such as its location, indicates the performance of the naming system.
This time depends largely on the number of messages exchanged in the name mapping operations. A good naming system must keep the number of messages to be exchanged small.

i) Replication transparency
A good naming system should support the use of multiple copies of the same object. This must be done in a user-transparent manner. Replicas enhance
* System performance and
* System reliability.

j) Fault tolerance
A good naming system should continue functioning in case of failure. At least, faults should be tolerated to some extent.

Example : In SSP chains, a (proxy, skeleton) pair is used to represent the forwarding pointer. Here a proxy is a stub and a skeleton is a scion. Thus a (stub, scion) pair is used.

The server side stub is the skeleton. It may contain
* Local reference to the actual object or
* Local reference to a proxy.

For remote references, skeletons are entry items and proxies are exit items.

Fig. 3.43

When an object moves from address space X to Y, it installs a proxy (i.e. client side stub) at X and a skeleton (server side stub) at Y. The client's request is thus forwarded across the chain towards the actual object.

Disadvantage : Scalability problems arise.

c) Home based approach : A home location keeps track of the current location of an entity. Mobile IP uses the home based approach. A fixed IP address is used by each host. Each host has a home agent through which it communicates. The home agent receives a packet for the mobile host. It then checks the current location of the host; if it is on the local network, the host receives the message.
The host's current location is informed to the sender of the message.

Drawbacks :
* Communication latency is increased, as the home agent is the intermediate point through which transfers take place.
* This approach uses a fixed home location.

The DNS naming data are divided into zones. Each zone has
a) Attribute data for names in a domain.
b) The names and addresses of at least two name servers. These servers provide authoritative data for the zone.
c) The names and addresses of servers which hold the data for subdomains.
d) Zone management parameters, like those governing caching and replication of zone data.

The DNS database is a collection of files. The most important file contains all the records of all the nodes in a particular zone.

* The DNS implementation achieves short response times for lookups. It does this by :
a) Partitioning naming data
b) Replicating naming data
c) Caching naming data.
* Naming data can be inconsistent in DNS.

DNS assumes one mail service per addressed domain. Users need not include the name of this service explicitly in names.

Disadvantages
* DNS stores a limited variety of naming data.
* It is rigid with respect to the structure of the name space.
* The name space cannot be customized to meet local needs.

2) Case Study of Global Name Service GNS

The GNS was designed to provide facilities for
* Resource location.
* Mailing address and
* Authentication.

Design issues of GNS
* The name service must support a naming database holding the names of millions of computers and the email addresses of billions of users.
* The structure of the name space may change during the evolutionary growth of the network.
* The services provided must accommodate the above change.

* Each part of the partitioned directory is called a zone. A DSA implements various directory services and it is like a normal server.

Fig. 3.57

The DUA exchanges information or communicates with the DSA using standard protocols.

Searching through DIB

Suppose we are interested in listing all main servers of KKWIEER. The search operation is

answer = search("&(C=India)(O=KKWIEER)(OU=*)(CN=Mainserver)")

In this example we are looking in country India, in which we search the organization KKWIEER. We are not interested in the organization unit, so there is a * in the operation. Thus each result that is returned has the CN attribute Mainserver.

Question Bank 3

1. List three advantages that a file service provides in a distributed system, compared to only using the local file system on each host.
2. In terms of distributed file system requirements, briefly explain each of the following :
a) Access transparency
b) Location transparency
c) Mobility transparency.
Universal Coordinated Time (UTC) and leap seconds

Coordinated Universal Time, abbreviated as UTC, is set from atomic clocks. It is an international standard for timekeeping. UTC began in 1961. UTC is a discontinuous time scale composed of segments that are linear transformations of atomic time. UTC is corrected TAI (International Atomic Time).

Timing signals set to UTC are broadcast via radio, and satellite systems such as GPS provide them as well. Computers with the appropriate (and expensive) receivers attached can synchronize their clocks with UTC. Receivers are available commercially. Accuracy varies with weather conditions. UTC signals take time to propagate.

Now let us see why clock synchronization is important in a distributed environment.

4.3 Clock Synchronization

Synchronization between processes is critical in a distributed environment.

Consider the example of airline booking. If there are two transactions for reservation requests, each may be satisfied depending on the availability of seats. If there are not enough seats left for both, then the transaction with the earliest timestamp wins and the seat will be reserved for that request. Here we do not need fine-grained accuracy; what we need is a timestamp ordering convention. If there is a tie, such as equal timestamps, then the IP address or process ID can be used to break it.

Consider the example of UNIX make. Suppose a make (compile and link) involves many components, which are edited on distributed computers.
If a component is edited immediately after a make, but on a computer with a slow clock, the edited source may be given a timestamp earlier than that of the make. On the next make, this component is then not recompiled.

The most accurate clocks are atomic oscillators (one part in 10¹³). Most clocks, like mechanical watches, are less accurate. Computers use crystal-based clocks (one part in a million), which are also less accurate. For crystal-based clocks the drift is about 10⁻⁶ seconds/second, giving a difference of 1 second every 1,000,000 seconds (or 11.6 days). The drift rate of a "high precision" quartz clock is about 10⁻⁷ or 10⁻⁸ seconds/second.

Computer clocks tend not to be in perfect agreement. Two factors are involved. Clock drift means that clocks tick at different rates. Clock skew is the difference between two clocks at one point in time.

In some systems, like real time systems, the actual clock time is important. It is sometimes necessary to synchronize clocks with real-world clocks. As clocks never run at the same rate, there is a need to synchronize physical clocks with each other.

Every machine broadcasts the current time when each interval starts. But these broadcasts will not be precisely simultaneous, as the speed of the clock on different machines may be different. A machine starts its local timer after broadcasting and collects the other arriving broadcasts during the interval. It then computes a new time after receiving the broadcasts from all the others. The simplest algorithm is to average all the values it receives. There are variations for computing the time.
In the first variation, the n highest and n lowest values are discarded and then the average of the remaining values is taken. This is done because there may be faulty clocks sending extreme values. In the second variation, an estimate of the propagation time from the source is added to correct each message.

NTP (Network Time Protocol) is described in Mills (1992). It achieves accuracy in the range of 1-50 msec.

Use of synchronized clocks

Nowadays clock synchronization to within a few milliseconds is possible because of new technology. The software and hardware are readily available. Let us take the example of enforcing at most once message delivery to a server, even in case of failure.

Traditional approach

Each message has a unique number assigned to it. Each server has to store the numbers of all the messages it has received. These are used to distinguish a new message from a retransmission. But if the server crashes in between, the message numbers are lost. Also it is necessary to decide the period for which the numbers must be saved.

Time approach

Each message carries a connection identifier, chosen by the sender, along with a timestamp. The server stores, in a table, the most recent timestamp it has received for each connection. When there is any incoming message whose timestamp is lower than the stored timestamp for that connection, it is considered a duplicate and is rejected.

Each server continuously maintains a global variable for removing old timestamps. The value of the global variable is calculated by the following formula

(Global variable) G = Current time - Maximum lifetime - Maximum clock skew.
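The time approach and the role of G can be sketched in a few lines. This is a minimal sketch under stated assumptions, not a definitive implementation: the class name `TimestampFilter` and its methods are hypothetical, time is passed in explicitly so the example is deterministic, a timestamp equal to the stored one is treated as a retransmission, and a message on an unknown connection is accepted only if its timestamp is not older than G (the usual reason for keeping G, which the text above does not spell out).

```python
from typing import Dict


class TimestampFilter:
    """At-most-once delivery using per-connection timestamps."""

    def __init__(self, max_lifetime: float, max_skew: float) -> None:
        self.max_lifetime = max_lifetime  # longest time a message can live
        self.max_skew = max_skew          # largest clock difference allowed
        self.latest: Dict[str, float] = {}  # connection id -> newest timestamp

    def cutoff(self, now: float) -> float:
        """G = current time - maximum lifetime - maximum clock skew."""
        return now - self.max_lifetime - self.max_skew

    def accept(self, conn_id: str, timestamp: float, now: float) -> bool:
        g = self.cutoff(now)
        # Entries older than G can be dropped: no live message can be that old.
        self.latest = {c: t for c, t in self.latest.items() if t >= g}

        last = self.latest.get(conn_id)
        if last is not None:
            # Assumption: an equal timestamp is a retransmission, so reject it.
            if timestamp <= last:
                return False
        elif timestamp < g:
            # Assumption: unknown connection and older than G, so it may be
            # a duplicate whose table entry was already removed.
            return False

        self.latest[conn_id] = timestamp
        return True


f = TimestampFilter(max_lifetime=10, max_skew=2)
print(f.accept("c1", 100, now=101))  # True: first message on c1
print(f.accept("c1", 100, now=102))  # False: retransmission
print(f.accept("c1", 105, now=106))  # True: newer message
print(f.cutoff(120))                 # 108
print(f.accept("c2", 90, now=120))   # False: older than G on unknown connection
```

The table stays small because anything older than G is discarded, which addresses the question raised by the traditional approach of how long message numbers must be kept.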
4.5 Global State

The local state of each process in the distributed system, along with the messages which are in transit and not yet delivered, defines the global state of the distributed system. Many times it is useful to know the global state of the distributed system.

There are many occasions and reasons for which the global state of a distributed system must be known. For example, when there are no messages in transit and local computation has stopped, progress has stopped. From the global state it may then be concluded that the system is in deadlock, or that the distributed computation has terminated.

Global state is formed using local states, but what is the local state of a process ? The local state may be the records which are part of the database in a distributed database system.

Chandy and Lamport introduced the distributed snapshot. A distributed snapshot is a simple, straightforward way to record the global state. It reflects a consistent global state, that is, a state in which the distributed system might have been.

Suppose process P1 sends a message to P2 and this is recorded. The receipt of the message by P2 may not have been recorded. But if the receipt of a message by P2 is recorded, then the sending of that message must also be recorded.

The graphical representation of the notion of global state is a cut. Let there be 4 processes {P1, P2, P3, P4} and let 4 messages {m1, m2, m3, m4} be sent. The consistent and inconsistent cuts for these processes and messages are shown in Fig. 4.9.

Fig. 4.9 Consistent cut
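The consistency rule stated above (a recorded receive requires a recorded send) can be checked mechanically. The following is a small sketch under stated assumptions: the function name `is_consistent_cut` and the data layout are hypothetical, each process's events are numbered 1, 2, 3, ... and a cut is described by how many events of each process it includes. The messages used here are made up for illustration and are not those of Fig. 4.9.

```python
from typing import Dict, List, Tuple

# (sender, index of send event, receiver, index of receive event)
Message = Tuple[str, int, str, int]


def is_consistent_cut(cut: Dict[str, int], messages: List[Message]) -> bool:
    """Return True if no message is received inside the cut but sent outside.

    cut[p] is the number of events of process p that lie before the cut.
    """
    for sender, send_idx, receiver, recv_idx in messages:
        receive_recorded = recv_idx <= cut[receiver]
        send_recorded = send_idx <= cut[sender]
        if receive_recorded and not send_recorded:
            return False
    return True


messages = [
    ("P1", 1, "P2", 1),  # m1: first event of P1 sends, first event of P2 receives
    ("P2", 2, "P3", 1),  # m2: second event of P2 sends, first event of P3 receives
]

# m1 sent and received inside the cut, m2 not yet sent: consistent.
print(is_consistent_cut({"P1": 1, "P2": 1, "P3": 0}, messages))  # True

# m1 sent but still in transit at the cut: also consistent.
print(is_consistent_cut({"P1": 1, "P2": 0, "P3": 0}, messages))  # True

# m1 received by P2 but its send by P1 is not recorded: inconsistent.
print(is_consistent_cut({"P1": 0, "P2": 1, "P3": 0}, messages))  # False
```

Note that a message that is sent inside the cut but not yet received is allowed: it is simply one of the in-transit messages that belong to the global state.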
4.6 Election Algorithm

Functions like replacing a token, controlling an I/O device in the system and enforcing mutual exclusion are performed by a process called the co-ordinator. The processes require one process to act as co-ordinator to perform these functions. An election algorithm elects a co-ordinator from among the processes. We will discuss two election algorithms in this section.
I) The Bully algorithm
II) A ring algorithm

4.6.1 Assumptions of Election Algorithm

I) A unique number is assigned to each process. This helps to distinguish a process from all other processes, which may have the same characteristics and otherwise be exactly the same. For this we assume that there is a one to one correspondence between processes and machines. As the co-ordinator has the highest number, an election algorithm tries to locate the process with the highest number.
II) Every process knows the process number of every other process.

4.6.2 The Bully Algorithm

This algorithm was devised by Garcia-Molina. If process Pi sends a request but the co-ordinator has not responded within a time interval T, then Pi assumes that the co-ordinator has failed and initiates an election.

The following steps are followed by process Pi
a) Pi sends an election message to every process with a higher number. It waits for a time interval T for a response from these processes.
b) If no other process responds within time interval T, then Pi wins the election, as there is no active process having a number greater than Pi. If there is any process having a number greater than Pi, that process takes over and the job of process Pi is over.
c) The process which wins the election restarts a new copy of the co-ordinator and sends a message that it is the new co-ordinator. This message is sent to every other process having a process number less than its own.

A process may get an election message from any process having a lower number. When a process gets an election message from a lower numbered process, it sends a message to the sender saying that it is alive, and then it takes over.
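Steps a) to c) can be simulated in a single process to see who ends up as co-ordinator. This is a simplified sketch, not a networked implementation: the function name `bully_election` is hypothetical, the set `alive` stands in for "processes that answer within time interval T", and only one responder's election is followed at each round, although in the real algorithm every higher numbered responder starts its own election. The outcome is the same.

```python
from typing import List, Set, Tuple


def bully_election(initiator: int, alive: Set[int]) -> Tuple[int, List[str]]:
    """Simulate a bully election started by `initiator`.

    Returns the number of the elected co-ordinator and a log of the steps.
    """
    if initiator not in alive:
        raise ValueError("the initiating process must be alive")

    log: List[str] = []
    current = initiator
    while True:
        # Step a): election message goes to every higher numbered process.
        responders = sorted(p for p in alive if p > current)
        if not responders:
            # Step b): nobody higher answered within T, so `current` wins.
            break
        log.append(f"P{current} gets a reply from P{responders[0]} and stops")
        # A higher numbered process takes over and holds its own election.
        current = responders[0]

    # Step c): the winner announces itself to all lower numbered processes.
    lower = sorted(p for p in alive if p < current)
    log.append(f"P{current} announces itself to {['P%d' % p for p in lower]}")
    return current, log


coordinator, steps = bully_election(initiator=2, alive={1, 2, 3, 5})
print(coordinator)  # prints "5"
for line in steps:
    print(line)
```

With processes {1, 2, 3, 5} alive and P2 starting the election, the takeover passes from P2 to P3 to P5, and P5 announces itself because no higher numbered process answers.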
Only the process with the highest number wins; it restarts a new copy of the co-ordinator and sends the co-ordinator message accordingly. When a process that has failed recovers, it holds an election. If no active process has a higher number, the recovered process wins and takes over, even if a co-ordinator with a lower number is currently active. Hence the name bully algorithm.

In a distributed system the activities of the processes must be co-ordinated. The processes may read or update a shared data structure, or share resources such as a printer. Because the data structure or resource is shared, mutual exclusion is necessary. Mutual exclusion ensures consistency. To achieve mutual exclusion, a process accesses the shared data or resource inside a critical section. When one process is in the critical section, no other process may enter it. In systems that are not distributed, mutual exclusion is achieved using hardware instructions, semaphores, monitors etc. In a distributed system, mutual exclusion must be achieved by message passing. The general structure of a process is as shown.

Critical section entry
Critical section
Critical section exit

The algorithms differ in the mechanism they provide for critical section entry and critical section exit. In this section, we will discuss different algorithms for implementing mutual exclusion in a distributed system.

4.7.4 Centralized Algorithm

Consider a system consisting of n processes, each with a unique identification. In section 4.6 we discussed various algorithms for electing a unique new co-ordinator.
Using one of the election algorithms, a co-ordinator has been elected. This elected co-ordinator grants permission to other processes to enter the critical section. When a process wants to enter the critical section it sends a request message. The message consists of the name of the critical section: msg (name of critical section). This algorithm uses three messages.

1) Request - A request message is sent to the co-ordinator to enter the critical section.
2) Reply - If there is no other process in the critical section, the co-ordinator sends a reply message to grant permission.
3) Release - By sending a release message, the process exits the critical section.

The process waits for a reply from the co-ordinator permitting critical section entry. If there is no other process in the critical section, the co-ordinator grants permission by sending a reply message. When the requesting process receives the reply, it enters the critical section.

Let us consider a system consisting of 4 processes {P0, P1, P2, P3} to understand the algorithm. Suppose that processes P0 and P3 want to enter the critical section. Process P0 sends a request message with timestamp 5 to everyone, in this case to P1, P2, P3 and itself, and process P3 sends a request message with timestamp 10 to everyone, here P0, P1, P2 and itself. The timestamps are obtained from logical clocks, which we discussed in section 4.4. The other two processes, P1 and P2, do not want to enter the critical section. This scenario is shown in Fig. 4.18.

Fig. 4.18 (a) Mutual exclusion by distributed algorithm

Processes P1 and P2 are not in the critical section and do not want to enter it, so they immediately send reply messages to P0 and P3, which sent the request messages. When process P3 receives the request message from P0, it compares its own request timestamp with the timestamp of the incoming message. The request timestamp of P3 is greater than that of P0, so P3 sends a reply message to P0. When process P0 receives the request message from P3, it compares its own request timestamp with the timestamp of P3's request. Since the timestamp of P0 is less than the request timestamp of P3, P0 queues the request of P3.

Fig. 4.18 (b)

Advantages

1) It guarantees mutual exclusion.
2) The token is circulated in a well defined manner, so there is no starvation.

Disadvantages

1) It consumes network bandwidth except when a process is in the critical section: even though no process wants to enter the critical section, the token is circulated around the ring.
2) If the token is lost, the loss is difficult to detect, as the time between successive passes of the token is unbounded. A lost token must be regenerated.
3) If a process fails, a new logical ring has to be established. Crash recovery is easier than in the other algorithms: when a process passes the token to the next process, the receiver must acknowledge receipt of the token. If there is no acknowledgement from the next process, it might have failed.
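Returning to the distributed algorithm example of Fig. 4.18, the decision each process takes when a request arrives can be sketched as a single function. The state names, the `(timestamp, process_id)` tuple and the tie-break on process id are assumptions of this sketch; the text itself only describes the timestamp comparison.

```python
from typing import Optional, Tuple

# A request is identified by (timestamp, process_id). Breaking timestamp ties
# by process id is an assumption of this sketch, not stated in the text.
Request = Tuple[int, int]


def on_request(state: str, own: Optional[Request], incoming: Request) -> str:
    """Decide how a process reacts to an incoming request.

    state is "RELEASED" (not interested), "WANTED" (has its own request
    outstanding) or "HELD" (inside the critical section).
    Returns "REPLY" or "QUEUE".
    """
    if state == "HELD":
        return "QUEUE"
    if state == "WANTED" and own is not None and own < incoming:
        return "QUEUE"  # own request has the lower timestamp, so it wins
    return "REPLY"


p1_action = on_request("RELEASED", None, (5, 0))    # P1 is not interested
p3_action = on_request("WANTED", (10, 3), (5, 0))   # P3 receives P0's request
p0_action = on_request("WANTED", (5, 0), (10, 3))   # P0 receives P3's request
```

With the timestamps of the example (5 for P0 and 10 for P3), P1 and P3 reply to P0 while P0 queues the request of P3, so P0 is the first to collect all replies.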
The new logical ring,

4.8 Transactions

Updating a file or using a printer must be done in a critical section. A mutual exclusion algorithm ensures that while one process is executing in the critical section, no other process is allowed to enter it. Similarly, transactions also protect shared data. A transaction is a collection of operations that performs a single logical function. Atomicity is the major issue in processing transactions: a transaction ensures that the accesses and modifications of data items are executed atomically. An example is a bank transaction. Suppose a person has two accounts in a bank. If the person withdraws an amount from one account and deposits it in the other, this must be done as a single atomic operation: either both the withdrawal and the deposit occur, or neither occurs.

4.8.1 Transaction Model

We can consider a transaction as a program unit consisting of the following operations:

1) Begin
2) Read
3) Write
4) Commit
5) Abort

A transaction is simply a sequence that starts with begin, continues with read and write operations, and is terminated by either commit or abort. If the transaction is successful it terminates with commit; if it is unsuccessful it aborts and restores the old values. An aborted transaction must not affect the system state. Hence the transaction is rolled back: the data items must be restored to the state they were in just before the transaction started.

top-level transaction then the private workspace is the file system. For a subtransaction, the private workspace has a pointer to the parent's workspace. When a process wants to read a file, the pointers are followed from its private workspace until the file is located.
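The private workspace idea can be sketched as a chain of workspaces in which reads follow parent pointers and writes stay private until commit. This is a minimal illustration under assumptions: whole files are stored as strings rather than as indexed blocks, and the class name `Workspace` and its methods are hypothetical.

```python
from typing import Dict, Optional


class Workspace:
    """Private workspace that falls back to its parent for unmodified files."""

    def __init__(self, parent: Optional["Workspace"] = None) -> None:
        self.parent = parent
        self.files: Dict[str, str] = {}

    def read(self, name: str) -> str:
        # Follow parent pointers until the file is located.
        space = self
        while space is not None:
            if name in space.files:
                return space.files[name]
            space = space.parent
        raise KeyError(name)

    def write(self, name: str, data: str) -> None:
        # Writes stay in the private workspace until commit.
        self.files[name] = data

    def commit(self) -> None:
        # Make the private changes visible in the parent's workspace.
        if self.parent is None:
            return  # the top-level workspace is the file system itself
        self.parent.files.update(self.files)
        self.files.clear()

    def abort(self) -> None:
        # Discard private changes; the parent never saw them.
        self.files.clear()


file_system = Workspace()
file_system.write("balance", "100")

aborted = Workspace(parent=file_system)
aborted.write("balance", "0")
seen_inside = aborted.read("balance")
aborted.abort()
after_abort = file_system.read("balance")

committed = Workspace(parent=file_system)
committed.write("balance", "70")
committed.commit()
after_commit = file_system.read("balance")
```

Because an aborted transaction only ever touched its own workspace, rolling back is just a matter of discarding that workspace, which is why the old values do not have to be restored explicitly.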
Consider a transaction in which a file is opened for writing. A private workspace is assigned to the process. As per the second optimization technique (as discussed above), only the index is copied to the private workspace. Whenever a file block is first modified, a copy of that block is made and the address of the copy is placed in the index. Whenever a block is appended, the address of the new block is inserted in the index. This is shown in Fig. 4.22.

Consider a file having 2 blocks. The private workspace then contains 0, 1 as the file index.

Fig. 4.22 (a) Private workspace

If the transaction modifies block 1, a copy of block 1 is made, as shown in Fig. 4.22 (b).

Fig. 4.22 (b) Modifies block

If the transaction appends one block, its address is inserted in the index.

Fig. 4.22 (c) Index inserted

Read (x)
Write (x)
Read (y)
Write (y)

When transactions are executed concurrently in some order, the result may be the same as that of a serial schedule. This is serial equivalence: the instructions of the transactions are interleaved yet produce the same effect as a serial schedule. Consider the above transactions T0 and T1. These transactions are scheduled as shown here.

[Table: Schedules 1 to 4, four interleavings of the operations Read (x), Write (x), Read (y), Write (y) of T0 and T1]

In the serial schedule, transaction T0 reads and modifies the values of x and y, and then the modified values are read by transaction T1 and changed again. Now let us check each of the above schedules.

In schedule 1, T0 reads x and writes the changed value, then T1 reads the modified value of x and modifies it again. Similarly, T0 reads y and modifies it, and then T1 reads this modified value of y and modifies it. So schedule 1 is serially equivalent.

2) Primary two phase locking
3) Distributed two phase locking

1) Centralized two phase locking

A single site, which we call the lock manager, is in charge of the locks. It grants and releases locks instead of the scheduler. Hence every transaction manager interacts with the lock manager. When the lock manager grants a lock to a transaction manager, the transaction manager communicates with the data manager. When the data manager completes the operation it communicates with the transaction manager, which in turn communicates with the lock manager and releases the lock.

2) Primary two phase locking

Locking is distributed across multiple sites. A primary copy of each data item is assigned. Instead of a single lock manager, each site that holds the primary copy of a data item has a lock manager. This local lock manager grants and releases locks.

3) Distributed two phase locking

It is similar to basic two phase locking; the difference is that a scheduler and a data manager reside at each site. The general organization is shown in Fig. 4.27.

Fig. 4.27 General organization (functionalities: Begin Transaction, Read, Write, Commit, Abort)

Locks are granted and released on each machine by its scheduler. Data may be replicated across the sites.

Fault Tolerance

This chapter includes
Process resilience
Reliable client server communication
Reliable group communication
Distributed commit and recovery

5.1 Fault Tolerance

Failure and fault are key terms in system reliability. A failure is the situation to be avoided, and a fault is the problem that gives rise to it: a fault leads to a failure. Consider a computer system that controls a machine by performing calculations on its input. If a calculation gets stuck at a particular point, that is a fault. If this fault affects the operation of the program so that the computer system's calculation causes damage to the machine, that is a computer system failure.

The ability of a system to function correctly even in the presence of internal faults is called fault tolerance. We take a closer look at what it actually means for a distributed system to tolerate faults.

Fault tolerance increases the dependability of a system. Dependability means that the system can be trusted to perform the service for which it has been designed. Dependability can be decomposed into specific aspects.

1. Reliability
2. Availability
3. Safety
4. Maintainability
5. Security
Dependency relations appear in abundance in distributed systems. Based on how a failed component behaves once it has failed, faults can be classified into the following categories.

I. Crash
These are also called fail-stop failures. The component either completely stops operating or never returns to a valid state. This type of fault occurs when a server that was working correctly stops prematurely, e.g. an operating system comes to a halt and there is only one solution: reboot.

II. Omission
A server fails to receive incoming messages or to send outgoing messages, so the component fails to perform its service. There may be a receive omission or a send omission. In a receive omission the server is unaware of the incoming message, while in a send omission the server does its work but fails to send the message.

III. Timing
It occurs when the component does not complete its service on time.

IV. Byzantine
These faults are of an arbitrary nature. When Byzantine failures occur, clients should be prepared for the worst, e.g. a faulty server may work together with other servers to produce intentionally wrong answers. Byzantine failures are closely related to crash failures.

5.1.2 General Fault Tolerance Procedure

Fig. 5.3 Fault tolerance procedure (steps shown include detecting errors and damage confinement)
Fig. 5.9 Replicated write protocol

There is no single point of failure: even if any replica crashes, the group still responds as usual. It tolerates Byzantine failures.

5.2.5 Agreement in Faulty Systems

If k processes fail, then in the worst case all k of them may, intentionally or unintentionally, send the same wrong reply. The client relies on the voting mechanism, so the system tolerates the faults only if, out of (2k + 1) processes, at most k are lying.

Decisions are taken by a group, so the processes must reach agreement in many cases. In a hierarchical group, if the co-ordinator fails a new co-ordinator is elected and all processes must agree on it; agreement is also needed so that the co-ordinator can assign work to the workers. In a dynamic group many processes leave and new processes join, so all processes must agree on synchronization. When everything (processes, communication) is perfect, reaching agreement is simple; otherwise problems may arise. The main aim in a distributed system is that agreement on some issue must be reached, and reached in a finite number of steps using a minimal protocol.

We will discuss agreement for perfect processes and for faulty processes, using the famous 'Two-army problem' and 'Byzantine generals problem'.

1) Two-army problem

Here the processes are perfect but communication is not. As communication is not perfect, it is difficult to reach agreement. Two blue armies are on a hillside with 2000 troops each, while the red army is in a valley with 3000 troops. The blue armies must reach agreement about attacking, as they will win only if there is perfect co-ordination between them.
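The voting argument that opens Section 5.2.5 (at most k lying processes out of 2k + 1) can be illustrated with a small majority vote. This is a sketch under assumptions: replies are plain values, the k faulty processes collude on one wrong value, and the function name `vote` is hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Optional


def vote(replies: List[Hashable], k: int) -> Optional[Hashable]:
    """Return the value backed by at least k + 1 replies, or None."""
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= k + 1 else None


# k = 2 faulty replicas out of 2k + 1 = 5 collude on the same wrong reply.
masked = vote(["ok", "ok", "ok", "bad", "bad"], k=2)
# With only 2k = 4 replicas, the two wrong replies can no longer be outvoted.
undecided = vote(["ok", "ok", "bad", "bad"], k=2)
```

Requiring k + 1 matching replies means that at least one of them comes from a correct process, which is why 2k + 1 replicas are enough to mask k colluding ones.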
Step 3: General 1 Got (1, y, 3, K) (a,b, ¢ d) (m,n, 0)

Step 4: Generals 1 and 3 are loyal, but they do not see a majority for element 1, element 2 or element 3, so all of them are marked as UNKNOWN. Thus they do not reach agreement.

Lamport et al. proved that if there are m faulty processes, then 2m + 1 processes must function properly to reach agreement, for a total of 3m + 1 processes. Thus more than 2/3 of the processes must function properly to reach agreement. We may look at this problem as follows: if more than 2/3 of the processes (they may be loyal or traitors) agree, then the decision corresponds to the majority vote of the loyal processes. So agreement can be reached regardless of the faulty processes.

As messages cannot be guaranteed to be delivered within a known finite time, Fischer et al. (1985) proved that agreement cannot be reached even if only one process is faulty.

5.3 Reliable Client Server Communication

Processes as well as communication channels may fail in a distributed system. A communication channel may have crash, omission, arbitrary, timing and masking failures. When a channel fails to perform the actions that it is supposed to perform, an omission failure occurs: a message inserted in the outgoing message buffer never arrives at the incoming message buffer of the receiver. The worst possible errors are arbitrary failures. Corrupting message contents, delivering non-existent messages and duplicating messages are some examples of arbitrary failures. As communication software can recognize and reject faulty messages, arbitrary failures in communication channels are rare. In a timing failure, a message transmission takes longer than the stated bound.
Reliable communication is defined in terms of validity and integrity. Communication is reliable if every message in the outgoing message buffer is eventually delivered to the incoming message buffer (validity), and if the message received is identical to the message sent and no message is delivered twice (integrity).

5.3.1 Point to Point Communication

To establish reliable point to point communication in a distributed system, a reliable protocol like TCP is used. A sender process transmits a message to the transport protocol

3) The server fails after sending the completion acknowledgement message and printing. A → P → F
4) The server fails after the text is printed on the plotter but before the completion acknowledgement message. P → F [→ A]
5) The server fails after the completion acknowledgement message but before printing the text on the plotter. A → F [→ P]
6) The server fails after printing and sending the acknowledgement message. P → A → F

Events in square brackets never happen. The client does not know the status of the print request, so it follows one of 4 approaches.

1) Never reissue: Taking the risk that the text will not be printed, the client decides never to reissue a request.
2) Always reissue: The client always reissues the request; as a result the text may be printed again.
3) Reissue on acknowledgement: The client reissues the request only if it has received an acknowledgement that the request was delivered to the server.
4) Reissue on no acknowledgement: The client reissues the request only if it has not received an acknowledgement for the print request.

Let us look at whether the text will be printed (PRINT), not printed (NOPRINT) or duplicated (DUPLICATE) for each client reissue approach and each order of server events.
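The combinations of reissue approach and server event order can be worked out with a small model. This is only a sketch under explicit assumptions: the acknowledgement the client reacts to is taken to be the acknowledgement event A, a reissued request is assumed to be printed exactly once after the server recovers, and the names `STRATEGIES` and `outcome` are hypothetical.

```python
from typing import List

STRATEGIES = ["never", "always", "on_ack", "on_no_ack"]


def outcome(events_before_crash: List[str], strategy: str) -> str:
    """Classify the result of one print request.

    events_before_crash lists the server events that happened before the
    crash F, in order: "P" (text printed) and/or "A" (acknowledgement sent).
    """
    acked = "A" in events_before_crash
    reissue = {
        "never": False,
        "always": True,
        "on_ack": acked,
        "on_no_ack": not acked,
    }[strategy]
    # Assumption: a reissued request is printed exactly once after recovery.
    prints = events_before_crash.count("P") + (1 if reissue else 0)
    return ["NOPRINT", "PRINT", "DUPLICATE"][prints]


# Server fails before doing anything.
crash_first = [outcome([], s) for s in STRATEGIES]
# Server prints, then fails before acknowledging: P -> F [-> A].
print_then_crash = [outcome(["P"], s) for s in STRATEGIES]
```

Under this model no single reissue approach gives PRINT for every event order: an approach that avoids NOPRINT when the server crashes first produces DUPLICATE when the server crashes right after printing.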
Case (1) Server fails before doing anything

Reissue approach                     F [→ P → A]    F [→ A → P]
Never                                NOPRINT        NOPRINT
Always                               PRINT          PRINT
On receiving acknowledgement         NOPRINT        NOPRINT
On not receiving acknowledgement     PRINT          PRINT

III) Agreement: If a correct process delivers the message m, then eventually all other correct processes in the group deliver the message m.

Thus reliable multicasting satisfies the 'all or nothing' property of message delivery.

Some situations must be handled properly to provide reliable communication. Reliable communication may have to take place in the presence of faulty processes or when all processes are working correctly, e.g. if a sending process crashes there is a faulty process, and a process may join the group during communication. Consider both situations.

1) Reliable multicasting in the presence of faulty processes

Multicasting is reliable if all non-faulty processes receive the message m. This imposes some constraints, which we will discuss under atomic multicasting.

2) Reliable multicasting when all processes are correct

Assuming there is agreement about the members of the group, this situation is simpler. All members of the group receive the message, and sometimes even the ordering of messages is not needed. Here we assume that a process cannot leave or join the group during communication and that processes do not fail. This is easy to implement, but it is a weaker form of reliable multicasting.
The underlying communication system does not provide reliable communication: a message may be delivered to only a few processes rather than to all of them, and a message may also be lost part way. To solve this problem, a sequence number can be assigned to each message, so that each message is distinguished by its sequence number relative to the sender. Messages are assumed to be received in the order in which they are sent.

Fig. 5.14 Reliable multicasting message transmission

Fig. 5.17 Hierarchical feedback control (co-ordinator, network, subgroup)

As shown in Fig. 5.17, each subgroup has its own local co-ordinator. The local co-ordinator handles the retransmission requests of the receivers in its subgroup. Each local co-ordinator has its own history buffer. If the co-ordinator itself misses message m, it asks the co-ordinator of the parent subgroup to retransmit the message. In an acknowledgement based scheme, a co-ordinator that receives the message sends an acknowledgement of receipt to its parent. When a co-ordinator has received acknowledgements for a message from its children and from all members of its subgroup, it removes the message from the history buffer.

The construction of the tree of subgroups is the main issue of this scheme. In many cases the tree must also be constructed dynamically. If a multicast tree already exists in the network, the same tree can be used for this scheme, but such adaptation is not easy. Thus if there is a large number of receivers spread across a wide area network, it is difficult to build a reliable multicast scheme.
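The use of per-sender sequence numbers described above can be sketched from the receiver's side: messages are held back until they are in order, and a gap in the sequence numbers reveals which messages must be requested again. The class name `Receiver`, the numbering from 0 and the return value of `receive` are assumptions of this sketch.

```python
from typing import Dict, List


class Receiver:
    """Delivers one sender's messages in sequence-number order."""

    def __init__(self) -> None:
        self.next_expected = 0
        self.held_back: Dict[int, str] = {}
        self.delivered: List[str] = []

    def receive(self, seq: int, payload: str) -> List[int]:
        """Accept a message; return sequence numbers to request again."""
        if seq < self.next_expected or seq in self.held_back:
            return []  # duplicate, ignore it
        self.held_back[seq] = payload
        # Deliver every message that is now in order.
        while self.next_expected in self.held_back:
            self.delivered.append(self.held_back.pop(self.next_expected))
            self.next_expected += 1
        # Anything still held back reveals a gap of missing messages.
        if not self.held_back:
            return []
        return [n for n in range(self.next_expected, max(self.held_back))
                if n not in self.held_back]


r = Receiver()
r.receive(0, "m0")
missing = r.receive(2, "m2")  # message 1 was lost on the way
r.receive(1, "m1")            # the retransmission arrives
```

In a feedback scheme like the one of Fig. 5.17, the list returned by `receive` is what a receiver would send to its local co-ordinator as a retransmission request.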
Sender P0 multicasts two messages to a group. The communication layers of two of the receivers receive first m and then m', while the communication layer of the third receiver receives first m' and then m, as there is no constraint on ordering.

II) Reliable FIFO ordered multicasts

The communication layer must deliver incoming messages in FIFO order, that is, messages from the same process are delivered in the same order as they were sent. Consider the earlier example of 4 processes {P0, P1, P2, P3}, with sender P0 and receivers P1, P2, P3.

P0          P1             P2             P3
Sends m     Receives m     Receives m     Receives m
Sends m'    Receives m'    Receives m'    Receives m'

Now let there be two senders and two receivers.

[Table: the first sender sends m and then m'; the second sender sends m1 and then m1'; each receiver column lists the order in which that receiver gets the four messages]

As shown, incoming messages from the same process must be delivered in the same order as they were sent. The first sender sends messages m and m' in that order, while the second sender sends messages m1 and m1' in that order. According to FIFO ordering, message m must be delivered before m', and m1 before m1'. If the communication layer receives the second message before the first one sent, it waits until it receives the first. The ordering of messages sent by different processes imposes no constraint on delivery; as shown in the figure, one receiver receives m1 first, then m, then m' and finally m1'.

III) Reliable causally ordered multicasts

In this case potential causality between different messages is preserved while delivering the messages. Consider 3 processes {P0, P1, P2}. P0 and P2 are senders and P1 is the receiver.
[Table: Sends m, Receives m, Sends m', Receives m']

When a process P receives a message m that was sent in view Gi, and it still believes that the group view is Gi, it delivers the message m subject to the ordering constraints. The process discards the message if it has already received it, that is, if it is a duplicate. When P eventually receives the view change message for Gi+1, it forwards its unstable messages and then sends a flush message.

The major problem of this protocol is that it does not handle process failure during the announcement of a new view change. It is assumed here that no process in Gi+1 will fail until the new view Gi+1 has been installed by every member of Gi+1. The solution to this problem is to announce a view change even when previous changes have not yet been installed by all processes in the group.

5.5 Distributed Commit

In a distributed system it is difficult to ensure atomicity. The atomicity property requires that either all of the processes in a group commit the transaction or all of them abort it. As several sites may participate in the execution of a single transaction, the result of the operation may be erroneous if one of the sites or a communication link fails. Distributed commit is ensured by the transaction co-ordinator. Whenever a new process joins the transaction, the co-ordinator records it in the participants list. The transaction co-ordinator starts the execution of the transaction. It breaks the transaction into subtransactions and distributes the subtransactions to the appropriate sites. It co-ordinates the termination of the transaction, either commit or abort.
There are three schemes used for distributed commit.

1) One phase commit.
2) Two phase commit.
3) Three phase commit.

5.5.1 One Phase Commit

This is a simple scheme. To complete a transaction, the co-ordinator communicates the commit or abort request to all the other processes. The one phase commit scheme has a drawback: when the client requests commit, it does not allow a server to abort the transaction, even though, because of concurrency control, a server may be unable to make a local commit.

5.5.2 Two Phase Commit

The two phase commit protocol was devised by Gray [1978]. It allows any process to abort its part of the transaction. To ensure atomicity, if one part aborts the whole transaction aborts. Consider n sites {S1, S2, ..., Sn} and m processes {P1, P2, ..., Pm} on these sites,

b) Failure of participant

Here also we consider 4 cases, or states, of the participant. When participant X recovers from a crash, it examines the log. The log is maintained on persistent storage by each participant and records each event. The log decides the fate of the transaction that was in the midst of execution when the participant crashed.

I) Participant crashed after taking the COMMIT decision (COMMIT state). When the co-ordinator takes the COMMIT decision, the participant enters COMMIT in its log. If the participant crashes after making this log entry, then on recovery it sends the decision to the co-ordinator again.

II) Participant crashed after taking the ABORT decision (ABORT state). Same as (I) above: as there is a log entry for abort, it sends the decision to the co-ordinator again.

III) Participant crashes in INIT state. The participant can safely abort locally when it recovers and informs the co-ordinator accordingly.
IV) Participant crashes in READY state. When the participant recovers after crashing in the READY state, it consults the co-ordinator or the other participants. If the co-ordinator is up, it notifies the participant of the decision about the transaction; otherwise the participant consults the other participants. The decision is taken accordingly.

Steps followed by participants in the two phase commit protocol:

    enter INIT to local log
    wait for REQUEST_VOTE from co-ordinator
    if timeout {
        enter ABORT_VOTE to local log
        exit
    }
    if participant votes ABORT {
        enter ABORT_VOTE to local log
        send ABORT_VOTE to co-ordinator
    } else {
        enter COMMIT_VOTE to local log
        send COMMIT_VOTE to co-ordinator
        wait for FINAL_DECISION from co-ordinator
        if timeout {
            multicast REQUEST_DECISION to other participants
            wait until FINAL_DECISION is received
            enter FINAL_DECISION to local log
        }
        if FINAL_DECISION = COMMIT_GLOBAL
            enter COMMIT_GLOBAL to local log
        else
            enter ABORT_GLOBAL to local log
    }

If participant X is READY or PRECOMMIT be in INIT state. If participant X is in the PRECOMMIT state, then the co-ordinator must also be in the PRECOMMIT state. This means that it has received COMMIT_VOTE from all participants. If a participant Y is in the INIT state, the transaction can be safely aborted, as no participant can be in the INIT state if any participant is in the PRECOMMIT state. If any of the participants is in the COMMIT state, X makes a transition to the COMMIT state, and if any of the participants is in the ABORT state, X makes a transition to the ABORT state. If all the participants that X can contact are in the READY state and they form a majority, the transaction is aborted.
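The decision rules that participant X applies in three phase commit can be collected in one function. This sketch also includes the PRECOMMIT majority rule, which the text states shortly afterwards. The function name `decide`, the `group_size` parameter and the "WAIT" result for situations the text does not cover are assumptions made for illustration.

```python
from typing import List


def decide(reachable_states: List[str], group_size: int) -> str:
    """Decision of participant X in 3PC.

    reachable_states: states of the participants X can contact, X included.
    group_size: total number of participants, used for the majority test.
    Returns "COMMIT", "ABORT" or "WAIT".
    """
    states = set(reachable_states)
    if "COMMIT" in states:
        return "COMMIT"
    if "ABORT" in states or "INIT" in states:
        return "ABORT"
    majority = len(reachable_states) > group_size // 2
    if states == {"READY"} and majority:
        return "ABORT"
    if states == {"PRECOMMIT"} and majority:
        return "COMMIT"
    return "WAIT"  # assumption: not covered by the rules in the text


all_ready = decide(["READY", "READY", "READY"], group_size=5)
all_precommit = decide(["PRECOMMIT", "PRECOMMIT", "PRECOMMIT"], group_size=5)
one_init = decide(["READY", "INIT"], group_size=5)
minority = decide(["READY"], group_size=5)
```

The majority test is what lets the reachable participants take a final decision without waiting for crashed participants to recover.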
There may be a participant that has crashed and may recover later, but the state in which it will recover is not known. It may recover in INIT, PRECOMMIT or ABORT state. If it recovers in INIT state, then aborting the transaction is the right decision. If it recovers in PRECOMMIT state, then aborting the transaction does no harm. Thus in 3PC a crashed participant recovers only in INIT, ABORT or PRECOMMIT state, whereas in 2PC a crashed participant can recover in COMMIT state. So in 3PC a final decision can always be taken. If all the participants that X can contact are in PRECOMMIT state and they form a majority, then the transaction can be safely committed.

5.6 Recovery

Even if a failure occurs, the system must recover to a correct state. We will discuss two methods of recovery in distributed systems: checkpointing and message logging. The system may fail because of an error. Recovery from the error is the basis of fault tolerance: the system must recover to an error free state. We will discuss backward recovery and forward recovery.

5.6.1 Backward Recovery

When the system is brought back from an erroneous state to a previously correct state, it is called backward recovery. From time to time the present state of the system is recorded, that is, a checkpoint is made. Whenever a failure occurs, this checkpoint is used to restore the previously recorded state.

Take the example of reliable communication. The sender sends packets, but they may get lost. In backward recovery the sender retransmits those packets. This restores the previous state, which here is the state in which the sender is sending the packets that were lost. In distributed systems this approach is widely used as a general mechanism.
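The checkpoint-and-restore idea behind backward recovery can be sketched as follows. This is only an illustration under stated assumptions: the class `CheckpointedState` and its methods are hypothetical names, and an in-memory deep copy stands in for the stable storage a real system would use.

```python
import copy


class CheckpointedState:
    """Toy process state with backward recovery (hypothetical names).

    A deep copy kept in memory stands in for stable storage.
    """

    def __init__(self):
        self.state = {"received": []}
        self._checkpoint = copy.deepcopy(self.state)

    def checkpoint(self):
        # Record the present state so that it can be restored later.
        self._checkpoint = copy.deepcopy(self.state)

    def deliver(self, packet):
        self.state["received"].append(packet)

    def rollback(self):
        # Backward recovery: return to the last recorded state.
        self.state = copy.deepcopy(self._checkpoint)


proc = CheckpointedState()
proc.deliver("p1")
proc.checkpoint()
proc.deliver("p2-corrupted")  # an error creeps in after the checkpoint
proc.rollback()               # the erroneous state is discarded
proc.deliver("p2")            # the sender retransmits the packet
```

After the rollback only the work done since the checkpoint has to be repeated, which is why the frequency of checkpoints is a design trade-off.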
In sender based logging, a process logs its messages after a checkpoint has been taken and before sending the messages off. In receiver based logging, the receiving process first logs an incoming message before delivering it to the application that it is executing. If the receiving process crashes, the most recently checkpointed state is restored and, from this checkpoint, the messages that were already sent are replayed. When checkpointing and message logging are combined, a state that lies beyond the most recent checkpoint can be restored.

There are some approaches which additionally use logs. A system which uses both checkpoints and message logging is efficient. Consider a system using only checkpoints. In case of failure the state is restored to the checkpointed state. The behavior of the system after recovery may differ from its behavior before the crash. For example, the message ordering may be different, and accordingly the behavior of the receiver may be different. If message logging is used along with checkpoints, the events are replayed after the checkpoint is restored.

For faulty user input, if message logging is used along with checkpoints, the events are replayed from an older checkpoint. If only checkpoints are used, the system is restored such that the user input is taken again. Thus replaying the events makes it easier to interact with the world outside the system.

5.6.4 Checkpointing

As we discussed, in a distributed system it is useful to know the global state of the system. A global state consists of the local state of each process in the distributed system and the messages which have been sent but not yet delivered. A consistent global state that is recorded in this way is called a distributed snapshot.
If the snapshot reflects that process X has received a message, then it must also show a process Y that sent the message.

The backward error recovery scheme is based on checkpoints, which are stored on stable storage. From time to time the local state of each process is saved on stable storage. From these local states a global state is formed. The recovery line, as shown in Fig. 5.30, is the most recent consistent cut.

Fig. 5.30 (labels: initial state, recovery line)

Message logging is based on the technique in which message logs are used to recover and to restore a globally consistent state. The important point here is when the messages are to be logged. We will consider the approach given by Alvisi and Marzullo.

A process that survives the crash of another process, but whose state is inconsistent with the crashed process after its recovery, is called an orphan process. Consider the example of three processes P1, P2 and P3 with messages m, m' and m''.

Fig. 5.32 (legend: logged message, unlogged message)

Two of the processes send the messages m and m' to the third process, which in turn sends message m''. Out of the messages m, m' and m'', only m' is not logged, while m and m'' are logged. Let the process that received m and m' crash and recover again. As m' is an unlogged message, its transmission will not be replayed after the recovery, and so m'' will not be retransmitted. Thus the state of the recovered process is inconsistent with its state before the crash.
The process that received the message m'' got it before the crash, yet the transmission of the message m' will not be replayed. It is necessary to avoid such inconsistency.

Characterizing Message Logging Method

We are considering the approach described by Alvisi and Marzullo (1998). The information necessary for retransmitting and handling a message is part of each message header. This information may contain the identity of the sender and of the receiver, a sequence number for recognizing duplicates, and a delivery number for deciding when to hand over the message to the application.

To recover, the transmission of stable messages is replayed. A message which can no longer be lost is a stable message. Messages that have been written to stable storage are stable messages.

A set DEPPROC(m) is the set containing all the processes that depend on the delivery of message m. In particular, all processes which have received the message are

Case Studies

This chapter includes
Case study 1. CORBA
Case study 2. GRID
Case study 3. Cluster

6.1 Case Study 1 : CORBA

In this topic we will elaborate on the following topics
* Introduction to CORBA
* Where does CORBA fit in distributed systems
* Components involved in CORBA
* CORBA architecture
* Features of CORBA systems

6.1.1 Introduction to CORBA

CORBA is a software standard that is defined and maintained by the Object Management Group (OMG). The OMG is a non-profit organization founded in 1989 by eight companies. This consortium now comprises over 800 members. OMG produces specifications, not implementations. Implementations of OMG specifications can be found on over 50 operating systems.

CORBA is the acronym for Common Object Request Broker Architecture.
It consists of a standard framework for developing and maintaining distributed software systems.

A "mapping" from IDL to a specific implementation language like C++ or Java is also provided by CORBA.

Standard mappings are available for
* Ada
* C
* C++
* Lisp
* Smalltalk
* Java
* COBOL
* PL/I
* Python

Non-standard mappings are available for
* Perl
* Visual Basic

2. The object request broker

A communication infrastructure called the Object Request Broker (ORB) is used by the objects to communicate across the network. The features of the object request broker are :
i) The IDL interface isolates both the client and the object implementation from the ORB.
ii) Clients see only the object's interface, never the implementation.
iii) For communication, every request is passed to the client's local ORB.

The client does not know aspects of the object such as :
i) The location of the object
ii) The programming language of its implementation, etc.
The interface the client sees is completely independent of these aspects; they are not reflected in the object's interface.

Object request interfaces

ORB services like object reference generation are provided by the object adapter. It is the primary means by which an object implementation accesses ORB services.
Object adapters are responsible for the following functions :
* Generating and interpreting object references.
* Method invocation.
* Security of interactions.
* Activation and deactivation of objects and implementations.
* Mapping object references to the corresponding object implementations.
* Registration of implementations.

Any process in CORBA, whether a client or a server, has an ORB. The ORB handles the basic communication between a client and the object.

The object server consists of
* Object adapter
* Skeleton
* Skeleton interface and ORB interface.

The object adapter is responsible for forwarding incoming requests to the proper object. If the object uses a dynamic skeleton, then it has to provide the proper implementation of the invoke function that is offered to the client.

6.1.5 CORBA Services

The various services offered by CORBA are
* Collection service : As the name suggests, it groups objects into lists, queues, stacks, etc. Access mechanisms are offered depending on the nature of the group.
* Query service : This service provides collections of objects which are queried using a declarative query language.
* Concurrency control service : Clients need certain advanced locking mechanisms to access shared objects. These are provided by the concurrency control service.

Fig. 6.16 IOGR (label: replicas of referenced object (group))

The client passes an IOGR to the ORB. The ORB tries to bind to one of the replicas of the referenced object.
6.1.6.7 Security

Secure object invocation is required to enforce security in CORBA. The security services must be hidden from the application level objects. The client binds to an object to invoke it. The client ORB decides which security services are needed at the client side. Security policies are used for the selection of security services. Policy objects are used to specify the security policies that are associated with the client. The ORB of the client receives the security policies from the domain of the client.

Fig. 6.17 Concept used in security of CORBA

6.2 Case Study 2 : Grids

6.2.1 Introduction

Some commonly used definitions are :

A grid is a virtual information processing environment where the user has an illusion of a seamless single source of computing power which is actually distributed.

According to Gartner, "A grid is a collection of resources owned by multiple organizations that is co-ordinated to allow them to solve a common problem,
The network interface hardware acts as a communication processor. Transfer of packets between cluster nodes is the responsibility of the network interface hardware. Fast communication between the cluster nodes and the outside world is done with the help of communication software. Communication protocols like active messages are used by Myrinet. Critical overheads are reduced by bypassing the operating system. The cluster middleware offers an illusion of a single system.

6.3.3 Commodity Components for Clusters

In the current era, network performance has improved and standardized programming APIs are available. This is the reason for the increasing demand for cluster based computing. The various commodity components used for clusters are
1) Hardware components
2) Software components
3) Middleware components

1) Hardware components
The commonly used hardware components for building clusters and nodes are
a) Processors.
b) Disk and I/O.
c) Memory and cache.
d) System bus.
e) Cluster interconnects.

2) Software components
The commonly used software components for building clusters and nodes are
a) Operating system
b) Network services / communication software

3) Middleware components

We'll elaborate on each of the above in detail.
Fig. 6.34 shows the remote communication procedure in SCI.

Fig. 6.34 Remote communication in SCI

Though SCI has fast distributed shared memory access, it is still not the most popular technique used. This is because its components are very expensive.

* Myrinet
Myricom supplied a 1.28 Gbps full duplex interconnection network called Myrinet. The features of Myrinet are
i) It supports both Linux and NT.
ii) It is a high performance interconnect.
iii) Fault tolerance is offered by automatic mapping of network configurations. This is possible as Myrinet uses low latency cut-through routing switches.
iv) As compared to fast ethernet, it offers very low latency and a very high speed.
v) It is relatively expensive as compared to fast ethernet.

* ATM
Asynchronous Transfer Mode is a technology specially developed for the telecommunication industry.
Features of ATM :
i) The international ATM forum is a non-profit organization.
ii) The International Telecommunication Union has developed certain standards and protocols which are being adopted by ATM.
iii) ATM is used for both LAN and WAN.
Processors in Beowulf communicate through TCP/IP over the ethernet internal to the cluster. User transparent access to multiple parallel ethernet networks is provided to every workstation in the Beowulf. In the Beowulf there are multiple ethernet networks working in parallel. Therefore the performance of interprocess communication is limited by the performance characteristics of the ethernet.

The Beowulf project has two Global process ID (GPID) schemes : PVM and GPID-PVM. There are no external libraries in PVM. GPID-PVM uses PVM as its signal transport. These two schemes offer cluster wide control and signalling of processes.

Nowadays the Beowulf project is developing mechanisms that will help unmodified versions of standard UNIX utilities to work across a cluster.

6.3.6.2 Solaris MC Project

Solaris MC is a distributed O.S. for multicomputers. The cluster appears as a single system machine to the user. The cluster computing nodes are connected by a high speed interconnect.

The various components of the Solaris MC architecture are
1) Existing Solaris 2.5 kernel
2) The Solaris MC layer consisting of
* Network
* File system
* Processes
* C++
* Object framework
4) System call interface and
5) Various nodes in the cluster and outside.

Fig. 6.35 Solaris MC architecture
Distributed Systems A-5 Appendix - A

Handling of orphan calls in

i) Last-one call semantics
To achieve last-one semantics, the orphan calls are terminated before restarting the crashed processes. This is done in two ways.
* The orphan calls are allowed to finish.
* They are traced down and killed. This is called orphan extermination.

ii) Last of many call semantics
In this the orphan calls are neglected. They are neglected in two ways.
* A call identifier is used to uniquely identify each call, so that the orphan call can be neglected.
* A caller accepts a response message only if its call identifier matches the identifier of the most recently repeated call.

iii) At-least-once call semantics
* It uses timeout based retransmissions without caring for orphan calls.
* For nested calls, if there are any orphan calls, only the response of the first message is considered and all the other messages are ignored.

Q.3 a) What are the different design issues in RMI ? Explain each in brief ?

Ans. : Remote method invocation (RMI) is an extension of local method invocation.

Design issues in RMI
* A remote method invocation may not be executed exactly once, as a local method invocation is. This is a major issue to be handled while designing RMI.
* Achieving the desired level of transparency is also a matter of concern while designing RMI.

1) RMI invocation semantics
Clients : Processes that host the invokers.
Servers : Processes that host remote objects. Servers can also be clients.
A method called doOperation is used by clients for remote invocation. This method can be implemented in a variety of ways.

a) Retry request message
It has to be decided whether the request message is retransmitted until a reply is received or until the server is assumed to have failed.

b) Duplicate filtering
It has to be decided whether duplicate requests are filtered out at the server.
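The two choices above, retrying the request and filtering duplicates, can be sketched together. This is a minimal illustration, not the book's implementation: the names `Server`, `handle` and `do_operation` are hypothetical, and the parameter `lost_replies` simulates replies lost in transit in place of a real network timeout.

```python
class Server:
    """Server that filters duplicate requests by request id (hypothetical)."""

    def __init__(self):
        self.counter = 0
        self.history = {}  # request id -> saved reply

    def handle(self, request_id, amount):
        if request_id in self.history:
            # Duplicate request: return the saved reply, do not re-execute.
            return self.history[request_id]
        self.counter += amount
        reply = self.counter
        self.history[request_id] = reply
        return reply


def do_operation(server, request_id, amount, lost_replies, max_retries=3):
    """Retransmit the request until a reply arrives or retries run out.

    `lost_replies` is the number of replies assumed lost in transit;
    it stands in for a real timeout on the client side.
    """
    for attempt in range(max_retries + 1):
        reply = server.handle(request_id, amount)
        if attempt >= lost_replies:
            return reply
    raise TimeoutError("server assumed to have failed")


server = Server()
result = do_operation(server, request_id=1, amount=10, lost_replies=2)
```

Here the request reaches the server three times, but the non-idempotent update is applied only once because the server keeps a history of replies. Without the history, the same retransmissions would give at-least-once behaviour.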
Event : Actions like manipulating a button, entering text, etc. are considered as events. They make changes in the objects that maintain the state of the application. In a distributed system there are multiple objects at different locations, and they have to be notified about the occurrence of an event at an object.

Publish-subscribe paradigm : In this the object performs the following three functions.
1) Generates events.
2) Publishes the types of events.
3) Makes the events available for observation by other objects.

Event notification : Objects that represent events are called event notifications.

Characteristics of distributed event based systems

a) Heterogeneous : Distributed systems themselves have the property of heterogeneity. So the objects used for event notification must also support heterogeneity. This is done as follows :
* The event generating objects publish the types of events they offer.
* Other objects subscribe to events and provide an interface for receiving notifications.

b) Asynchronous : Event generating objects send notifications to all the objects that have subscribed to them. This communication is asynchronous, so the publisher does not have to synchronize with the subscribers.

Events can be of different types. Information about an event, like its name, the identifier of the object that generated it, etc., is represented by the attributes of the event. Both are used for subscription :
* Types and
* Attributes.
The interested parties are notified on the occurrence of an event that matches the subscribed type and attributes.

Role of participating objects in distributed event notification

Fig. 4 shows the architecture of distributed event notification.
The architecture for distributed event notification is designed in such a way that publishers and subscribers are decoupled from each other. The event service is the main component of the architecture. It maintains a database of published events and of subscribers' interests. When an event of a given type occurs, the notification is sent to the subscribers of that type of event.

perform mutual authentication between the client and server. Thus secure logical communication is possible because of the ticket granting server.

Authentication server : As its name suggests, it authenticates the users while they log on to the system.

iii) Authentication database : It holds valuable information. Thus its design and security are a key aspect of Kerberos. Only a limited number of users should have permission to access this database.

b) Client : The Kerberos system also consists of client processes. These processes usually run on workstations. In a distributed environment, the workstations may be physically far apart.

Fig. 6 (labels: obtains ticket; grants ticket of required server)

The user sends its identity to the Kerberos server. The server verifies the identity, and then the client process is allowed to obtain a ticket from the ticket granting server. The ticket granting server issues the tickets for the required servers. Now communication between the client and the required server begins.

c) Application server : The application server provides the required service to the client after verifying the authenticity of the client.

Q. 7 a) What is the necessity of nested transactions in distributed systems ?
What are the benefits of nested transactions ? State the rules for commitment of nested transactions.

Ans. : Refer section 4.9.

b) State and explain the drawbacks of locking. How does optimistic concurrency control take care of the drawbacks of locking ?

Ans. : The operations involved in transactions are normally read and write. The conflict rules for read and write are
i) read read : There is no conflict because the order of the read operations is immaterial.

4) Interoperability : Open systems can work together. This feature allows two implementations of systems or components to co-exist, both of them relying on each other's services as specified by a common standard.
5) Portability : It is the ability to move an application from one software or hardware platform to another.

b) Scalability can be achieved by applying different techniques. What are these techniques ?

Ans. : Scaling techniques : In distributed systems the scalability problems are actually performance problems. These problems arise because the capacity of servers and of the network is limited. There are three techniques for scaling
1) Hiding communication latencies
2) Distribution
3) Replication

The techniques must basically satisfy the scalability requirements. Scalability means that the system should remain efficient with a significant increase in the number of users and resources connected, i.e.
a) the cost of adding the resources must be reasonable,
b) the performance loss with an increased number of users and resources must be controlled,
c) software resources must not run out.

We'll discuss each of the techniques in detail.

1) Hiding communication latencies : This technique is used when geographical scalability is an issue of concern. In distributed systems whose parts are geographically far apart, the communication is mostly synchronous, i.e. the client waits till the server responds. This technique says that we should avoid waiting for the server's response, so that the time can be utilized for useful work, i.e. the system should use asynchronous communication. In this type of communication, the requesting application does not wait for the reply from the server. When a reply comes in, the application is interrupted. The previously issued request is completed by a special interrupt handler which is called on receipt of the reply.
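The idea of hiding communication latency with asynchronous communication can be sketched as below. This is an analogy under stated assumptions rather than a real interrupt handler: the function `remote_call` is a hypothetical stand-in for a slow request to a distant server, and a callback on a worker thread plays the role of the handler that is called on receipt of the reply.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def remote_call(x):
    """Stand-in for a slow request to a distant server (hypothetical)."""
    time.sleep(0.1)
    return x * 2


results = []


def on_reply(future):
    # Plays the role of the handler invoked when the reply arrives.
    results.append(future.result())


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(remote_call, 21)
    future.add_done_callback(on_reply)
    # The application continues with useful work instead of waiting.
    local_work = sum(range(1000))
# Leaving the `with` block waits for the outstanding request to finish.
```

The client overlaps its own computation with the time the request spends in the network and at the server, which is the point of the technique.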
The NFS protocol consists of remote procedure calls for communication between client and server. The client can access the files through system calls issued by the application program.

The virtual file system provides the main transparency feature of the distributed system. Because of the virtual file system, user programs can perform operations on local as well as remote files without any distinction. The virtual file system keeps track of the currently available file systems.

From the virtual file system layer, operations can be passed on to access a file on a remote machine or a file on the local machine. To access the files stored on the remote server, the virtual file system communicates with the NFS client. The NFS client communicates with the NFS server. The NFS protocol uses RPCs for the communication between client and server.

On the server side, the incoming client requests are handled by the NFS server. The NFS server converts the requests into the format used for processing and passes them to the virtual file system layer. The virtual file system accesses a local file system in which the actual files are kept.

Q5 a) Describe any two RPC exchange protocols.

Ans : Message passing is the basic means of communication used in distributed systems.

* RPC : When a process on machine X calls a procedure on machine Y, the calling process on machine X is suspended. The procedure which is called is now executed on machine Y.
The results are returned to the calling process on machine X, and the suspended process then resumes. This method is called a remote procedure call (RPC).

1) A calls B
2) A suspends
3) B executes
4) B returns the results
5) A resumes

7. Security
* Amoeba : Resources are named and protected by capabilities. The access control mechanism is capability based.
* Mach : The access control mechanism is based on port rights.
* Chorus : The access control mechanism is capability based.

Q. 7 a) Describe UNIX emulation in Mach and Chorus.

Ans. : UNIX emulation in Mach : Amoeba was not derived from an existing system (UNIX). It was developed from scratch. So the designers did not have to worry about backward compatibility, and new ideas could be tried out during design. Due to this, the interface of Amoeba turned out to be very different from that of UNIX. A UNIX emulation package was added to Amoeba to avoid the need for writing hundreds of utility programs. Because of this, UNIX programs can run on Amoeba with little or no modification.

UNIX emulation in Chorus : The design goal of Chorus was to provide UNIX compatibility.

For version 2 : The original kernel of Chorus was redesigned and converted to a microkernel. This was done by moving much of its functionality to user address space. Later on, several processes were added to the user address space to do UNIX emulation.

For version 3 : In version 3, binary compatibility with UNIX System V was provided by building a UNIX emulation subsystem called Chorus/MIX (Modular UNIX). This was built on top of the Chorus microkernel.

b) Define the terms : Co-ordinated universal time and clock drift.
How does compensation take place for clock drift?

Ans. : Definition :

Co-ordinated universal time : Co-ordinated universal time (UTC) is an external time source that is used as a reference for synchronizing computer clocks with real time.

Clock drift : The crystal-based clocks in computers are subject to clock drift, i.e. they count time at different rates and so they diverge.

Compensation for clock drift is achieved by clock-synchronization algorithms, which can be centralized or distributed. Only the centralized algorithm is elaborated here. For the centralized algorithm refer to 4.7.1.

Distributed Systems A-59 Appendix - A

This call returns an identifier (descriptor) by which the socket can be identified in further operations. The socket lasts until every process holding the descriptor exits or until it is closed. Processes bind their socket descriptors to socket addresses before communicating. The bind system call is used for this.

1) In datagram communication :

To send a message
a) S = socket(AF_INET, SOCK_DGRAM, 0)
b) bind(S, client address)
c) sendto(S, "message", server address)

To receive a message
a) S = socket(AF_INET, SOCK_DGRAM, 0)
b) bind(S, server address)
c) amount = recvfrom(S, buffer, from)

The server address and the client address are socket addresses.

2) In stream communication :

To request a connection :
a) S = socket(AF_INET, SOCK_STREAM, 0)
b) connect(S, server address)
c) write(S, "message", length)

For listening and accepting a connection :
a) S = socket(AF_INET, SOCK_STREAM, 0)
b) bind(S, server address)
c) listen(S, 5)
d) SNew = accept(S, client address)
e) n = read(SNew, buffer, amount)

Distributed Systems A-64 Appendix - A

d) Interface
e) Unique user ID
f) Filestore.

Fig. 5 Cluster of workstations

In the workstation model, each user has a workstation, and there can be clusters of workstations. Application programs run on the workstations. All workstations share a common interface. Specialized services are provided by the servers in the workstation model. Common instructions are used by all workstations and servers.

2) Processor Pool Model

In the workstation model, several workstations may be idle at any given time. They could simply be ignored, but that is not an intelligent way of working. Instead :
* The idle processors can be utilized for other work on the system.
* This can be done manually : a user can move processes to an idle machine.

In the processor pool model there is a collection of processors, and CPUs are dynamically assigned to processes on demand.

Fig. 6

Distributed Systems A-69 Appendix - A

Every day the sun appears to rise on the eastern horizon, climbs to a maximum height in the sky and sinks in the west. The event of the sun's reaching its highest apparent point in the sky is called the transit of the sun. This event occurs at about noon each day. The interval between two consecutive transits of the sun is called the solar day.
Since there are 24 hours in a day, each containing 3600 seconds, the solar second is defined as exactly 1/86400th of a solar day. The geometry of the mean solar day calculation is shown below :

Fig. 11 (labels : Earth's orbit; to distant galaxy; on day n at the transit of the sun; at the transit of the sun n days later, the earth has rotated fewer than 360°)

b) Write the Berkeley algorithm for clock synchronization in pseudo C. Explain it with suitable example.
Ans. : Refer page no. 4-5.

c) Compare Bully algorithm with Ring algorithm. [6]
Ans. : Refer page no. 4-15 to 4-18.

OR

Explain Ring algorithm with suitable example. [6]

Ans. : This ring algorithm does not use a token. Assume that the processes are physically or logically ordered, so that each process knows who its successor is. This is an election algorithm.

1) When any process notices that the co-ordinator is not functioning, it builds an election message containing its own process number and sends the message to its successor.

Distributed Systems A-75 Appendix - A

OR

Q.10 a) How is reliable group communication achieved by different techniques? [8]
Ans. : Refer page no. 5-24.
b) Explain design issues for fault tolerance in distributed systems. [8]
Ans. : Refer page no. 5-8.

Q.11 a) What is Portable Object Adapter in CORBA? [8]
Ans. : Refer page no. 6-4 and 6-8.
b) Explain difference between Grid and Cluster. [8]
Ans. : Refer page no. 6-18 and 6-13.

OR

Q.12 a) Explain IIOP in CORBA. [8]
Ans. : Refer page no. 6-11.
b) What is Virtual Organization (VO) concept in GRID? [8]
Ans. : Refer page no. 6-25.
Distributed Systems B-3 Appendix - B

Server load balancing addresses several requirements that are becoming increasingly important in networks :
* Increased scalability
* High performance
* High availability and disaster recovery

Many content-intensive applications have scaled beyond the point where a single server can provide adequate processing power. Both enterprises and service providers need the flexibility to deploy additional servers quickly and transparently to end-users. Server load balancing makes multiple servers appear as a single server - a single virtual service - by transparently distributing user requests among the servers.

The highest performance is achieved when the processing power of servers is used intelligently. Advanced server load-balancing products can direct end-user service requests to the servers that are least busy and therefore capable of providing the fastest response times. Necessarily, the load-balancing device should be capable of handling the aggregate traffic of multiple servers. If a server load-balancing device becomes a bottleneck, it is no longer a solution; it is just an additional problem.
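The idea of directing each request to the least busy server can be sketched in a few lines of Java. This is only an illustration and not the code of any load-balancing product : the class names Backend and LeastBusySketch are made up for the example, and "least busy" is assumed to mean "fewest active connections".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical backend descriptor; "busy" is approximated by active connections.
class Backend {
    final String name;
    int activeConnections;

    Backend(String name, int activeConnections) {
        this.name = name;
        this.activeConnections = activeConnections;
    }
}

class LeastBusySketch {
    // Returns the backend with the fewest active connections
    // (the first one wins on a tie), or null if the list is empty.
    static Backend pick(List<Backend> backends) {
        Backend best = null;
        for (Backend b : backends) {
            if (best == null || b.activeConnections < best.activeConnections) {
                best = b;
            }
        }
        return best;
    }

    // Assigns one request: picks the least busy backend and counts the new connection.
    static String dispatch(List<Backend> backends) {
        Backend chosen = pick(backends);
        if (chosen == null) {
            throw new IllegalStateException("no backend servers available");
        }
        chosen.activeConnections++;
        return chosen.name;
    }

    public static void main(String[] args) {
        List<Backend> pool = new ArrayList<Backend>();
        pool.add(new Backend("s1", 2));
        pool.add(new Backend("s2", 0));
        pool.add(new Backend("s3", 1));
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " -> " + dispatch(pool));
        }
    }
}
```

Because every dispatch increases the chosen server's count, repeated requests spread out over the pool instead of piling onto one machine. A real device would also decrease the count when a connection closes.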
Distributed Systems B-8 Appendix - B

When a new connection request arrives at the load balancer module, the load balancer allocates it to one of the backend servers, selected by the balancing algorithm. In this way connection requests are generated and allocated to the servers.

Various tables are maintained on both sides, that is, on the load balancer and on each backend server.

The load balancer status table contains :

Server ID                    Identification code of a server
Port Number                  Port for communication on a server
No. of Active Connections    Current count of connections on the server
Total Load
Login Time                   Time when the server logged in

The load balancer mapping table contains :

Connection ID                Unique identifier for a connection [randomly generated]
{Server ID, Port ID}         Identifier for a service on a server

The backend server status table contains :

Connection ID                Unique identifier for a connection
Execution (in %)             Percentage of execution
Request Weight               Priority for requests of a client
Connection Duration          Time for which the connection stays alive

Balancing Algorithm

The load balancing algorithm is based on a bidding system. In the simulation every backend server is given a specified capacity in the form of performance measures such as memory and processing speed. These capacities can be configured through the main interface module.
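A bidding scheme with a mapping table can be sketched as follows. The sketch is not the project's code : the class names BidServer and BiddingBalancerSketch are hypothetical, and the bid formula (the average fraction of memory and processing speed that is still free) is an assumption, since the text does not state how a bid is computed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical backend server with configured capacities and current usage.
class BidServer {
    final int serverId;
    final double memoryCapacity;
    final double speedCapacity;
    double memoryUsed;
    double speedUsed;

    BidServer(int serverId, double memoryCapacity, double speedCapacity) {
        this.serverId = serverId;
        this.memoryCapacity = memoryCapacity;
        this.speedCapacity = speedCapacity;
    }

    // Assumed bid formula: average fraction of capacity still free (0.0 to 1.0).
    double bid() {
        double freeMem = (memoryCapacity - memoryUsed) / memoryCapacity;
        double freeCpu = (speedCapacity - speedUsed) / speedCapacity;
        return (freeMem + freeCpu) / 2.0;
    }
}

class BiddingBalancerSketch {
    private final List<BidServer> servers = new ArrayList<BidServer>();
    // Mapping table: connection ID -> server ID.
    private final Map<Integer, Integer> mapping = new HashMap<Integer, Integer>();

    void addServer(BidServer s) {
        servers.add(s);
    }

    // A known connection goes to the server recorded in the mapping table;
    // a new connection goes to the highest bidder (first server wins a tie).
    int allocate(int connectionId, double memNeeded, double speedNeeded) {
        Integer known = mapping.get(connectionId);
        if (known != null) {
            return known;
        }
        BidServer best = null;
        for (BidServer s : servers) {
            if (best == null || s.bid() > best.bid()) {
                best = s;
            }
        }
        if (best == null) {
            throw new IllegalStateException("no backend servers available");
        }
        best.memoryUsed += memNeeded;
        best.speedUsed += speedNeeded;
        mapping.put(connectionId, best.serverId);
        return best.serverId;
    }

    public static void main(String[] args) {
        BiddingBalancerSketch lb = new BiddingBalancerSketch();
        lb.addServer(new BidServer(1, 1024, 1000));
        lb.addServer(new BidServer(2, 2048, 2000));
        System.out.println("connection 10 -> server " + lb.allocate(10, 512, 500));
        System.out.println("connection 11 -> server " + lb.allocate(11, 512, 500));
        System.out.println("connection 10 -> server " + lb.allocate(10, 512, 500));
    }
}
```

The mapping table is what makes the balancer stateful : a repeated connection ID skips the bidding round and returns to the server it was first given.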
Distributed Systems B-14 Appendix - B

    private JLabel lbmemory, lbspeed, lbotr;
    private JComboBox cbmemory, cbspeed, cbotr;
    private static Double[] speed, memory;
    private static Integer[] other;
    //private Chart chartObj;
    //private Boolean running;
    //private Thread tchart;

    // threads
    private ChartThread tchart;
    private GraphThread tgraph;

    // supporting classes
    private Client client;
    private Statistics stat;

    // capacity values
    private double cpu, mem;
    private int res;

    static
    {
        // selectable memory capacities
        memory = new Double[6];
        memory[0] = new Double(128);
        memory[1] = new Double(256);
        memory[2] = new Double(512);
        memory[3] = new Double(1024);
        memory[4] = new Double(2048);
        memory[5] = new Double(4096);

        // selectable processing speeds (16 entries, indices 0 to 15)
        speed = new Double[16];
        speed[0] = new Double(500);
        speed[1] = new Double(600);
        speed[2] = new Double(733);
        speed[3] = new Double(866);
        speed[4] = new Double(933);
        speed[5] = new Double(1000);
        speed[6] = new Double(1100);
        speed[7] = new Double(1300);
        speed[8] = new Double(1400);
        speed[9] = new Double(1660);
        speed[10] = new Double(1800);
        speed[11] = new Double(2000);
        speed[12] = new Double(2200);
        speed[13] = new Double(2400);
        speed[14] = new Double(2800);
        speed[15] = new Double(3200);
Distributed Systems B-24 Appendix - B

        {
            System.out.println("File Error");
        }
        try
        {
            fout.close();
        }
        catch (IOException ie)
        {
            System.out.println("File Closing Error" + ie);
        }
        cardLO.show(pandis, "Conf");
    }

    public void readConfigDetails()
    {
        FileInputStream fin;
        String dataVal = "";
        int i;
        try
        {
            fin = new FileInputStream("configure.txt");
        }
        catch (FileNotFoundException e)
        {
            System.out.println("File not Found" + e);
            return;
        }
        try
        {
            // read the file one byte at a time until end of file
            do
            {
                i = fin.read();
                if (i != -1)
                    //System.out.print((char) i);
                    dataVal += (char) i;
            } while (i != -1);
            fin.close();
        }
        catch (IOException ie)

Distributed Systems B-29 Appendix - B

Chart Thread

//import java.awt.*;
//import java.awt.geom.*;
//import java
import java.util.*;

class ChartThread implements Runnable
{
    private boolean Running;
    private Thread t;
    private double[] data;
    private String[] label;
    private Random r;
    private ChartPanel p;

    ChartThread(ChartPanel _p)
    {
        t = new Thread(this, "User Thread");
        t.setPriority(Thread.MAX_PRIORITY - 1);
        Running = false;
        p = _p;
        r = new Random();
        //data = new double[6];
    }

    public void run()
    {
        while (Running)
        {
            try
            {
                Thread.sleep(1000);
            }
            catch (InterruptedException e)
            {
                System.out.println(" error in :" + e);
            }
            System.out.print(label[i] + ",");
            ConnLoad cl = (ConnLoad) Statistics.connStatus.get(cID);
            if (cl != null)
                data[i] = cl.cpuBt;
Distributed Systems B-36 Appendix - B

            case SERVER_LOGIN :
                bServerName = msg.getComment();
                bServer.setTitle(bServerName);
                break;
            case SERVER_LOGOFF :
                stopBServer();
                break;
            case SERVER_LOAD :
                prepareBid(SERVER_LOAD);
                break;
            case BID_REQUEST :
                bServer.display("\n ******* Preparing Bid....");
                prepareBid(BID_RESPONSE);
                break;
            case CONN_NEW :
                openConnection(msg);
                break;
            case CONN_OLD :
                sendToConnection(msg);
                break;
            case CONN_CLOSE :
                closeConnection(msg);
                break;
        }
    }

    // open a new connection, or forward the message if the connection already exists
    public void openConnection(Message msg)
    {
        Request req = (Request) (msg.getObject());
        int conID = req.getConnID();
        System.out.println("\nin openConnection, msg= " + msg.toString());
        if (!connectionTab.containsKey(new Integer(conID)))
        {
            Connection con = new Connection(bServer, msg);
            addConnection(con, conID);
            System.out.println("in openConnection, added = " + conID);
            bServer.display("\n Connection added = " + conID);
        }
        else
            sendToConnection(msg);
    }

    public void sendToConnection(Message msg)
    {
        Request req = (Request) (msg.getObject());
        int conID = req.getConnID();
Distributed Systems B-41 Appendix - B

        return cpuBt;
    }

    // called by client class to close connection
    synchronized public void closeConnection()
    {
        done = true;
        //System.out.println("\n\tin closeConnection()(done=" + done + ")");
    }

    // execution of connection
    public void run()
    {
        // loop until connection lifetime
        while (!done)
        {
            bServer.display("\nConnID#" + connID + " -- \tcpuBt :" + cpuBt);
            try
            {
                Thread.sleep(500);
            }
            catch (InterruptedException ie) { }
            cpuBt--;
            // report the status on every second tick
            if (cpuBt % 2 == 0)
            {
                Statistics.setConnectionStatus(connID, cpuBt, memWt);
                bServer.display("\n--Load = " + Statistics.normLM);
            }
            while (cpuBt == 0 & done) { }
        }
        System.out.println("\n \t Connection Closed " + connID);
        bServer.display("\n \t Connection Closed " + connID);

Distributed Systems B-46 Appendix - B

    }

    public void startThread()
    {
        t.start();
    }
}

import java.awt.*;
import java.awt.geom.*;
import javax.swing.*;

public class GraphPanel extends JPanel
{
    private int[] dataValues;
    private double maxV;
    private int[] xAxis;

    public GraphPanel()
    {
        xAxis = new int[300];
        dataValues = new int[300];
        setBackground(Color.white);
        setBorder(BorderFactory.createTitledBorder("sample Graph"));
        // start with a flat line
        for (int i = 0; i < dataValues.length; i++)
            dataValues[i] = 220;
        // x co-ordinates start at 40 and advance by 1 per sample
        for (int i = 0, j = 40; i < dataValues.length; i++, j += 1)
            xAxis[i] = j;
    }

    public void setData(int value)
    {
        // shift the old samples one position to the left
        for (int i = 0; i < dataValues.length - 1; i++)
            dataValues[i] = dataValues[i + 1];
Distributed Systems B-54 Appendix - B

* Client Simulator

This sub-module handles the client-side simulation and generates client requests randomly. Press the START button to start the client simulation.

Client simulator window (screenshot listing the generated requests, each with a connection_id, cpuBt and memWt value).

Distributed Systems B-58 Appendix - B

Future Extensions

The current system calculates the loads on the backend servers and transfers each request to the server with the minimum load. The selected server is the best choice to process the request. The current system can be further optimized as follows :

1) Our load balancer is a stateful load balancer. This means that if another request arrives from the same client process, it is given directly to the same backend server without calculating the load. This can make the system somewhat unbalanced if there are many requests from a particular client process.
So if we make the load balancer stateless, the system will be fully balanced, but whether this is possible depends on the type of client process.

2) To find the load, we send a bid request and wait for the responses. This increases the time taken to transfer the client request to a backend server, and it also adds unnecessary network traffic. If the backend servers are dedicated servers, meaning that they do not serve requests of their own or of any other computer, then the load on any backend server can be calculated by the load balancer itself, which removes this overhead.

3) The current system does not handle fault tolerance. If a backend server crashes while processing client requests, the requests being processed by that server cannot be transferred to any other server. This issue should be handled in a future extension of the current load balancing application.

Software Used
* Microsoft® Windows™ XP Professional Build 2600
* Microsoft® Office™ 2003 Professional System
* Xinox® JCreator™ 3.51 - [An Integrated Development Environment for Java]
* Sun Microsystems® Java™ 2 Standard Edition 5 Update 6
* Mozilla Firefox 1.0.5
* PuTTY
* Acrobat® Reader™ 7

Load Balancing Project
A. Core coding : Vishal Patil - Project Leader, Rajesh Gooda
B. GUI : Adwait Hatkar, Akshay A Kulkarni, Prashant Mane
C. Algorithm for balancing : Raj Parakh, Nikhil Pate, Prasad Mohire
D. Documentation : Rahul Patil, Uday Gaikwad

Distributed Systems B-62 Appendix - B

RFS can NOT be used to boot diskless clients.
RFS does NOT support symbolic linking or automounting.
RFS does NOT support mounting of a directory on top of an existing RFS directory.
RFS does NOT support directory sharing.

Replication :

Replication refers to the use of redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or performance. Replication typically involves replication in space, in which the same data is stored on multiple file systems or the same computing task is executed on multiple devices, or replication in time, in which a computing task is executed repeatedly on a single device.

There are two approaches to replication in distributed systems : active and passive replication. Active replication, also known as state machine replication, is performed by processing the same request at every replica. In passive replication, requests are usually processed on a single replica and the resulting state is then transferred to the other replicas. If only one machine processes the requests, we have the primary-backup scheme. If, on the other hand, any machine can process a request, we have a multi-primary scheme.

1. Replica Management
i) Permanent Replicas
ii) Server Initiated Replicas (Push Caches)

The algorithm for dynamic replication takes two issues into account. First, replication can take place to reduce the load on a server. Second, specific files on a server can be migrated or replicated to servers placed in the proximity of clients that issue many requests for those files.

When the number of requests for a specific file F at server S drops below a deletion threshold del(S,F), that file can be removed from S. Special measures are taken to ensure that at least one copy of each file continues to exist. A replication threshold rep(S,F), which is always chosen higher than the deletion threshold, indicates that the number of requests for a specific file is so high that it may be worthwhile replicating it on another server.
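The two thresholds can be turned into a small decision routine. The Java sketch below is illustrative only : the names ReplicaAction and ReplicationThresholdSketch are invented for the example, and passing the number of existing copies as a parameter is an assumed way of modelling the rule that at least one copy must survive. Requests that fall between the two thresholds simply leave the file where it is; migration is not modelled.

```java
// Possible outcomes for a file F at server S.
enum ReplicaAction { DELETE, KEEP, REPLICATE }

class ReplicationThresholdSketch {
    // requests : number of requests counted for file F at server S
    // del, rep : the thresholds del(S,F) and rep(S,F); rep must be higher than del
    // copies   : number of copies of F that currently exist (assumed to be known)
    static ReplicaAction decide(int requests, int del, int rep, int copies) {
        if (rep <= del) {
            throw new IllegalArgumentException("rep(S,F) must be higher than del(S,F)");
        }
        if (requests < del) {
            // Never remove the last remaining copy of a file.
            return copies > 1 ? ReplicaAction.DELETE : ReplicaAction.KEEP;
        }
        if (requests > rep) {
            return ReplicaAction.REPLICATE;
        }
        return ReplicaAction.KEEP;
    }

    public static void main(String[] args) {
        System.out.println(decide(2, 5, 20, 3));   // few requests, other copies exist
        System.out.println(decide(2, 5, 20, 1));   // few requests, last copy
        System.out.println(decide(30, 5, 20, 2));  // many requests
    }
}
```

Keeping rep(S,F) strictly above del(S,F) leaves a band between the two values in which a file is neither deleted nor replicated, so a file whose request count hovers around one value is not repeatedly copied and removed.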
Distributed Systems B-68 Appendix - B

3) Switch request type -

I) case OPEN_FILE_REQUEST :
a) accept file name
b) accept file open mode
c) if it is valid file
d) lock mutex
e) check for file lock
f) open the file with file pointer, in read mode only
g) add file to client file table
h) if previous file lock was NO_LOCK then notify other servers about file lock change
i) lock the file in requested mode
j) increase reference count
k) send successful ACK
l) send file ID for further reference
m) unlock the mutex

II) case CLOSE_FILE_REQUEST :
a) accept file ID
b) check if it's valid file id
c) check if the client is authenticated to close the file
d) now close the file
e) delete client file table record
f) send ACK

III) case READ_FILE_REQUEST :
a) accept file ID
b) check if it's valid file id
c) send invalid file ID error code
d) check if the client is authenticated to read the file
e) now send the file in blocks of MAX_BUFF bytes

IV) case WRITE_FILE_REQUEST :
a) accept file ID
b) check if it's valid file id
Distributed Systems B-73 Appendix - B

public:
    Server_file_table( CString fname, int fid );
    void putRecord( CString fname, int fid );
    int getRefCount();
    int getFileLock();
    const char * getFileName();
    void lockFileMutex();
    void unlockFileMutex();
    void setFileLock( int lock );
    int decreaseRefCount();
    int increaseRefCount();
    Server_file_table * getNext();
    void setNext(Server_file_table *);
    int getFileID();
    ~Server_file_table();
};

/******************************************************
    SERVER_FILE_TABLE method definitions
******************************************************/

Server_file_table::Server_file_table( CString fname, int fid )
{
    putRecord(fname, fid);
    next = NULL;
}

// store the file name and ID and reset the reference count
void Server_file_table::putRecord( CString fname, int fid )
{
    fileName = fname ;
    fileID = fid ;
    referenceCount = 0 ;
Distributed Systems B-77 Appendix - B

    return sd ;
}

RFS_server::RFS_server()
{
}

SOCKET RFS_server::RFS_connect()
{
    SOCKET servSock;

    cout << "Looking up address..." << flush;
    u_long nRemoteAddress = LookupAddress(host);
    if (nRemoteAddress == INADDR_NONE)
    {
        status = OFFLINE ;
        cerr << endl << "lookup address error" << endl;
        return INVALID_SOCKET;
    }
    else
    {
        in_addr Address;
        memcpy(&Address, &nRemoteAddress, sizeof(u_long));
        cout << inet_ntoa(Address) << ":" << SERVERPORT << endl;
    }

    // Connect to the server
    cout << "Connecting to remote host..." << flush;
    servSock = EstablishConnection(nRemoteAddress, htons(SERVERPORT));
    if (servSock == INVALID_SOCKET)
    {
        status = OFFLINE ;
        cerr << endl << "connect to server error" << endl;
        return INVALID_SOCKET;
    }
    else
    {
        int res[1] = {SERVER_CONNECTION} ;
        //send connection type
        send(servSock, (char *)res, sizeof(int), 0);
        cout << "connected, socket " << servSock << "." << endl;
    }
    return servSock ;
Distributed Systems B-81 Appendix - B

    // connect to other servers for file transfer
    for( int i = 0 ; i < TOTAL_RFS_SERVERS ; i++ )
    {
        sock[i] = INVALID_SOCKET;
        if( servers[i].status == ONLINE )
            sock[i] = servers[i].RFS_connect();
    }

    //now send file update notification to other servers
    servReq[0] = FILE_UPDATE_NOTIFICATION;
    for( int i = 0 ; i < TOTAL_RFS_SERVERS ; i++ )
    {
        if ( sock[i] != INVALID_SOCKET )
        {
            if ( send(sock[i], (char *)servReq, sizeof(int), 0 ) == SOCKET_ERROR )
            {
                // drop a server that cannot be reached
                closesocket(sock[i]);
                sock[i] = INVALID_SOCKET;
                continue ;
            }
            //send file name
            if ( send(sock[i], fileName, (strlen(fileName)+1), 0) == SOCKET_ERROR )
            {
                closesocket(sock[i]);
                sock[i] = INVALID_SOCKET;
                continue ;
            }
        }
    }

    // now read the file and transmit
    do
    {
        n = fread(buff, sizeof(char), 1024, fp);
        for( int i = 0 ; i < TOTAL_RFS_SERVERS ; i++ )
        {
            if ( sock[i] != INVALID_SOCKET )
            {
                if ( send(sock[i], (char *)buff, n, 0 ) == SOCKET_ERROR )
                {
                    closesocket(sock[i]);
                    sock[i] = INVALID_SOCKET;
                    continue ;
About the Author

Rs. 310/-
ISBN 978-81-8431-404-5

Technical Publications Pune
# 1, Amit Residency, 412 Shaniwar Peth, Pune - 411030, M.S., India.
Telefax : +91 (020) 24495496/97, Email : technical@vtubooks.com
Visit us at : www.vtubooks.com
