Kuvempu University
M.Sc. (IT) - 4th Semester
MSIT-42 Distributed Computing (DC)
Max Marks: 50

Assignment TA

Note: All questions are compulsory.

1. Select any case study of a distributed application. Survey the case study with respect to distributed computing aspects. Analyze and understand the applicability of distributed computing. - 10M

2. Select any case study of a distributed application. Survey the case study with respect to the challenges of distributed computing. Analyze challenges such as heterogeneity, openness, security, scalability, failure handling and concurrency. - 10M

3. Conduct a detailed study of commercially available middleware products used in distributed environments. - 10M

4. Conduct a detailed study of commercially available firewall products used in distributed environments. - 10M

5. Select two case studies, one based on a centralized system application and one on a distributed system application. Analyze and understand the applicability of distributed and centralized systems in detail. - 10M

Kuvempu University M.Sc. (IT)-4th Semester MSIT-42 Distributed Computing (DC)

Assignment TB
I. Fill in the blanks:
1) Mobile computing is also called pervasive computing.
2) The two kinds of system models are ____________ and _________________.
3) OMG stands for Object Management Group.
4) DCOM stands for Distributed Component Object Model.
5) User Datagram Protocol is the expansion of UDP.
6) IP addressing has five classes.
7) Sockets are software abstractions of ports used within running processes.
8) Java provides three basic socket classes.
9) The expanded form of RMI is Remote Method Invocation.
10) DFS stands for Distributed File System.
11) Files are an abstraction of permanent storage.
12) Hardware activities are co-ordinated using physical clocks.
13) External synchronization is also known as the pull model.
14) A transaction runs on shared entities.
15) A software clock is maintained by the operating system from the hardware clock.

II. State True or False:
1) Resource sharing is an advantage of a distributed system. True
2) UDP is a connection-oriented protocol. False (it is a connectionless protocol)
3) Routing metrics are used to measure the efficiency of the routing algorithm. True
4) Client programs should see a uniform name space in location transparency. True
5) A vector clock gives a more accurate logical clock than a Lamport logical clock.

III. Answer the following questions:

1) What is a distributed system?
Ans: A distributed system is one in which components located at networked computers communicate and co-ordinate their actions only by passing messages. It consists of the interactions between two or more computer-based devices connected in some fashion. This is a rapidly expanding field that deals with two primary issues: how the devices communicate, and how they interact. The former requires network and communications software; the latter requires distributed software that allows the components on the two or more devices to interact in some manner.
Eg:
- Network of workstations
- Distributed manufacturing system (e.g., automated assembly line)
- Network of branch office computers
- The Internet
- An intranet
- Mobile and ubiquitous computing

2) Mention the subareas of middleware.
Ans:
Reflective Middleware: Reflective middleware is concerned with applying techniques from the field of reflection in order to achieve flexibility and adaptability in middleware platforms.
Event-Based Middleware: Event-based middleware is concerned with the concepts, design, implementation, and application of services and components that support building event-based systems.
Object-Oriented Middleware: Object-oriented middleware extends the object-oriented programming paradigm to distributed systems.
Message-Oriented Middleware: Message-oriented middleware is the natural extension of the packet paradigm of communications prevalent in the lower layers of the OSI network model.

3) What is data marshalling?
Ans: Marshalling is the process of taking a collection of data items and assembling them into a form suitable for transmission in a message. Un-marshalling is the complementary process of reassembling the data structure at the destination. Heterogeneity means that data formats at the two ends of communication may not agree. The problem is overcome by using an external data representation format such as CORBA’s CDR or Sun NFS’s XDR. The transmission format need not be binary: HTTP uses ASCII text.

4) What is client-server communication?
Ans: Client/server communication requires some form of request-reply protocol.

5) What is group communication?
Ans: Multicast is the term used to denote communication to a predefined group of processes. It is used when it is more appropriate to send a message to a group than to processes individually. It is closely linked to replication, which may be used to enhance reliability and/or performance.
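To make the data marshalling answer above more concrete, here is a minimal Python sketch. The record layout (an account id, a balance and a name) and the function names marshal and unmarshal are assumptions made only for illustration. The byte format is neither CORBA's CDR nor Sun's XDR; it simply fixes network byte order so that both ends agree on the representation.

```python
import struct

# Hypothetical record layout, chosen only for illustration:
#   account_id : unsigned 32-bit integer
#   balance    : 64-bit float
#   name       : UTF-8 bytes, prefixed by an unsigned 16-bit length
# "!" selects network (big-endian) byte order with standard sizes and no padding,
# so sender and receiver agree on the format whatever their native formats are.
HEADER = "!Id"


def marshal(account_id: int, balance: float, name: str) -> bytes:
    """Assemble the data items into one byte string suitable for a message."""
    encoded = name.encode("utf-8")
    return (struct.pack(HEADER, account_id, balance)
            + struct.pack("!H", len(encoded))
            + encoded)


def unmarshal(message: bytes):
    """Reassemble the original data items at the destination."""
    header_size = struct.calcsize(HEADER)
    account_id, balance = struct.unpack_from(HEADER, message, 0)
    (length,) = struct.unpack_from("!H", message, header_size)
    start = header_size + 2
    name = message[start:start + length].decode("utf-8")
    return account_id, balance, name


if __name__ == "__main__":
    wire = marshal(42, 150.5, "Asha")
    print(len(wire), unmarshal(wire))
```

The sender calls marshal before placing the bytes in a message, and the receiver calls unmarshal on the bytes it gets back.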

PART - B

1. a) Explain the properties of distributed system.
Ans:
1.5.1 Naming
The names that are assigned to resources or objects must have global meanings. We want names to be independent of the locations of the actual objects they refer to. We need a name interpretation system that translates names so that programs can access the named resources. This system is itself a distributed application whose services must scale and translate efficiently, so that the same name works everywhere.

1.5.2 Access
Access here refers to the ability for the same services and APIs to be usable everywhere with reasonable performance. For example, cc invokes the C compiler on any platform or operating system. Global access also includes some idea of data coherence.

1.5.3 Communication
The performance and reliability of the communication techniques used to implement distributed systems are critical to their performance. A design issue is therefore to optimize the implementation of the communication protocols while retaining a relatively high level of abstraction in the application programming interface (API).

1.5.4 Software Structure
A number of key issues relate to software structure. These issues include openness, which is achieved through the design and construction of software components with well-defined, open interfaces. One overriding issue is the design of an overarching extensible framework that allows new services to be introduced that can fully inter-work with existing services without duplicating them.

1.5.5 Resource Management
We are concerned with deploying the processing and communication resources in the network to optimum effect for a dynamically changing workload. The issues of interest here are resource management and scheduling. We want to share the load on the processing, communications and other infrastructure fairly between the users.

1.5.6 Consistency Maintenance
Because information and data are distributed, maintaining consistency in the distributed system is important. Consistency is a concern when there is concurrent access to data, and when data is replicated for fault tolerance, cached for performance, or mirrored for high availability. Consistency also refers to time and user interfaces.

1.5.7 Security
Security is a basic building block of all distributed systems. Classic security concerns deal more with data:
- Confidentiality – data is only available to those authorized
- Availability – you can get it when you want it
- Integrity – data has not been changed
- Trust – who you are and what you are authorized to do
- Non-repudiation – you cannot deny doing something you did

Additional concerns deal more with people and transactions:
- Auditability – I can check what you did to the data
- Reliability – the system does what I want, when I want it to
- Privacy – within certain limits no one should know who I am or what I do

b) Explain different challenges that should be considered while designing a distributed system.

2. a) Explain any two types of architectural models.
There are 3 types of architectural models; two of them are Client-Server and Peer process.
Client-Server Model: Refer to page 18, 2.2.1
Peer process: Refer to page 20, 2.2.4

b) Explain different types of fundamental models.
Ans: FUNDAMENTAL MODELS
A model contains only the essential ingredients that we need to consider in order to understand and reason about some aspects of a system’s behavior. The aspects of distributed systems that we wish to capture in our fundamental models are intended to help us discuss and reason about:
Interaction model: Computation occurs within the processes; the processes interact by passing messages, resulting in communication (i.e. information flow) and coordination (synchronization and ordering of activities) between processes. In the analysis and design of distributed systems we are concerned especially with these interactions.
Failure model: The correct operation of a distributed system is threatened whenever a fault occurs in any of the computers on which it runs (including software faults) or in the network that connects them. Our model defines and classifies the faults.
Security model: The modular nature of distributed systems and their openness expose them to attack by both external and internal agents.

3. a) Explain the differences between static and dynamic routing.
Ans:
Static routing
Static routing is not really a routing protocol. It is simply the process of manually entering routes into a device's routing table via a configuration file that is loaded when the routing device starts up. As an alternative, these routes can be entered by a network administrator who configures the routes manually. Since these manually configured routes don't change after they are configured (unless a human changes them), they are called 'static' routes.

Static routing is the simplest form of routing, but it is a manual process. Use static routing when you have very few devices to configure (<5) and when you know the routes will rarely or never change.

Static routing also does not handle failures in external networks well, because any manually configured route must be manually updated or reconfigured to repair lost connectivity.

Dynamic Routing
Dynamic routing protocols are supported by software applications running on the routing device (the router), which dynamically learn network destinations and how to get to them, and also advertise those destinations to other routers. This advertisement function allows all the routers to learn about all the destination networks that exist and how to get to those networks.

A router using dynamic routing will 'learn' the routes to all networks that are directly connected to the device. Next, the router will learn routes from other routers that run the same routing protocol (RIP, RIP2, EIGRP, OSPF, IS-IS, BGP etc). Each router will then sort through its list of routes and select one or more 'best' routes for each network destination the router knows or has learned.

Dynamic routing protocols will then distribute this 'best route' information to other routers running the same routing protocol, thereby extending the information on what networks exist and can be reached. This gives dynamic routing protocols the ability to adapt to logical network topology changes, equipment failures or network outages 'on the fly'.

b) Explain any two routing algorithms.

4. a) List out the advantages and disadvantages of DFS.
Ans:
Advantages
• The files are more widely available since many computers can access the servers, and sharing the files from a single location is easier than distributing copies of files to individual clients.
• Backups and safety of the information are easier to arrange since only the servers need to be backed up.
• The servers can provide large storage space, which might be costly or impractical to supply to every client.
• The usefulness of a distributed file system becomes clear when considering a group of employees sharing documents. However, more is possible. For example, sharing application software is an equally good candidate. In both cases system administration becomes easier.

Disadvantages
There are many problems facing the design of a good distributed file system:
• Transporting many files over the net can easily create sluggish performance and latency; network bottlenecks and server overload can result.
• The security of data is another important issue: how can we be sure that a client is really authorized to have access to information, and how can we prevent data being sniffed off the network?
• Two further problems facing the design are related to failures. Often client computers are more reliable than the network connecting them, and network failures can render a client useless.
• Similarly, a server failure can be very unpleasant, since it can disable all clients from accessing crucial information.

b) Mention and describe the various operations performed on UNIX files. ????

5. a) Explain different types of synchronization.
Ans:
External Synchronization (Pull Model): A clock is externally synchronized if it has been synchronized with an authoritative external source of time (a UTC source). A system is externally synchronized if all clocks within it are externally synchronized.
In this model, clients request the UTC from a time server. The time server is passive. Assuming that the network traffic is symmetrical, the best estimate for the one-way propagation time is (Time1 - Time0 - TimeP)/2, where Time0 is the time at which the client sends the request, Time1 is the time at which it receives the reply, and TimeP is the time the server spends processing the request. The accuracy of this estimate can be calculated if the minimum propagation time is known. This is a suitable technique when the round-trip times between client and server are short compared with the desired accuracy.
Figure 7.3 Cristian’s algorithm, Page number 135
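The estimate used in Cristian's algorithm can be written out as a short Python sketch. The function name cristian_estimate and its parameter names are assumptions made for illustration. The arithmetic follows the (Time1 - Time0 - TimeP)/2 rule for the one-way propagation time, with all values in the same unit (milliseconds in the example).

```python
def cristian_estimate(time0, time1, server_utc, time_p=0):
    """Estimate the current UTC at the client when the reply arrives.

    time0      -- client clock reading when the request was sent
    time1      -- client clock reading when the reply was received
    server_utc -- UTC value carried in the server's reply
    time_p     -- time the server spent processing the request (0 if unknown)

    Assumes symmetrical network traffic, so the reply took half of the
    round trip that was not spent inside the server.
    """
    one_way = (time1 - time0 - time_p) / 2
    return server_utc + one_way, one_way


if __name__ == "__main__":
    # Request sent at 1000 ms, reply received at 1020 ms, 4 ms spent in the server.
    new_time, delay = cristian_estimate(1000, 1020, 500000, 4)
    print(new_time, delay)  # 500008.0 8.0
```

The client would then move its clock toward the returned value gradually instead of setting it in a single step.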

Compensating for Clock Drift
After the client receives the UTC, it needs to adjust its local clock gradually. There are two possibilities:
- Local clock is faster than UTC: Time cannot go backward, so the clock speed is slowed down by software. For example, the interrupt routine adds only 9 milliseconds to the software clock where it normally added 10 milliseconds.
- Local clock is slower than UTC: The clock speed is increased.
Note that abrupt changes could cause problems. It is important to maintain monotonicity.

Internal Synchronization (Push Model): A system is internally synchronized if the clocks within it are synchronized with one another to a known degree of accuracy. For an N-processor system with clocks C_i, i = 1, ..., N, and external time source S, external synchronization can be expressed as
|S(t) - C_i(t)| < D, for i = 1, ..., N
whereas internal synchronization is expressed as
|C_i(t) - C_j(t)| < D, for i, j = 1, ..., N
An internally synchronized system is not necessarily externally synchronized, since the clocks may show collective drift. However, an externally synchronized system will be internally synchronized.
In this model, the time server is active. It polls every machine whose clock needs to be synchronized.

Berkeley algorithm

1. The time daemon of the time server announces its time to the other machines and asks for their local times.
2. The machines respond to the time daemon and tell it how far ahead of or behind the daemon's time they are.
3. The time daemon computes the average and tells each machine how to adjust its clock.

Consistency of Time Servers
Time servers are susceptible to inconsistencies in their UTCs due to communication delays. To reduce the discrepancy between their UTCs, time servers exchange their UTCs periodically using the pull or push model, collect UTCs from other servers, and adjust their own UTC using a previously agreed decision criterion, for example by taking the average of the UTCs they have collected. To achieve higher accuracy, time servers may report an interval of time (UTC + indicator of inaccuracy), and the averaging procedure can be modified to take this into account.
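The three steps of the Berkeley algorithm listed above can be sketched in Python as follows. The function name berkeley_adjustments and the machine names are assumptions made for illustration. The sketch also assumes that the daemon includes its own clock in the average, which the steps above do not spell out, and it ignores message delays.

```python
def berkeley_adjustments(daemon_time, machine_times):
    """Compute clock corrections for the time daemon and for each machine.

    daemon_time   -- the time daemon's own clock reading
    machine_times -- dict mapping a machine name to its local clock reading
    Returns (daemon_adjustment, {machine name: adjustment}).
    """
    # Step 2: each machine reports how far ahead (+) or behind (-) it is.
    offsets = {name: t - daemon_time for name, t in machine_times.items()}

    # Step 3: average over all clocks; the daemon's own offset is 0 (assumption).
    average_offset = sum(offsets.values()) / (len(offsets) + 1)

    # Each clock is told how far to move so that all reach the average.
    adjustments = {name: average_offset - off for name, off in offsets.items()}
    return average_offset, adjustments


if __name__ == "__main__":
    # Clock readings in minutes: daemon at 3:00, machine A at 3:25, machine B at 2:50.
    daemon_adj, adj = berkeley_adjustments(180, {"A": 205, "B": 170})
    print(daemon_adj, adj)  # 5.0 {'A': -20.0, 'B': 15.0}
```

In this example every clock ends at 185 minutes (3:05). A negative adjustment would in practice be applied by slowing the clock down, since time cannot go backward.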

b) Explain mutual exclusion and the key properties of distributed mutual exclusion.
Ans: The mutual exclusion problem is the problem of managing access to a shared resource (data, software or hardware) that can support only one process at a time. In a distributed system, mutual exclusion needs to be achieved by means of message passing between processes.
The key properties of a useful mutual exclusion algorithm are:
Safety: At most one process may execute in the critical section at a time.
Liveness: Requests to enter and exit the critical section eventually succeed. This implies freedom from deadlock and starvation.
Ordering: If one request to enter the critical section happened-before another, then entry to the critical section is granted in that order.
We will see that most algorithms satisfy the first two conditions, but not all satisfy the ordering property. Performance of the algorithms is also an important issue and is generally judged in terms of the number of messages needed for a process to gain access to the critical section. The throughput at the critical section is determined by the synchronization delay: the delay between one process leaving the critical section and the next entering.
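As an illustration of these properties, the following Python sketch simulates one possible approach, a central coordinator that grants entry to the critical section. The text above does not name a particular algorithm, so the choice of a coordinator, the class name Coordinator and its methods are assumptions. Messages are modelled as plain method calls, not real network traffic.

```python
from collections import deque


class Coordinator:
    """Grants access to one critical section, one process at a time."""

    def __init__(self):
        self.holder = None      # process currently in the critical section
        self.waiting = deque()  # queued requests, in arrival order

    def request(self, pid):
        """Handle a 'request' message; return True if entry is granted at once."""
        if self.holder is None:
            self.holder = pid   # safety: only granted when nobody holds it
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Handle a 'release' message; return the next process granted entry."""
        if pid != self.holder:
            raise ValueError(f"process {pid} does not hold the critical section")
        # liveness: the longest-waiting request is always served next
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


if __name__ == "__main__":
    c = Coordinator()
    print(c.request("P1"), c.request("P2"), c.request("P3"))  # True False False
    print(c.release("P1"), c.release("P2"), c.release("P3"))  # P2 P3 None
```

In this sketch, entry is granted in the order in which requests reach the coordinator, which is not necessarily the happened-before order of the requests. Each entry costs a request, a grant and a release message.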

6. a) Explain the ACID properties in detail.
Ans:
Atomicity: A transaction must be all or nothing, i.e. to the rest of the system a transaction should appear as an indivisible action, even in the presence of failures.
Consistency: A transaction takes the system from one consistent state to another consistent state, i.e. the transaction does not violate system invariants.
Isolation: When several transactions are executed concurrently, the intermediate effects of a transaction are not visible to other transactions, i.e. transactions cannot interfere with one another.
Durability: Once a transaction has committed, its effects are permanent; they cannot be cancelled by the execution of an uncommitted transaction, even in the presence of failures.

b) Explain the problems of:
i) Lost update problem: Without proper concurrency control, the operations of one transaction may mask the operations of another. For example, in the banking example, U’s update of account B is lost because T overwrites it.
ii) Serially equivalent interleaving: A serially equivalent interleaving has the same effect as if two transactions were performed one after the other. If each individual transaction is correct, then we can assume that performing one after the other is correct, and thus that the serial interleaving is correct.
Refer to page 151

7. a) Explain different types of distributed transactions.
Refer to Page 156 - 157
b) Explain two-phase commit Protocol.
Refer to Page 157

8. a) Explain distributed deadlock with an example.
Refer to Page 160
b) Explain distributed deadlock detection edge chasing algorithm.
Refer to Page 161
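The lost update problem described in 6 b) i) can be reproduced with a small Python sketch. The starting balance and the amounts added by transactions T and U are hypothetical values chosen for illustration, and the interleaving is written out by hand instead of using real threads so that the result is deterministic.

```python
def run_interleaved():
    """T and U both read account B before either of them writes it back."""
    accounts = {"B": 200}
    t_read = accounts["B"]        # T reads 200
    u_read = accounts["B"]        # U reads 200
    accounts["B"] = u_read + 20   # U writes 220
    accounts["B"] = t_read + 10   # T writes 210 and overwrites U's update
    return accounts["B"]


def run_serial():
    """T runs to completion, then U runs: the serial reference result."""
    accounts = {"B": 200}
    accounts["B"] = accounts["B"] + 10   # T: read and write
    accounts["B"] = accounts["B"] + 20   # U: read and write
    return accounts["B"]


if __name__ == "__main__":
    print(run_interleaved())  # 210, U's deposit of 20 is lost
    print(run_serial())       # 230
```

An interleaving is serially equivalent only if it leaves the same result as one of the serial orders. The hand-written interleaving above does not, which is why concurrency control is needed.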

